Summary: | ASTERISK-14222: deadlock in res_timing_pthread and chan_sip do_monitor/retransmit | ||
Reporter: | Tim Ringenbach at Asteria Solutions Group (tim_ringenbach) | Labels: | |
Date Opened: | 2009-05-28 19:01:22 | Date Closed: | 2011-06-07 14:08:11 |
Priority: | Minor | Regression? | No |
Status: | Closed/Complete | Components: | Resources/res_timing_pthread |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) backtrace.log ( 1) locks3.txt | |
Description: | I ran into this deadlock while sending about 250 channels of ulaw fax between two Asterisk boxes. Removing res_timing_pthread.so and replacing it with res_timing_dahdi.so seemed to make it stop deadlocking. I'll attach the complete backtrace and 'core show locks' output as files, but the deadlock seems to be between these two places:

__owner = 15306,
(gdb) bt
#0 0xb7ee7410 in __kernel_vsyscall ()
#1 0xb7cd8881 in select () from /lib/tls/i686/cmov/libc.so.6
#2 0xb70fc4ca in read_pipe (rd_fd=1038, quantity=1, clear=1) at res_timing_pthread.c:378
#3 0xb70fc146 in pthread_timer_disable_continuous (handle=1038) at res_timing_pthread.c:247
#4 0x08187f56 in ast_timer_disable_continuous (handle=0x8f2c0d8) at timing.c:185
#5 0x080a8167 in __ast_read (chan=0x8676368, dropaudio=0) at channel.c:2644
#6 0x080a9e57 in ast_read (chan=0x8676368) at channel.c:2993
#7 0xb35e57e7 in wait_for_answer (in=0xae00e668, outgoing=0x87ba0b0, to=0xb16b3604, peerflags=0xb16b3dac, pa=0xb16b361c, num_in=0xb16b3428, result=0xb16b35e8) at app_dial.c:887
#8 0xb35ebb12 in dial_exec_full (chan=0xae00e668, data=0xb16b60c8, peerflags=0xb16b3dac, continue_exec=0x0) at app_dial.c:1846
#9 0xb35ee3c1 in dial_exec (chan=0xae00e668, data=0xb16b60c8) at app_dial.c:2252
#10 0x08123e32 in pbx_exec (c=0xae00e668, app=0x826a090, data=0xb16b60c8) at pbx.c:1348
#11 0x0812d218 in pbx_extension_helper (c=0xae00e668, con=0x0, context=0xae00eee8 "outcontext", exten=0xae00ef38 "5555555555", priority=3, label=0x0, callerid=0xad3ee170 "2567050287", action=E_SPAWN, found=0xb16b8228, combined_find_spawn=1) at pbx.c:3690
#12 0x0812e7d2 in ast_spawn_extension (c=0xae00e668, context=0xae00eee8 "outcontext", exten=0xae00ef38 "5555555555", priority=3, callerid=0xad3ee170 "2567050287", found=0xb16b8228, combined_find_spawn=1) at pbx.c:4143
#13 0x0812efc5 in __ast_pbx_run (c=0xae00e668, args=0x0) at pbx.c:4233
#14 0x0813077b in pbx_thread (data=0xae00e668) at pbx.c:4520
#15 0x0819353d in dummy_start (data=0xae26e050) at utils.c:968
#16 0xb7ad24fb in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#17 0xb7cdfe5e in clone () from /lib/tls/i686/cmov/libc.so.6

And

Thread 526 (Thread 0xb6fb8b90 (LWP 14285)):
#6 0xb7cf35f6 in backtrace () from /lib/tls/i686/cmov/libc.so.6
#7 0x0811025d in ast_bt_get_addresses (bt=0x867652c) at logger.c:1201
#8 0xb6fd8019 in __ast_pthread_mutex_trylock (filename=0xb704bc14 "chan_sip.c", lineno=3481, func=0xb704d15c "retrans_pkt", mutex_name=0xb704d400 "&pkt->owner->owner->lock_dont_use", t=0x8676400) at /usr/src/asterisk-1.6.2.0-beta2/include/asterisk/lock.h:633
#9 0xb6fd7b4a in retrans_pkt (data=0x8895b08) at chan_sip.c:3481
#10 0x0817817d in ast_sched_runq (con=0xb71899a8) at sched.c:620
#11 0xb7036899 in do_monitor (data=0x0) at chan_sip.c:21330 | ||
Comments: | By: Russell Bryant (russell) 2009-05-29 12:02:54
Is it really sitting in select? That code tells select() to return immediately ...

By: Tim Ringenbach at Asteria Solutions Group (tim_ringenbach) 2009-05-29 12:35:27
I don't know. I guess I could have just caught it in there. I probably won't have time to do it again and check until at least Tuesday.

By: Russell Bryant (russell) 2009-05-29 17:38:22
Alright. Also, if you try again, please try the latest code in the 1.6.2 branch, as I just committed some fixes to res_timing_pthread that could be related.

By: Leif Madsen (lmadsen) 2009-06-16 14:03:58
Just pinging this issue again to see if we can get a report back from the reporter and determine whether the issue is now resolved, so we can close it. Thanks!

By: Leif Madsen (lmadsen) 2009-06-24 13:58:34
Closing this issue for now. The reporter is free to reopen the issue should this still be a problem. Thanks! |
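For readers following the first comment: select() given a zero timeout only polls the descriptor and returns at once, which is why a thread genuinely blocked there would be surprising. The snippet below is a minimal sketch of that pattern, not the code from res_timing_pthread.c; the helper name fd_is_readable_now is made up for illustration.

```c
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical helper (NOT taken from res_timing_pthread.c): report whether
 * fd has data ready to read, without ever blocking.
 * Returns 1 if readable, 0 if not, -1 on a bad descriptor or select() error.
 */
static int fd_is_readable_now(int fd)
{
	fd_set rfds;
	struct timeval timeout = { 0, 0 };	/* zero timeout: poll and return at once */
	int res;

	/* FD_SET is only defined for descriptors in the range [0, FD_SETSIZE). */
	if (fd < 0 || fd >= FD_SETSIZE) {
		return -1;
	}

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);

	res = select(fd + 1, &rfds, NULL, NULL, &timeout);
	if (res < 0) {
		return -1;
	}

	return FD_ISSET(fd, &rfds) ? 1 : 0;
}
```

Called on the read end of an empty pipe this should return 0 immediately, and 1 once something has been written to the other end. A backtrace that shows a thread inside such a call may simply have been sampled while the thread was passing through it, as the reporter suggests in his reply.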