Summary: | ASTERISK-15966: ast->tech_pvt->rtp contains garbage yielding SEGFAULT | ||
Reporter: | Walter Doekes (wdoekes) | Labels: | |
Date Opened: | 2010-04-16 06:12:28 | Date Closed: | 2011-06-07 14:01:06 |
Priority: | Critical | Regression? | No |
Status: | Closed/Complete | Components: | Channels/chan_sip/General |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) ast1430-17193-btfull.txt ( 1) ast1430-17193-threadbt.txt ( 2) ast1430-17193-variables.txt | |
Description: | Hi,

somehow I can get asterisk-1.4.30-rc3 and earlier (1.4.24) to SEGFAULT. I haven't tried the 1.4.31-rc1 yet. But I don't think you've fixed anything related there.

Backtrace info:

root@voip-test:/usr/src/asterisk-1.4.30-rc3# asterisk -V
Asterisk 1.4.30-rc3
root@voip-test:/usr/src/asterisk-1.4.30-rc3# gdb `which asterisk` /root/asterisk-crash.core
...
#0 0x00007f92aedf2b66 in poll () from /lib/libc.so.6
(gdb) thread 6
[Switching to thread 6 (process 26472)]#0 ast_rtp_write (rtp=0x31202f2045544956, _f=0x42532f00) at rtp.c:2875
2875 if (!rtp->them.sin_addr.s_addr)
(gdb) bt
#0 ast_rtp_write (rtp=0x31202f2045544956, _f=0x42532f00) at rtp.c:2875
#1 0x00007f9297996952 in sip_write (ast=0x14e6360, frame=0x42532f00) at chan_sip.c:3922
#2 0x00007f9297996c55 in sip_rtp_prod (data=<value optimized out>) at chan_sip.c:4650
#3 0x00000000004a92db in ast_sched_runq (con=<value optimized out>) at sched.c:363
#4 0x00007f92979800fa in do_monitor (data=<value optimized out>) at chan_sip.c:17082
#5 0x00000000004b6c7c in dummy_start (data=<value optimized out>) at utils.c:856
#6 0x00007f92af75bfc7 in start_thread () from /lib/libpthread.so.0
#7 0x00007f92aedfb5ad in clone () from /lib/libc.so.6
#8 0x0000000000000000 in ?? ()
(gdb) list
2874 /* If we have no peer, return immediately */
2875 if (!rtp->them.sin_addr.s_addr)
2876 return 0;

That rtp value is garbage and therefore the process is killed when trying to access it.

****** STEPS TO REPRODUCE ******

I'm not really sure what causes this. But these crashes started to happen when I was playing with INVITE+replaces style call-pickup.
My setup is as follows:

phone -> opensips -> asterisk

The NOTIFYs that hold the call-pickup info contain the asterisk server IP:

<dialog-info xmlns="urn:ietf:params:xml:ns:dialog-info" version="41" state="full" entity="511599201@sip1.sig.gntel.nl">
  <dialog id="15b4840f122793461992f9a20734a144@95.215.204.66" call-id="15b4840f122793461992f9a20734a144@95.215.204.66" local-tag="754751e353271775i3" remote-tag="as093f5e12" direction="recipient">
    <state>early</state>
    <remote>
      <identity display="pascal211">sip:211@95.215.204.66</identity>
      <target uri="sip:211@95.215.204.66"/>
    </remote>
    <local>
      <identity>sip:511599201@sip1.sig.gntel.nl</identity>
      <target uri="sip:511599201@sip1.sig.gntel.nl"/>
    </local>
  </dialog>
</dialog-info>

When doing call pickup, my NATed phone bypasses OpenSIPS and sends the INVITE+replaces to the Asterisk machine directly. Call pickup then does not work because (for starters) the Contact/Via contain RFC1918 IPs. This is no problem, because I'm only trying out which call-pickup methods work and which do not. But after a while I get crashes.

Unfortunately, the crash only happens after a while and (in the case of the attached core dump) not on such a Replace'ing INVITE. So, I cannot be entirely sure that it's the INVITE+replaces that causes the breakage, but it's my best guess at this time.

****** ADDITIONAL INFORMATION ******

If I find more info, I'll add that. I've marked it as private as I don't know how serious this might be.

Regards,
Walter Doekes
OSSO B.V. | ||
Comments: | By: Walter Doekes (wdoekes) 2010-04-16 07:05:00

The core dump: http://wjd.nu/files/2010/04/asterisk17193-rtp-crash.core

I hope it can tell someone anything. (Thanks leif for flagging this private.)

By: Walter Doekes (wdoekes) 2010-04-16 07:31:51

Okay, the good news is that it crashed without any INVITE+replaces this time. The bad news is that now the reproducibility is completely random to me. I shall now check if the machine (memory, motherboard) is at fault first.

By: Walter Doekes (wdoekes) 2010-04-16 08:50:29

FAIL! It was my own broken patch for bug 17012 that causes it. See the sip_rtp_prod in the backtrace that shouldn't be there. Phew. Sorry :-) I hope I haven't wasted too much of your time.

By: Leif Madsen (lmadsen) 2010-04-16 08:52:21

Closed based on feedback from reporter. Issue was in a local patch. |
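Editorial note: the `rtp=0x31202f2045544956` value in the backtrace can be checked for plausibility with a small helper. The sketch below is not part of the original report. It assumes a little-endian x86-64 host with 48-bit virtual addresses (consistent with the `0x00007f92...` frame addresses, but not stated by the reporter), and the function names `pointer_as_ascii` and `is_canonical_x86_64` are hypothetical. It prints the bytes the pointer would occupy in memory and tests whether the value is a canonical address at all.

```python
def pointer_as_ascii(value: int, width: int = 8) -> str:
    """Show a pointer-sized integer as the bytes it occupies in
    little-endian memory, replacing non-printable bytes with '.'."""
    raw = value.to_bytes(width, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


def is_canonical_x86_64(value: int) -> bool:
    """Assumption: 48-bit virtual addresses, so bits 63..47 must be
    all zeros or all ones for the address to be canonical."""
    top = value >> 47
    return top == 0 or top == (1 << 17) - 1


if __name__ == "__main__":
    rtp = 0x31202F2045544956  # value of 'rtp' from frame #0 of the backtrace
    print(repr(pointer_as_ascii(rtp)))   # 'VITE / 1'
    print(is_canonical_x86_64(rtp))      # False
```

All eight bytes are printable ASCII and the value is not a canonical address, which fits the reporter's reading that the pointer is garbage. It may hint that the structure holding the pointer was overwritten with string data, but the report does not confirm where the bytes came from.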