Summary:ASTERISK-04634: [patch] segfault on using SIP_HEADER on dead channel
Reporter:Guenther Starnberger (gst)Labels:
Date Opened:2005-07-20 02:22:18Date Closed:2008-01-15 15:42:16.000-0600
Versions:Frequency of
Environment:Attachments:( 0) chan_sip_func_header_read.diff
( 1) transfer-without-python.txt
( 2) transfer-with-python.txt
Description:When using Asterisk with SIP transfers i'm able to reproduce a problem were the transfer does cause a segfault.

The only entry in the used context looks as follows:

exten => _.,1,Python(incoming_sip_dial,${EXTEN},${SIPDOMAIN},${SIP_HEADER(Proxy-Authorization)},proxy)

The Python App is part of the PyAstre module - because of the backtrace I don't think that PyAstre is responsible for the problem.

I can reproduce the problem by doing the following: A calls B. B puts A on hold and calls C. B transfers A to C. Segfault.

A is a Sipura ATA (1001), B is a IntelliTouch ITC-3002 and C is a Sipura SPA-841. I'm currently only able to reproduce the segfault when calling in this combination.

GDB shows this information:
------ 8< ------ 8< ------ 8< ------ 8< ------
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 524320 (LWP 9238)]
0x409b0776 in __get_header (req=0x17cc, name=0xbbbf64b7
"Proxy-Authorization", start=0xbbbf645c)
   at chan_sip.c:2517
2517    {
------ 8< ------ 8< ------ 8< ------ 8< ------
The backtrace:
------ 8< ------ 8< ------ 8< ------ 8< ------
#0  0x408b0776 in __get_header (req=0x17cc, name=0xbd9f64b7
 "Proxy-Authorization", start=0xbd9f645c)
 at chan_sip.c:2517
#1  0x408b0915 in get_header (req=0xbd9f645c, name=0xbd9f645c "") at
#2  0x408c1dac in func_header_read (chan=0x821b1b0, cmd=0xbd9f64ac
 data=0xbd9f64b7 "Proxy-Authorization", buf=0xbd9f8514 "",
 len=3181339740) at chan_sip.c:8108
#3  0x08080419 in ast_func_read (chan=0xbd9f645c, in=0xbd9f64b7
 workspace=0xbd9f645c "", len=3181339740) at pbx.c:1328
#4  0x08088901 in pbx_substitute_variables_helper_full (c=0x821b1b0,
 cp2=0xbd9fd739 "", count=8170, headp=0x0) at pbx.c:1466
ASTERISK-1  0x080806cd in pbx_substitute_variables_helper (c=0xbd9f645c,
 cp1=0xbd9f645c "", cp2=0xbd9f645c "",
 count=-1113627556) at pbx.c:1547
ASTERISK-2  0x08088e33 in pbx_extension_helper (c=0x821b1b0, con=0xbd9f645c,
 context=0x821b300 "incoming-sip",
 exten=0x821b3f4 "h", priority=1, label=0x0, callerid=0x81789ec "Python",
 at pbx.c:1623
ASTERISK-3  0x08089c86 in __ast_pbx_run (c=0x821b1b0) at pbx.c:2234
ASTERISK-4  0x0808a5d8 in pbx_thread (data=0x821b1b0) at pbx.c:2521
ASTERISK-5  0x40026e51 in pthread_start_thread () from /lib/libpthread.so.0
ASTERISK-6 0x40026ecf in pthread_start_thread_event () from /lib/libpthread.so.0
ASTERISK-7 0x402f792a in clone () from /lib/libc.so.6
------ 8< ------ 8< ------ 8< ------ 8< ------
I am using a CVS checkout from Jul 13. I noticed that in my CDRs the Proxy-Auth argument to the Dial Cmd is truncated (but it seems that my App is able to access the whole string nevertheless).

Do you need any further debugging information?


I've posted the same bug description to the mailinglist yesterday: http://lists.digium.com/pipermail/asterisk-dev/2005-July/014092.html (But the bug tracker seems to be the better place :)
Comments:By: Olle Johansson (oej) 2005-07-20 06:58:14

We need a complete SIP debug up to the point of the crash to see what happens within Asterisk. Set debug to 4 and verbose to 4 and capture the output (read the bug guidelines).

Also, can you try this without your python script and see if you can repeat it?

By: Mark Spencer (markster) 2005-07-20 08:59:30

Also please use the latest CVS head...

By: Guenther Starnberger (gst) 2005-07-20 09:48:09

I've attached two files to this bug. The first one ("transfer-with-python") is the log (sip debug, debug&verbose level = 4) with my Python app in the extensions.conf. In the second one I replaced the Python app with the following entries:

exten => 86,1,Set(foo=${SIP_HEADER(Proxy-Authorization)})
exten => 86,2,Dial(SIP/u6@voip.sysfrog.org)
exten => 84,1,Set(foo=${SIP_HEADER(Proxy-Authorization)})
exten => 84,2,Dial(SIP/u4@voip.sysfrog.org)
exten => 85,1,Set(foo=${SIP_HEADER(Proxy-Authorization)})
exten => 85,2,Dial(SIP/u5@voip.sysfrog.org)
exten => 56,1,Set(foo=${SIP_HEADER(Proxy-Authorization)})
exten => 56,2,Dial(SIP/gst@voip.sysfrog.org)
exten => u6,1,Set(foo=${SIP_HEADER(Proxy-Authorization)})
exten => u6,2,Dial(SIP/u6@voip.sysfrog.org)
exten => u4,1,Set(foo=${SIP_HEADER(Proxy-Authorization)})
exten => u4,2,Dial(SIP/u4@voip.sysfrog.org)
exten => u5,1,Set(foo=${SIP_HEADER(Proxy-Authorization)})
exten => u5,2,Dial(SIP/u5@voip.sysfrog.org)
exten => u6,1,Set(foo=${SIP_HEADER(Proxy-Authorization)})
exten => u6,2,Dial(SIP/gst@voip.sysfrog.org)

When not using the Python App there was no segfault, but several of these entries in the log (see transfer-without-python.txt):

"!!!!!!!---------------************* Why are we here with this packet???? BYE"

Finally I tried to use the Python App again, but pass the content of the "Proxy-Auth" header to another Var which I use as argument:

exten => _.,1,Set(proxyauth=${SIP_HEADER(Proxy-Authorization))
exten => _.,2,Python(incoming_sip_dial,${EXTEN},${SIPDOMAIN},${proxyauth},proxy)

With this configuration I don't get a segfault either (but therefore the "why are we here" log-msgs as in "tra44nsfer-without-python").

By: Guenther Starnberger (gst) 2005-07-20 09:49:28

Add-on to my last note:

Although the "!!!!!!!---------------************* Why are we here with this packet???? BYE" stuff was logged the transfer worked fine.

By: Olle Johansson (oej) 2005-07-20 10:03:06

The "Why are we here" is not important. It was a result of a discussion between Kevin and myself. I believe we have the answer to that question now... :-)

By: Olle Johansson (oej) 2005-07-20 10:13:12

When you REFER to extension 85 something happens in one case, but not the other... Strange. The REFER has no PROXY-auth header, so SIP_HEADER should return nothing in both cases. Anyone else that sees something?

By: Guenther Starnberger (gst) 2005-07-20 11:10:42

I just noticed a typo in this code :(

exten => _.,1,Set(proxyauth=${SIP_HEADER(Proxy-Authorization))
exten => _.,2,Python(incoming_sip_dial,${EXTEN},${SIPDOMAIN},${proxyauth},proxy)

the '}' is missing in the Set command, so the SIP_HEADER stuff isn't called.

i'll recheck this tomorrow (with a new cvs checkout) when i have physical access to the phones again.

By: Russell Bryant (russell) 2005-07-20 12:40:02

Just to throw this out there ...

Could this be caused by the fact that we are not holding the lock on the sip_pvt structure in func_header_read?  Then, maybe req is gone by the time we get into __get_header?

By: Guenther Starnberger (gst) 2005-07-21 04:25:39

Ok. I've now updated to CVS-HEAD and fixed the typo in the extensions.conf.

The result is that:
exten => _.,1,Set(proxyauth=${SIP_HEADER(Proxy-Authorization)})
exten => _.,2,Python(incoming_sip_dial,${EXTEN},${SIPDOMAIN},${proxyauth},proxy)
causes the segfault too.

When using the Python Application in the extensions.conf this time I get a little bit more information from GDB than in my first report:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 311315 (LWP 9234)]
0x408bf5de in __get_header (req=0x17d0, name=0xbd5f64a7 "Proxy-Authorization", start=0xbd5f644c) at chan_sip.c:2677
2677            for (pass = 0; name && pass < 2;pass++) {
(gdb) bt
#0  0x408bf5de in __get_header (req=0x17d0, name=0xbd5f64a7 "Proxy-Authorization", start=0xbd5f644c) at chan_sip.c:2677
#1  0x408b5b75 in get_header (req=0x0, name=0x0) at chan_sip.c:2703
#2  0x408c7930 in func_header_read (chan=0x8222948, cmd=0xbd5f649c "SIP_HEADER", data=0xbd5f64a7 "Proxy-Authorization", buf=0xbd5f8504 "",
   len=4096) at chan_sip.c:8161
#3  0x08082469 in ast_func_read (chan=0x0, in=0xbd5f64a7 "Proxy-Authorization", workspace=0x0, len=0) at pbx.c:1328
#4  0x0808a8d7 in pbx_substitute_variables_helper_full (c=0x8222948,
   cp1=0x81d5558 "incoming_sip_dial|${EXTEN}|${SIPDOMAIN}|${SIP_HEADER(Proxy-Authorization)}|proxy", cp2=0xbd5fd729 "", count=8170, headp=0x0)
   at pbx.c:1463
ASTERISK-1  0x0808271d in pbx_substitute_variables_helper (c=0x0, cp1=0x0, cp2=0x0, count=0) at pbx.c:1544
ASTERISK-2  0x0808ae75 in pbx_extension_helper (c=0x8222948, con=0x0, context=0x8222a98 "incoming-sip", exten=0x8222b8c "h", priority=1, label=0x0,
   callerid=0x1 <Address 0x1 out of bounds>, action=8192) at pbx.c:1620
ASTERISK-3  0x0808bda7 in __ast_pbx_run (c=0x8222948) at pbx.c:2149
ASTERISK-4  0x0808c768 in pbx_thread (data=0x8222948) at pbx.c:2436
ASTERISK-5  0x40026e51 in pthread_start_thread () from /lib/libpthread.so.0
ASTERISK-6 0x40026ecf in pthread_start_thread_event () from /lib/libpthread.so.0
ASTERISK-7 0x402f792a in clone () from /lib/libc.so.6

The interesting thing is that the problems appear to be in the hangup extension. I use a wildcard which also matches on the special extensions (because I also use non-numeric extensions), but my Applications detects if it was called with a special extension and exits if this is the case.

For testing purposes I tried to call my Python application with _X. instead of just _. (so that it isn't called for the hangup extension). This worked - no segfault.

Further I tried the following extension configuration:

exten => 86,1,Dial(SIP/u6@voip.sysfrog.org)
exten => 84,1,Dial(SIP/u4@voip.sysfrog.org)
exten => 85,1,Dial(SIP/u5@voip.sysfrog.org)
exten => 56,1,Dial(SIP/gst@voip.sysfrog.org)
exten => u6,1,Dial(SIP/u6@voip.sysfrog.org)
exten => u4,1,Dial(SIP/u4@voip.sysfrog.org)
exten => u5,1,Dial(SIP/u5@voip.sysfrog.org)
exten => u6,1,Dial(SIP/gst@voip.sysfrog.org)
exten => h,1,Set(foo=${SIP_HEADER(Proxy-Authorization)})

This causes a segfault when transfering.

So it seems that the segfault occurs when the SIP header is read in the hangup extension.

By: Guenther Starnberger (gst) 2005-07-21 04:31:28

Another thing which may be related to this:

exten => 100,1,SubString,foo=${SIP_HEADER(From)}|0|-4
exten => 100,2,Dial(Zap/r1/004369910194306)

works fine. But when using:

exten => 100,1,SubString,foo=${SIP_HEADER(Proxy-Authorization)}|0|-4
exten => 100,2,Dial(Zap/r1/004369910194306)

I get a segfault as soon as I call 100 (without doing a transfer).

GDB says:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 311315 (LWP 8928)]
pbx_builtin_setvar_helper (chan=0x70696f76, name=0xbd5f93d8 "foo",
   value=0xbd5f94d4 "Digest username=\"gst\", realm=\"voip.sysfrog.org\", nonce=\"XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\", uri=\"sip:100@voip.sysfrog.org\", response=\"XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\", algorithm\235\uffff@$8\"\b07\"\bp&\037\b"...) at pbx.c:5661
5661            AST_LIST_TRAVERSE (headp, newvariable, entries) {
(gdb) bt
#0  pbx_builtin_setvar_helper (chan=0x70696f76, name=0xbd5f93d8 "foo",
   value=0xbd5f94d4 "Digest username=\"gst\", realm=\"voip.sysfrog.org\", nonce=\"XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\", uri=\"sip:100@voip.sysfrog.org\", response=\"XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\", algorithm\235\uffff@$8\"\b07\"\bp&\037\b"...) at pbx.c:5661
#1  0x40bea010 in substring_exec (chan=0x3, data=0x70696f76) at app_substring.c:105
#2  0x40303031 in _argp_unlock_xxx () from /lib/libc.so.6
Previous frame inner to this frame (corrupt stack?)

Just a FYI as the SubString App is deprecated nevertheless.

By: Olle Johansson (oej) 2005-07-21 05:36:18

So this can be caused by calling SIP_HEADER on a dead channel... That's something I did not consider, propably not Russel either, when we created these functions. The packets may very well be gone by then. We need to check if the channel is still alive in the SIP_HEADER function.

By: dbruce (dbruce) 2005-07-23 05:32:40

As oej did not provide a fix for the situation where the channel may be middle of the hangup process, I have provided a patch, for testing, that should remedy the situation. Please test and report back.

patch is based on CVS-HEAD-2005-07-22

By: Guenther Starnberger (gst) 2005-07-25 06:31:48

Thanks - your patch seems to work - the problem with the segfault doesn't occur anymore:

Jul 25 13:14:07 WARNING[26524]: chan_sip.c:8165 func_header_read: SIP channel private structures no longer exist.
   -- Executing Python("SIP/voip.sysfrog.org-081f7730<ZOMBIE>", "incoming_sip_dial|h|||proxy") in new stack

does "ZOMBIE" here just mean that the channel was already destroyed?

i've still got a problem with one-way-audio when doing transfers in the same combination which previously caused the segfault. it works when i set canreinvite to 'no' so the problem is most likely a broken SIP-ALG on our firewall here.

and still another problem with transfers which i can't reproduce currently: after setting canreinvite to no and doing the transfer phone A was able to hear the voice from phone B, but phone B was still connected to the music-on-hold from asterisk. this occured only once - i wasn't able to reproduce it.

By: Olle Johansson (oej) 2005-07-25 08:42:42

Let's take one problem at a time. dbruce's patch fixed the segfault problem. Thank you, dbruce, for producing the patch.

dbruce, you need to confirm that you have a disclaimer on file.

By: dbruce (dbruce) 2005-07-25 10:51:54

sorry about that...

Yes... disclaimer is on file.

By: Kevin P. Fleming (kpfleming) 2005-07-25 13:07:50

I've committed a simpler fix to CVS HEAD; there is no reason to hold the iflock in the function, since we already hold the channel's lock and the pvt structure will not be unlinked from the channel while the channel lock is held.

Thanks dbruce for the patch and everyone else for the debugging (please move discussion of the other potential problem into a new bug, if necessary).

By: Digium Subversion (svnbot) 2008-01-15 15:42:16.000-0600

Repository: asterisk
Revision: 6187

U   trunk/channels/chan_sip.c

r6187 | kpfleming | 2008-01-15 15:42:15 -0600 (Tue, 15 Jan 2008) | 2 lines

ensure SIP_HEADER does not segfault when called on hung-up channels (bug ASTERISK-4634, different fix than posted patch)