Summary: ASTERISK-14916: [patch] segfault in 1.6.1.6 in _ao2_find, called from chan_iax2 after approx. 75.000 calls
Reporter: Robert Verspuy (exarv)
Labels:
Date Opened: 2009-10-01 02:37:55
Date Closed: 2010-03-11 09:34:43.000-0600
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Channels/chan_iax2
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) asterisk_backtrace_core_30366.zip
( 1) asterisk_backtrace_core_5710.zip
( 2) iax_fix.diff
Description: I have been running Asterisk 1.6.1.6 for 19 days now,
but I have had a segfault at the same address three times.

Sep 17 13:03:57 switch02 kernel: asterisk[13597]: segfault at
00002aaa0000000a rip 0000000000435c36 rsp 00000000420b1440 error 4
    In the meantime, 75,517 calls were set up.
Sep 22 19:07:13 switch02 kernel: asterisk[23982]: segfault at
00002aaa0000000a rip 0000000000435c36 rsp 0000000042142440 error 4
    In the meantime, 76,725 calls were set up.
Sep 28 14:26:16 switch02 kernel: asterisk[2777]: segfault at
00002aaa0000000a rip 0000000000435c36 rsp 0000000042088440 error 4
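
The trailing "error 4" in these kernel lines is the x86 page-fault error code. A minimal sketch of how its low three bits can be decoded follows; the names `decode_pf_error`, `describe_pf_error` and `struct pf_info` are made up for illustration and are not part of Asterisk or the kernel. Under this decoding, error 4 means a user-mode read of a page that is not mapped, which fits a dereference of a bogus pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Low bits of the x86 page-fault error code shown as "error N". */
enum {
    PF_PROT  = 1 << 0, /* 0: page not present, 1: protection violation */
    PF_WRITE = 1 << 1, /* 0: read access,      1: write access */
    PF_USER  = 1 << 2  /* 0: kernel mode,      1: user mode */
};

/* Hypothetical helper type, for illustration only. */
struct pf_info {
    bool page_present;
    bool is_write;
    bool user_mode;
};

/* Split the error code into its three low flag bits. */
struct pf_info decode_pf_error(unsigned long code)
{
    struct pf_info info;

    info.page_present = (code & PF_PROT) != 0;
    info.is_write = (code & PF_WRITE) != 0;
    info.user_mode = (code & PF_USER) != 0;
    return info;
}

/* Write a short human-readable description into buf. */
int describe_pf_error(unsigned long code, char *buf, size_t len)
{
    struct pf_info info = decode_pf_error(code);

    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s-mode %s of a %s page",
                    info.user_mode ? "user" : "kernel",
                    info.is_write ? "write" : "read",
                    info.page_present ? "protected" : "non-present");
}
```

Calling `describe_pf_error(4, buf, sizeof(buf))` fills `buf` with a description of the fault seen in all three log lines.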

It's a live server running production traffic, so I can't easily test
a different version.
Also, the issue only happens about once a week (roughly once every
75,000 calls).
The segfault didn't happen at the busiest times (Sunday), but on the
quieter days.

At the time of the last crash the server had 50 calls and 97 channels
(43 chan_ss7 channels, 46 SIP channels and 8 IAX2 channels).

Software running:
 - CentOS 5 (latest updates as of 11 sept 2009)
 - asterisk 1.6.1.6
 - chan_ss7 1.2.1
 - dahdi-linux 2.2.0.2
 - dahdi-tools 2.2.0
 - wanpipe 3.5.6




****** ADDITIONAL INFORMATION ******

gdb on coredump:

Core was generated by `/usr/sbin/asterisk -f -vvvg -c'.
Program terminated with signal 11, Segmentation fault.
#0  _ao2_find (c=0x2aaa00000002, arg=0x420884d0, flags=OBJ_POINTER) at astobj2.c:712
712        return __ao2_callback(c,flags, cb_fn, arg, NULL, NULL, 0, NULL);


(gdb) bt
#0  _ao2_find (c=0x2aaa00000002, arg=0x420884d0, flags=OBJ_POINTER) at
astobj2.c:712
#1  0x00002aaac090e2ae in __find_callno (callno=1, dcallno=26995,
sin=0x4208ce80, new=0, sockfd=14, return_locked=0, check_dcallno=1) at
chan_iax2.c:2450
#2  0x00002aaac091b570 in socket_process (thread=0x7c0eaa0) at
chan_iax2.c:2582
#3  0x00002aaac0926459 in iax2_process_thread (data=0x7c0eaa0) at
chan_iax2.c:10933
#4  0x00000000004f28ac in dummy_start (data=<value optimized out>) at
utils.c:968
#5  0x0000003b07a06367 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003b06ed309d in clone () from /lib64/libc.so.6


(gdb) print c
$1 = (struct ao2_container *) 0x2aaa00000002


(gdb) ptype c
type = struct ao2_container {
   ao2_hash_fn *hash_fn;
   ao2_callback_fn *cmp_fn;
   int n_buckets;
   int elements;
   int version;
   struct bucket buckets[0];
} *


(gdb) print c.elements
Cannot access memory at address 0x2aaa00000016


(gdb) frame 1
#1  0x00002aaac090e2ae in __find_callno (callno=1, dcallno=26995, sin=0x4208ce80, new=0, sockfd=14, return_locked=0, check_dcallno=1) at chan_iax2.c:2450
2450 if ((pvt = ao2_find(iax_transfercallno_pvts, &tmp_pvt, OBJ_POINTER))) {


(gdb) list
2445 return res;
2446 }
2447 /* this searches for transfer call numbers that might not get caught otherwise */
2448 memset(&tmp_pvt.addr, 0, sizeof(tmp_pvt.addr));
2449 memcpy(&tmp_pvt.transfer, sin, sizeof(tmp_pvt.addr));
2450 if ((pvt = ao2_find(iax_transfercallno_pvts, &tmp_pvt, OBJ_POINTER))) {
2451 if (return_locked) {
2452 ast_mutex_lock(&iaxsl[pvt->callno]);
2453 }
2454 res = pvt->callno;

(gdb) print iax_transfercallno_pvts
$2 = (struct ao2_container *) 0x2aaa00000002


(gdb) ptype iax_transfercallno_pvts
type = struct ao2_container {
   ao2_hash_fn *hash_fn;
   ao2_callback_fn *cmp_fn;
   int n_buckets;
   int elements;
   int version;
   struct bucket buckets[0];
} *


(gdb) print iax_transfercallno_pvts.elements
Cannot access memory at address 0x2aaa00000016


I checked the last two core dumps;
in both of them the sin variable (in frame 1) held
the IP address of our other VoIP server,
which runs Asterisk 1.4.26.2.

So maybe this Asterisk 1.4.26.2 is sending some IAX traffic that makes the 1.6.1.6 server corrupt the iax_transfercallno_pvts pointer?


Comments:

By: Russell Bryant (russell) 2009-10-02 11:36:15

We do not support versions of Asterisk that include 3rd party code (in this case, chan_ss7).  Feel free to open a new issue if you can reproduce the problem without any modifications in use.

By: Robert Verspuy (exarv) 2009-10-05 02:24:16

Wait a minute. Not so fast.

Yes I'm using chan_ss7, but it's just one of the modules.

In the backtrace it's very clear that no chan_ss7 is included.

I already added some extra info, where you can see that the iax_transfercallno_pvts pointer is incorrect / corrupt and not pointing to a correct memory address.

Also it's a live server, so I can't just unload the chan_ss7 module.


By: Robert Verspuy (exarv) 2009-10-05 05:08:20

Just had another crash:

Oct  5 11:19:20 switch02 kernel: asterisk[18505]: segfault at 00002aaa0000000a rip 0000000000435c36 rsp 0000000042388440 error 4
97660 calls and 6 days, 21 hours after the last crash

gdb:
Core was generated by `/usr/sbin/asterisk -f -vvvg -c'.
Program terminated with signal 11, Segmentation fault.
#0  _ao2_find (c=0x2aaa00000002, arg=0x423884d0, flags=OBJ_POINTER) at astobj2.c:712
712 return __ao2_callback(c,flags, cb_fn, arg, NULL, NULL, 0, NULL);


(gdb) bt
#0  _ao2_find (c=0x2aaa00000002, arg=0x423884d0, flags=OBJ_POINTER) at astobj2.c:712
#1  0x00002aaac090e2ae in __find_callno (callno=1, dcallno=17150, sin=0x4238ce80, new=0, sockfd=14, return_locked=0, check_dcallno=1) at chan_iax2.c:2450
#2  0x00002aaac091b570 in socket_process (thread=0x1a07940) at chan_iax2.c:2582
#3  0x00002aaac0926459 in iax2_process_thread (data=0x1a07940) at chan_iax2.c:10933
#4  0x00000000004f28ac in dummy_start (data=<value optimized out>) at utils.c:968
#5  0x0000003b07a06367 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003b06ed309d in clone () from /lib64/libc.so.6


(gdb) print iax_transfercallno_pvts
$1 = (struct ao2_container *) 0x2aaa00000002


(gdb) ptype iax_transfercallno_pvts
type = struct ao2_container {
   ao2_hash_fn *hash_fn;
   ao2_callback_fn *cmp_fn;
   int n_buckets;
   int elements;
   int version;
   struct bucket buckets[0];
} *


(gdb) print iax_transfercallno_pvts.elements
Cannot access memory at address 0x2aaa00000016

By: Robert Verspuy (exarv) 2009-10-05 05:34:01

Some information from the asterisk manager about the iax channel that crashed:

'Event' => 'Newchannel'
'Channel' => 'IAX2/internal_switch01-17150'
'ChannelState' => '0'
'ChannelStateDesc' => 'Down'
'CallerIDNum' => false
'CallerIDName' => false
'AccountCode' => false
'Uniqueid' => '1254734360.127385'

'Event' => 'ChannelUpdate'
'Channel' => false
'Channeltype' => 'IAX2'
'IAX2-callno-local' => '17150'
'IAX2-callno-remote' => '0'
'IAX2-peer' => false

'Event' => 'Newstate'
'Channel' => 'IAX2/internal_switch01-17150'
'ChannelState' => '5'
'ChannelStateDesc' => 'Ringing'
'CallerIDNum' => '4915xxxxxxxxx'
'CallerIDName' => 'xxxxxxxxxxxxx'
'Uniqueid' => '1254734360.127385'

'Event' => 'Dial'
'SubEvent' => 'Begin'
'Channel' => 'SIP/xxxxxxxx-c81678d8'
'Destination' => 'IAX2/internal_switch01-17150'
'CallerIDNum' => '4915xxxxxxxxx'
'CallerIDName' => 'xxxxxxxxxxxxx'
'UniqueID' => '1254734284.127360'
'DestUniqueID' => '1254734360.127385'
'Dialstring' => 'internal_switch01/89986xxxxxxxxxxx'

'Event' => 'NewCallerid'
'Channel' => 'IAX2/internal_switch01-17150'
'CallerIDNum' => '004969xxxxxxxx'
'CallerIDName' => false,
'Uniqueid' => '1254734360.127385'
'CID-CallingPres' => '0 (Presentation Allowed, Not Screened)'

By: Russell Bryant (russell) 2009-10-05 19:24:32

Even though the backtrace does not show a crash in chan_ss7, it is very possible that it could be the cause of memory corruption leading to a crash in chan_iax2.

Our policy is that we don't support systems with 3rd party code in use.  With the extremely high workload on our plate, it's just not worth the potentially wasted time.

If you can reproduce the problem without the module in use, then feel free to reopen this report.

By: David Ruggles (thedavidfactor) 2010-01-04 09:41:09.000-0600

Reopened as requested by Wolfgang on the dev list

By: Wolfgang Pichler (wuwu) 2010-01-05 02:06:26.000-0600

I asked for this issue to be reopened because I have had exactly the same crash (the backtrace is identical) on a machine running Asterisk 1.6.1.6, without chan_ss7 or any other third-party module loaded. So the crash is not caused by a third-party module.
I am also using many IAX connections (in trunking mode), forwarded to SIP channels. The crash also happened only after some time, after around 100,000 calls.
This was the first time I have had the crash; I have now updated to 1.6.2.0.
I will report whether the crash also happens there.
On the same machine I was also hit by the bug https://issues.asterisk.org/view.php?id=16058

Could this be related?

By: Robert Verspuy (exarv) 2010-01-06 03:53:10.000-0600

I've changed the IAX2 path to a SIP path on my servers with chan_ss7, and I have had no crashes since. This was easy for me, because I only used IAX2 trunks between our own Asterisk servers.

I did not have any music on hold for the IAX calls when crashing, so I doubt this is related to ASTERISK-14976.

By: Jens von Bülow (jensvb) 2010-01-19 07:00:22.000-0600

Hi All,

Asterisk 1.4.28 crashed today with the same problem... Full backtrace attached.

<snip>
Thread 1 (process 30394):

#0  0x000000000042d579 in ao2_find (c=0x1800076c2, arg=0x42182f00, flags=OBJ_POINTER) at astobj2.c:571

#1  0x00002aaac611587b in __find_callno (callno=15348, dcallno=0, sin=0x42185cd0, new=0, sockfd=18, return_locked=1, check_dcallno=0) at chan_iax2.c:2391

#2  0x00002aaac6116732 in find_callno_locked (callno=15348, dcallno=0, sin=0x42185cd0, new=0, sockfd=18, full_frame=0) at chan_iax2.c:2544

#3  0x00002aaac6132525 in socket_process (thread=0x2aaab85e73b0) at chan_iax2.c:8427

#4  0x00002aaac613e7dd in iax2_process_thread (data=0x2aaab85e73b0) at chan_iax2.c:10041

#5  0x00000000004cb7a9 in dummy_start (data=0x2aaab85e3560) at utils.c:856

#6  0x0000003811806617 in start_thread () from /lib64/libpthread.so.0

#7  0x00000038110d3c2d in clone () from /lib64/libc.so.6
<snip>

Regards
Jens

By: Jens von Bülow (jensvb) 2010-02-01 04:02:03.000-0600

Hi All,

Asterisk 1.4.28 crashed again today with the same problem... Full backtrace attached.

Regards
Jens

By: David Vossel (dvossel) 2010-02-01 12:48:01.000-0600

It appears that everyone that is experiencing this crash is using IAX in trunking mode.

By: Alisher (licedey) 2010-02-09 13:06:33.000-0600

I am also experiencing the same problem with IAX in trunking mode on Asterisk 1.4.29. There were no problems when we used Asterisk 1.4.20. The server started to crash after the recent upgrade to Asterisk 1.4.29.

By: David Vossel (dvossel) 2010-02-09 16:16:41.000-0600

Do any of you have the IAX_OLD_FIND flag enabled?

By: David Vossel (dvossel) 2010-02-09 16:31:15.000-0600

I uploaded a patch that may resolve the issue. Please test it and let me know if the crash occurs.

By: David Vossel (dvossel) 2010-02-09 16:46:07.000-0600

This is really the only explanation for the issue you all are reporting I can see.  I'm going to close the issue. Re-open it if you experience the crash again.

By: Digium Subversion (svnbot) 2010-02-09 16:55:41.000-0600

Repository: asterisk
Revision: 245792

U   branches/1.4/channels/chan_iax2.c

------------------------------------------------------------------------
r245792 | dvossel | 2010-02-09 16:55:39 -0600 (Tue, 09 Feb 2010) | 12 lines

Fixes iaxs and iaxsl size off by one issue.

2^15 = 32768 which is the maximum allowed iax2 callnumber.
Creating the iaxs and iaxsl array of size 32768 means the maximum
callnumber is actually out of bounds.  This causes a nasty crash.

(closes issue ASTERISK-14916)
Reported by: exarv
Patches:
     iax_fix.diff uploaded by dvossel (license 671)


------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=245792
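
The off-by-one described in the commit message above can be illustrated with a small sketch. This is not the content of iax_fix.diff, which is not reproduced on this page. The names `MAX_CALLNO`, `OLD_SLOTS`, `NEW_SLOTS`, `pvt_table`, `callno_in_bounds` and `lookup_pvt` are invented for illustration. The point is only that an array with 32768 slots has valid indices 0 through 32767, so indexing it with call number 32768 touches memory just past the end, which may belong to some unrelated variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CALLNO 32768            /* 2^15, maximum call number per the commit message */
#define OLD_SLOTS  MAX_CALLNO       /* valid indices 0..32767: MAX_CALLNO is out of bounds */
#define NEW_SLOTS  (MAX_CALLNO + 1) /* valid indices 0..32768: MAX_CALLNO fits */

/* True if callno is a valid index into a table with the given slot count. */
bool callno_in_bounds(unsigned int callno, size_t slots)
{
    return callno < slots;
}

/* Hypothetical per-call table, sized so the maximum call number is a valid index. */
static void *pvt_table[NEW_SLOTS];

/* Bounds-checked lookup; returns the stored pointer (NULL if unset). */
void *lookup_pvt(unsigned int callno)
{
    assert(callno_in_bounds(callno, NEW_SLOTS));
    return pvt_table[callno];
}
```

With the old sizing, `callno_in_bounds(MAX_CALLNO, OLD_SLOTS)` is false, while with one extra slot the same call number is a valid index.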

By: Digium Subversion (svnbot) 2010-02-09 17:07:18.000-0600

Repository: asterisk
Revision: 245793

_U  trunk/
U   trunk/channels/chan_iax2.c

------------------------------------------------------------------------
r245793 | dvossel | 2010-02-09 17:07:17 -0600 (Tue, 09 Feb 2010) | 18 lines

Merged revisions 245792 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
 r245792 | dvossel | 2010-02-09 16:55:38 -0600 (Tue, 09 Feb 2010) | 12 lines
 
 Fixes iaxs and iaxsl size off by one issue.
 
 2^15 = 32768 which is the maximum allowed iax2 callnumber.
 Creating the iaxs and iaxsl array of size 32768 means the maximum
 callnumber is actually out of bounds.  This causes a nasty crash.
 
 (closes issue ASTERISK-14916)
 Reported by: exarv
 Patches:
       iax_fix.diff uploaded by dvossel (license 671)
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=245793

By: Digium Subversion (svnbot) 2010-02-09 17:12:00.000-0600

Repository: asterisk
Revision: 245794

_U  branches/1.6.2/
U   branches/1.6.2/channels/chan_iax2.c

------------------------------------------------------------------------
r245794 | dvossel | 2010-02-09 17:11:59 -0600 (Tue, 09 Feb 2010) | 25 lines

Merged revisions 245793 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r245793 | dvossel | 2010-02-09 17:07:17 -0600 (Tue, 09 Feb 2010) | 18 lines
 
 Merged revisions 245792 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r245792 | dvossel | 2010-02-09 16:55:38 -0600 (Tue, 09 Feb 2010) | 12 lines
   
   Fixes iaxs and iaxsl size off by one issue.
   
   2^15 = 32768 which is the maximum allowed iax2 callnumber.
   Creating the iaxs and iaxsl array of size 32768 means the maximum
   callnumber is actually out of bounds.  This causes a nasty crash.
   
   (closes issue ASTERISK-14916)
   Reported by: exarv
   Patches:
         iax_fix.diff uploaded by dvossel (license 671)
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=245794

By: Digium Subversion (svnbot) 2010-02-09 17:13:08.000-0600

Repository: asterisk
Revision: 245795

_U  branches/1.6.1/
U   branches/1.6.1/channels/chan_iax2.c

------------------------------------------------------------------------
r245795 | dvossel | 2010-02-09 17:13:08 -0600 (Tue, 09 Feb 2010) | 25 lines

Merged revisions 245793 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r245793 | dvossel | 2010-02-09 17:07:17 -0600 (Tue, 09 Feb 2010) | 18 lines
 
 Merged revisions 245792 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r245792 | dvossel | 2010-02-09 16:55:38 -0600 (Tue, 09 Feb 2010) | 12 lines
   
   Fixes iaxs and iaxsl size off by one issue.
   
   2^15 = 32768 which is the maximum allowed iax2 callnumber.
   Creating the iaxs and iaxsl array of size 32768 means the maximum
   callnumber is actually out of bounds.  This causes a nasty crash.
   
   (closes issue ASTERISK-14916)
   Reported by: exarv
   Patches:
         iax_fix.diff uploaded by dvossel (license 671)
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=245795

By: Digium Subversion (svnbot) 2010-02-09 17:14:10.000-0600

Repository: asterisk
Revision: 245796

_U  branches/1.6.0/
U   branches/1.6.0/channels/chan_iax2.c

------------------------------------------------------------------------
r245796 | dvossel | 2010-02-09 17:14:10 -0600 (Tue, 09 Feb 2010) | 25 lines

Merged revisions 245793 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r245793 | dvossel | 2010-02-09 17:07:17 -0600 (Tue, 09 Feb 2010) | 18 lines
 
 Merged revisions 245792 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r245792 | dvossel | 2010-02-09 16:55:38 -0600 (Tue, 09 Feb 2010) | 12 lines
   
   Fixes iaxs and iaxsl size off by one issue.
   
   2^15 = 32768 which is the maximum allowed iax2 callnumber.
   Creating the iaxs and iaxsl array of size 32768 means the maximum
   callnumber is actually out of bounds.  This causes a nasty crash.
   
   (closes issue ASTERISK-14916)
   Reported by: exarv
   Patches:
         iax_fix.diff uploaded by dvossel (license 671)
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=245796