Summary: ASTERISK-08609: Core dumped when too many calls are queued
Reporter: Rus Rus (harbour)
Labels:
Date Opened: 2007-01-19 04:14:24.000-0600
Date Closed: 2007-07-11 19:59:15
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) all.bt
( 1) all.bt.full
( 2) bt
( 3) bt.full
( 4) trace

Description: Setup is:

sipp box -> [ asterisk box + misdn ] -> pri -> pbx

Asterisk dumps core while running sipp with '-sn uac -r x -l 30', where x ranges from 3 (takes about an hour to segfault) to 300 (takes about a minute to segfault). The PRI link has only 30 channels, but sipp tries to occupy more, so a race occurs somewhere. Additionally, I'm observing huge memory leaks when running top in another terminal, but that should be investigated after the crash is fixed. The last message is:

.....
Jan 19 11:35:00 WARNING[30444]: channel.c:897 ast_channel_free: Unable to find channel in list
.....

Core backtrace is attached. The segfault is 100% reproducible. System is:

Athlon 4200 64 X2 CPU
2GB RAM
Linux 2.6.19.2 kernel SMP, no-preempt
gcc 3.4.6 + latest binutils

I can supply any additional info.

****** ADDITIONAL INFORMATION ******

#0  0xb7d79847 in raise () from /lib/tls/libc.so.6
(gdb) backtrace
#0  0xb7d79847 in raise () from /lib/tls/libc.so.6
#1  0xb7d7b0d9 in abort () from /lib/tls/libc.so.6
#2  0xb7dad616 in __libc_message () from /lib/tls/libc.so.6
#3  0xb7db3d4f in _int_free () from /lib/tls/libc.so.6
#4  0xb7db40ea in free () from /lib/tls/libc.so.6
#5  0x0806152f in ast_channel_free (chan=0x81afa08) at channel.c:953
#6  0x080680bd in ast_hangup (chan=0x81afa08) at channel.c:1386
#7  0xb67b1dec in dial_exec_full (chan=0x8297e18, data=0x8253bc8,
   peerflags=0xb62f6f18) at app_dial.c:1165
#8  0xb67b55ed in dial_exec (chan=0x0, data=0x6) at app_dial.c:1655
#9  0x0809157d in pbx_extension_helper (c=0x8297e18, con=0x0,
   context=0x8297f68 "default", exten=0x829805c "660015", priority=1,
   label=0x0, callerid=0xb62fb0e0 "mISDN/1/660015", action=0) at pbx.c:554
#10 0x08092846 in __ast_pbx_run (c=0x8297e18) at pbx.c:2231
#11 0x080943bc in pbx_thread (data=0x0) at pbx.c:2518
#12 0xb7f3d20e in start_thread () from /lib/tls/libpthread.so.0
#13 0xb7e1a0de in clone () from /lib/tls/libc.so.6

Comments:By: Clod Patry (junky) 2007-01-24 08:12:26.000-0600

could ya attach a thread apply all bt too?

By: Rus Rus (harbour) 2007-01-24 08:21:47.000-0600

Sorry, I can't understand your question. If you want additional info, please tell me what kind you need.

By: Clod Patry (junky) 2007-01-24 08:52:03.000-0600

Please read backtrace.txt (or README.backtrace) in your * src doc/ dir.

By: Rus Rus (harbour) 2007-01-25 08:33:42.000-0600

'thread apply all bt' trace is attached

By: Serge Vecher (serge-v) 2007-01-30 14:17:23.000-0600

This may hold a clue to why the crash occurs:

#2  0x08061030 in channel_find_locked (prev=0x0, name=0xb7b553e0 "mISDN/1-",
   namelen=8, context=0x0, exten=0x0) at channel.c:775
#3  0x08061251 in ast_get_channel_by_name_prefix_locked (
   name=0xc2f4c88 "?!\032\bx???u9901", namelen=-516) at channel.c:802

1. Has Asterisk been compiled with 'make dont-optimize'? If not, can you please recompile and redo the backtraces as per junky's instructions?

By: Rus Rus (harbour) 2007-01-31 07:17:01.000-0600

Recompiled SVN-branch-1.2-r51271 with 'make dont-optimize'. The trace is attached.

By: Rus Rus (harbour) 2007-03-06 05:34:19.000-0600

I injected some debug printfs into channel.c. This is the output just before the crash:

.......
[Mar  6 13:19:51] WARNING[9180]: channel.c:956 ast_channel_free: chan before free - 0x9a929c8
[Mar  6 13:19:51] WARNING[9179]: channel.c:956 ast_channel_free: chan before free - 0xb5677f48
[Mar  6 13:19:51] WARNING[9180]: channel.c:956 ast_channel_free: chan before free - 0xb5662ed0
[Mar  6 13:19:51] WARNING[3359]: channel.c:956 ast_channel_free: chan before free - 0x83e5038
P[ 1]  No free channel at the moment @ send_event
P[ 1]  --> * Theres no Channel at the moment .. !
[Mar  6 13:19:51] WARNING[9183]: channel.c:956 ast_channel_free: chan before free - 0x83e5038
[Mar  6 13:19:51] WARNING[9182]: channel.c:901 ast_channel_free: Unable to find channel in list 0x83e5038
[Mar  6 13:19:51] WARNING[9182]: channel.c:956 ast_channel_free: chan before free - 0x83e5038
Asterisk2*CLI>
Disconnected from Asterisk server
.........

As anybody (is anybody still interested?!) can see, Asterisk crashed because it tried to free an already freed channel: 0x83e5038 is freed by threads 3359, 9183, and 9182. So I have some questions:
- What is the purpose of continuing to free a channel when it is not found in the list?
- How can this crash be debugged further? It seems the locking strategy in Asterisk has some fundamental design flaw or bug ;(

P.S. All of this applies to the 'stable' and svn 1.2 branches.
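
The first question above can be illustrated with a minimal sketch. This is not Asterisk source and not necessarily what the eventual fix does: `struct chan`, `chan_alloc`, `chan_unlink`, `chan_free_guarded`, and the `frees` counter are all hypothetical names. It only shows the idea of treating "not found in list" as a reason to skip `free()` instead of just logging a warning and continuing. It is single-threaded; in a real server the lookup, unlink, and free decision would presumably have to happen under the channel list lock.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a channel; NOT the real struct ast_channel. */
struct chan {
    int id;
    struct chan *next;
};

static struct chan *channels = NULL; /* head of the (hypothetical) channel list */
static int frees = 0;                /* counts real free() calls, for demonstration */

/* Allocate a channel and link it at the head of the list. */
static struct chan *chan_alloc(int id)
{
    struct chan *c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    c->id = id;
    c->next = channels;
    channels = c;
    return c;
}

/* Unlink c from the list. Returns 1 if it was linked, 0 if not found.
 * c is only dereferenced after it has been found in the list. */
static int chan_unlink(struct chan *c)
{
    struct chan **pp;

    assert(c != NULL);
    for (pp = &channels; *pp; pp = &(*pp)->next) {
        if (*pp == c) {
            *pp = c->next;
            return 1;
        }
    }
    return 0;
}

/* Free c only if it was still linked; a second call for the same
 * channel becomes a logged no-op instead of a double free. */
static int chan_free_guarded(struct chan *c)
{
    if (!chan_unlink(c)) {
        fprintf(stderr, "Unable to find channel in list; skipping free\n");
        return 0;
    }
    free(c);
    frees++;
    return 1;
}
```

Calling `chan_free_guarded()` twice on the same channel returns 1 the first time and 0 the second, with `free()` reached only once. The guard is a sketch with known limits: it relies on comparing a stale pointer value against the list, and it cannot tell a stale pointer from a new channel that happens to be allocated at the same address.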

By: Russell Bryant (russell) 2007-06-06 11:59:37

This should be fixed in 1.2, 1.4, and trunk in revisions 67717, 67716, and 67715. Let us know if you have any further problems.