Summary: ASTERISK-10404: segmentation faults on installation with 3000 calls/day.
Reporter: Peter Kozak (spag)
Labels:
Date Opened: 2007-09-28 06:57:30
Date Closed: 2007-11-05 14:12:57.000-0600
Priority: Critical
Regression? No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) ivan_bt_full_10840.log
Description: Asterisk crashes 2-3 times a day on a Debian (sarge) system (Core 2 Duo 2GHz, 4GB RAM).

Tested on Asterisk 1.4.10.1 and 1.4.11

Asterisk crashes only during normal office hours (3000 calls a day), never when the system is idle.

I wasn't able to reproduce these crashes; all I can do is wait patiently until it happens again.

Sometimes the asterisk process simply stops responding (and eats 100% of CPU); sometimes it crashes with signal 11.

SSH access to the affected machine will be granted on request.


Core dumped:

Core was generated by `/usr/sbin/asterisk -f -vg'.
Program terminated with signal 11, Segmentation fault.

(gdb) bt full
#0  0xb7d95709 in free () from /lib/tls/i686/cmov/libc.so.6
No symbol table info available.
#1  0x080a7d68 in ast_frame_free (fr=0xa0059a4, cache=1) at frame.c:360
       __PRETTY_FUNCTION__ = "ast_frame_free"
#2  0x080891b8 in ast_generic_bridge (c0=0xa002da0, c1=0xa004b40, config=0xb62c6270, fo=0xb62c5f20, rc=0xb62c5f1c, bridge_end=
     {tv_sec = 0, tv_usec = 0}) at /var/tmp/src/asterisk-1.4.10.1/include/asterisk/frame.h:390
       who = (struct ast_channel *) 0xa004b40
       other = (struct ast_channel *) 0xa002da0
       cs = {0xa002da0, 0xa004b40, 0xa004b40}
       f = (struct ast_frame *) 0xa0059a4
       res = AST_BRIDGE_COMPLETE
       o0nativeformats = 8
       o1nativeformats = 64
       watch_c0_dtmf = 0
       watch_c1_dtmf = 0
       pvt0 = (void *) 0x9de8f90
       pvt1 = (void *) 0x10
       frame_put_in_jb = 0
       jb_in_use = 0
       to = -1
       __PRETTY_FUNCTION__ = "ast_generic_bridge"
#3  0x0808a20f in ast_channel_bridge (c0=0xa002da0, c1=0xa004b40, config=0xb62c6270, fo=0xb62c5f20, rc=0xb62c5f1c) at channel.c:4294
       now = {tv_sec = 0, tv_usec = 0}
       to = -1
       who = (struct ast_channel *) 0x0
       res = AST_BRIDGE_COMPLETE
       nativefailed = 0
       firstpass = 1
       o0nativeformats = 8
       o1nativeformats = 64
       time_left_ms = 0
       nexteventts = {tv_sec = 0, tv_usec = 0}
       caller_warning = 0 '\0'
       callee_warning = 0 '\0'
       __PRETTY_FUNCTION__ = "ast_channel_bridge"
#4  0xb7736cf1 in ast_bridge_call (chan=0xa002da0, peer=0xa004b40, config=0xb62c6270) at res_features.c:1394
       other = (struct ast_channel *) 0x130f
       f = (struct ast_frame *) 0x0
       who = (struct ast_channel *) 0x8141a9f
       chan_featurecode = '\0' <repeats 11 times>
       peer_featurecode = '\0' <repeats 11 times>
       res = 0
       diff = -1
       hasfeatures = 0
       hadfeatures = 0
       aoh = (struct ast_option_header *) 0xb62c62a4
       backup_config = {features_caller = {flags = 0}, features_callee = {flags = 0}, start_time = {tv_sec = 0, tv_usec = 0},
 feature_timer = 0, timelimit = 0, play_warning = 0, warning_freq = 0, warning_sound = 0x0, end_sound = 0x0, start_sound = 0x0,
 firstpass = 0, flags = 0}
       bridge_cdr = (struct ast_cdr *) 0xb62c5f78
       __PRETTY_FUNCTION__ = "ast_bridge_call"
#5  0xb6aae4e8 in dial_exec_full (chan=0xa002da0, data=0xb62c8ff8, peerflags=0xb62c6e64, continue_exec=0x0) at app_dial.c:1651
       config = {features_caller = {flags = 0}, features_callee = {flags = 0}, start_time = {tv_sec = 1190822756, tv_usec = 9094},
 feature_timer = 0, timelimit = 0, play_warning = 0, warning_freq = 0, warning_sound = 0x0, end_sound = 0x0, start_sound = 0x0,
 firstpass = 0, flags = 0}
       number = 0x9f678d1 "iaxmodem01/718"
       end_time = 42
       answer_time = 1190822756
       res = 0
       u = (struct ast_module_user *) 0x8769b08
       rest = 0x0
       cur = 0x0
       outgoing = (struct dial_localuser *) 0x0
       peer = (struct ast_channel *) 0xa004b40
       to = -1
       numbusy = 0
       numcongestion = 0
       numnochan = 0
       cause = 0
       numsubst = "iaxmodem01/718\000\b$$?\017\023\000\000<8\024\b??000\000\000\000\000\000\000\000\035.?4m,Sep 26 18:05:55\000-\000\000\000?,4?-\000\000\000\204?\001\000\000\000\000`-\000\000\000??\024l,\225??\000`-\000\000\000l\031\000\n-\000\0003\000`??\000\000\000?3 l,/?Dl,\036??\000`-\000\000\000\000\000\000\000??9$$?$\024\b"...
       cidname = '\0' <repeats 79 times>
       privdb_val = 0
       calldurationlimit = 0
       timelimit = 0
       play_warning = 0
       warning_freq = 0
       warning_sound = 0x0
       end_sound = 0x0
       start_sound = 0x0
       dtmfcalled = 0x0
       dtmfcalling = 0x0
       status = "ANSWER\000R\000GS", '\0' <repeats 244 times>
       play_to_caller = 0
       play_to_callee = 0
       sentringing = 0
       moh = 0
       outbound_group = 0x0
       result = 0
       start_time = 1190822755
       privintro = "@g,?000?\003\000\000\000}\020?f,Pf,y?\223\037?\017\000\000\000??@?r?024\br?024\b\002\000\000\000??234l,xl,_\234l,p?024\b\002\000\000\000??\003(\025\b\003(\025\b\002\000\000\000??001(\025\b\002\000\000\000Pl,_?,\001(\025\b\002", '\0' <repeats 11 times>, "?202>\000\024l,\035\000\000\000\000\000\000\000\000\200l,0m,\000\000\000\000E\003\000\000\000\000\000\000\000\020\000\000\b\000\000\000\000\000\000\000c\203F\000"...
       privcid = '\0' <repeats 18 times>, " s*\000\000\000\000\000\000\000Pe,??e,c???f,?000?\003\000\000\000}\020?e,\001[?-j,\236f,\002", '\0' <repeats 19 times>, "\030\000\000S??Z?m\031\000\n\001\000\000\000???\206?\206\001\000\000\000??\024\b\000\000\000\000?,_?\206\000\000\000\000\030\000\000\000\214j,7\203F\n\b\000\000\000\000\000\000\000\000\000\000?202>\000\201\000\000\001", '\0' <repeats 23 times>, "E\003\000\000\000\000"...
       parse = 0xb62c5fe0 "IAX2"
       opermode = 0
       args = {argc = 1, argv = 0xb62c6510, peers = 0xb62c5fe0 "IAX2", timeout = 0x0, options = 0x0, url = 0x0}
       opts = {flags = 0}
       opt_args = {0x814b0c4 "%s", 0xb62c69cc ",$\024\b$\024\b\030j,3\234\020\b?", 0x0, 0x0, 0x0,
 0x1 <Address 0x1 out of bounds>, 0xb62c69bc "}\202\020\b", 0xb62c6590 "m\031", 0xb7debe63 "\207?211?201"}
       __PRETTY_FUNCTION__ = "dial_exec_full"
#6  0xb6aae77c in dial_exec (chan=0xa002da0, data=0xb62c8ff8) at app_dial.c:1705
       peerflags = {flags = 0}
#7  0x080c45ee in pbx_exec (c=0xa002da0, app=0x81b2da8, data=0xb62c8ff8) at pbx.c:532
       res = 0
       saved_c_appl = 0x0
       saved_c_data = 0x0
#8  0x080c82fc in pbx_extension_helper (c=0xa002da0, con=0x0, context=0xa002fc8 "default", exten=0xa003018 "6718", priority=3, label=0x0,
   callerid=0x9674b78 "04321902549", action=E_SPAWN) at pbx.c:1833
       e = (struct ast_exten *) 0x82abd70
       app = (struct ast_app *) 0x81b2da8
       res = 8195840
       q = {incstack = {0x81e61b4 "default", 0x821964c "to-gateway", 0x82a4064 "systemalarm", 0x82a4874 "test",
   0x82a4ba4 "cluster-watchdog", 0x81fcaf4 "to-internal-nobody", 0x82783dc "to-conferences", 0x0 <repeats 121 times>}, stacklen = 7,
 status = 5, swo = 0x0, data = 0x0, foundcontext = 0x81e64f6 "to-internal-users"}
       passdata = "IAX2/iaxmodem01/718", '\0' <repeats 8172 times>
       matching_action = 0
       __PRETTY_FUNCTION__ = "pbx_extension_helper"
#9  0x080c96dc in ast_spawn_extension (c=0xa002da0, context=0xa002fc8 "default", exten=0xa003018 "6718", priority=3,
   callerid=0x9674b78 "04321902549") at pbx.c:2288
No locals.
#10 0x080c9bac in __ast_pbx_run (c=0xa002da0) at pbx.c:2388
       dst_exten = "\034\000\000\000\001\000\000\000?\b7\000\n\t\000\000\000R?025\b?\025\b?237\024?023\b\000\000\000\000\001\000\000\000?237\024?023\bl?023\b\b,?t\020\b?\025\bG\000\000\000\004?025\b#?025\bh*\027\bG\000\000\000\004?025\b9,\024?023\b8,}\202\020\b\000\000\000\000\027 H,}\202\020\b\000\000\000\000$Z?X,,\024?023\bl?023\bx,3\234\020\b?\000\n\000\000\000\000???\202\002\000\000?001\000\000\023?025\b?H\000\000\000??...
       pos = 0
       digit = 0
       found = 1
       res = 0
       autoloopflag = 0
       error = 0
       __PRETTY_FUNCTION__ = "__ast_pbx_run"
#11 0x080ca9c9 in pbx_thread (data=0xa002da0) at pbx.c:2603
       c = (struct ast_channel *) 0xa002da0
#12 0x08109f7c in dummy_start (data=0x8967be0) at utils.c:775
       _buffer = {__routine = 0x8069860 <ast_unregister_thread>, __arg = 0xb62cbbb0, __canceltype = -1208157023, __prev = 0x0}
       ret = (void *) 0xb7e5b360
       a = {start_routine = 0x80ca9b2 <pbx_thread>, data = 0xa002da0,
 name = 0xa000e18 "pbx_thread", ' ' <repeats 11 times>, "started at [ 2627] pbx.c ast_pbx_start()"}
       lock_info = (struct thr_lock_info *) 0xa003708
       __PRETTY_FUNCTION__ = "dummy_start"
#13 0xb7fd0240 in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
No symbol table info available.
#14 0xb7dfb4ae in clone () from /lib/tls/i686/cmov/libc.so.6
No symbol table info available.
Comments:

By: Peter Kozak (spag) 2007-09-28 07:49:10

Sorry, the Debian distribution is not Sarge, but Etch!



By: Russell Bryant (russell) 2007-10-10 12:00:40

I would be interested in ssh access so that I can look at the core dump with gdb to see if I can determine more about what is happening.  Feel free to contact me at russell@digium.com.

By: pkempgen (pkempgen) 2007-10-15 15:04:23

I suggest we close this issue, because there have been no segfaults like
this one since we moved the system from Dell + Debian to a SLES 10
machine.
Sorry for not being able to provide any other core dumps.

The Dell PowerEdge 2950 gave us a lot of trouble.
Should a note be added to
http://www.digium.com/en/docs/misc/compatibility_notes.php
even if the problem does not seem to be related to the TE220b?
(kept crashing even without the card)



By: Volnikov Ivan (ivan) 2007-10-25 02:06:52

Yesterday I saw precisely the same crash (see attached ivan_bt_full_10840.log) in our Asterisk 1.4.11 (with some of my own patches). OS: Fedora Core 6.0, CPU: Intel Pentium 4 3.0GHz (multithreading), RAM: 2G.

By: Digium Subversion (svnbot) 2007-11-01 14:29:04

Repository: asterisk
Revision: 88153

U   team/russell/readq-1.4/main/channel.c

------------------------------------------------------------------------
r88153 | russell | 2007-11-01 14:29:02 -0500 (Thu, 01 Nov 2007) | 15 lines

The readq handling in ast_do_masquerade() got broken when the code was converted
to use the AST_LIST macros.  Furthermore, the actual operation performed was
extremely bizarre.  I have re-written the readq handling in ast_do_masquerade()
to make it safe so that the readq list does not get corrupted, as well as
simplified and documented the code. There is also another fix for list handling
for channel datastores.

(related to issues ASTERISK-10489, ASTERISK-10193, ASTERISK-10012, and the 2nd backtrace of ASTERISK-10616)
(potentially related to issues ASTERISK-9737 and ASTERISK-10404)

For users involved with any of the bug reports I have listed, please give this
code a try:

$ svn co http://svn.digium.com/svn/asterisk/team/russell/readq-1.4

------------------------------------------------------------------------

By: Digium Subversion (svnbot) 2007-11-05 14:10:22.000-0600

Repository: asterisk
Revision: 88709

U   branches/1.4/main/channel.c

------------------------------------------------------------------------
r88709 | russell | 2007-11-05 14:10:17 -0600 (Mon, 05 Nov 2007) | 20 lines

Merge the last bit of changes from asterisk/team/russell/readq-1.4

The issue here is that the channel frame readq handling got broken when the
code was converted to use the linked list macros.  It caused corruption of the
list head and tail pointers.  So, I fixed up the usage of the linked list
macros and in passing, simplified the code.  I also documented what the code
is doing, as it was a bit difficult to figure out at first.

This bug showed itself with crashes showing messed up head/tail pointers for
the readq.  However, there are a couple of crashes that aren't quite as obvious,
but I think may be related.  So, if your bug gets closed by this commit, but
you still have a problem, please reopen or create a new bug report.

(closes issue ASTERISK-10489)
(closes issue ASTERISK-10193)
(closes issue ASTERISK-10012)
(closes issue ASTERISK-10616)
(closes issue ASTERISK-9737)
(closes issue ASTERISK-10404)

------------------------------------------------------------------------

By: Digium Subversion (svnbot) 2007-11-05 14:12:57.000-0600

Repository: asterisk
Revision: 88710

_U  trunk/
U   trunk/main/channel.c

------------------------------------------------------------------------
r88710 | russell | 2007-11-05 14:12:56 -0600 (Mon, 05 Nov 2007) | 28 lines

Merged revisions 88709 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r88709 | russell | 2007-11-05 14:11:04 -0600 (Mon, 05 Nov 2007) | 20 lines

Merge the last bit of changes from asterisk/team/russell/readq-1.4

The issue here is that the channel frame readq handling got broken when the
code was converted to use the linked list macros.  It caused corruption of the
list head and tail pointers.  So, I fixed up the usage of the linked list
macros and in passing, simplified the code.  I also documented what the code
is doing, as it was a bit difficult to figure out at first.

This bug showed itself with crashes showing messed up head/tail pointers for
the readq.  However, there are a couple of crashes that aren't quite as obvious,
but I think may be related.  So, if your bug gets closed by this commit, but
you still have a problem, please reopen or create a new bug report.

(closes issue ASTERISK-10489)
(closes issue ASTERISK-10193)
(closes issue ASTERISK-10012)
(closes issue ASTERISK-10616)
(closes issue ASTERISK-9737)
(closes issue ASTERISK-10404)

........

------------------------------------------------------------------------