Summary: ASTERISK-03745: core dump following cb_extensionstate action?
Reporter: k3v (k3v)
Labels:
Date Opened: 2005-03-22 17:41:35.000-0600
Date Closed: 2011-06-07 14:00:31
Versions:
Frequency of
Description: Two core dumps, about 24 hours apart.  First was during an extensions reload:

(gdb) bt
#0  0x0805405b in ast_sched_del (con=0x29, id=0) at sched.c:292
#1  0x40bb6ccd in cb_extensionstate (context=0x8897e04 "anonymous", exten=0x29 <Address 0x29 out of bounds>, state=0, data=0x88ee9b8) at chan_sip.c:886
#2  0x08094b0b in ast_remove_hint (e=0x88725f8) at pbx.c:2044
#3  0x0808e99d in __ast_context_destroy (con=0x0, registrar=0x40bf55cd "pbx_config") at pbx.c:4999
#4  0x0808a49c in ast_merge_contexts_and_delete (extcontexts=0x40bf7c20, registrar=0x40bf55cd "pbx_config") at pbx.c:3510
#5  0x40bf42bb in pbx_load_module () at pbx_config.c:1780
#6  0x40bf4d58 in handle_reload_extensions (fd=15, argc=2, argv=0xbd7ff694) at pbx_config.c:1434
#7  0x08097944 in ast_cli_command (fd=15, s=0x29 <Address 0x29 out of bounds>) at cli.c:1278
#8  0x080c4368 in netconsole (vconsole=0x8123940) at asterisk.c:291
#9  0x40024e51 in pthread_start_thread () from /lib/libpthread.so.0
#10 0x401ec6ea in clone () from /lib/libc.so.6

The second, more recent one occurred when a zap channel (courtesy phone) hung up after calling me:

(gdb) bt          
#0  0x4018c8a4 in strncasecmp () from /lib/libc.so.6
#1  0x40a6d178 in __get_header (req=0x8158dc0, name=0x40aa37e3 "From", start=0xbd3fcddc) at chan_sip.c:2286
#2  0x40a6d1e5 in get_header (req=0xbd3ffd90, name=0xbd3ffd90 "@É\"@@Ã\"@\234ý?½") at chan_sip.c:2309
#3  0x40a94179 in transmit_state_notify (p=0x8157448, state=0, full=1) at chan_sip.c:4066
#4  0x40a94c4c in cb_extensionstate (context=0x8898fd4 "anonymous", exten=0xbd3ffd90 "@É\"@@Ã\"@\234ý?½", state=0, data=0x8157448) at chan_sip.c:5161
#5  0x08086661 in ast_device_state_changed (fmt=0xbd3ffd90 "@É\"@@Ã\"@\234ý?½") at pbx.c:1767
#6  0x0806007a in ast_channel_free (chan=0x0) at channel.c:675
#7  0x080606c2 in ast_hangup (chan=0x40e04a80) at channel.c:777
#8  0x08087a61 in ast_pbx_run (c=0x40e04a80) at pbx.c:2347
#9  0x407ac6bd in ss_thread (data=0x40e04a80) at chan_zap.c:5174
#10 0x40024e51 in pthread_start_thread () from /lib/libpthread.so.0
#11 0x401ec6ea in clone () from /lib/libc.so.6

It looks like the channel structure is corrupted in this case (more p *chan stuff can be posted if necessary), but the extensions reload above had nothing to do with channels.
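
For anyone comparing traces like the two above, here is a minimal Python sketch that scans `bt` output for arguments that look like bad pointers. The names `suspicious_frames`, `FRAME_RE`, `HEX_ARG_RE`, and `LOW_ADDRESS_LIMIT` are made up for illustration, and the 0x1000 cutoff is an assumption (nothing valid mapped in the first page). gdb can print unreliable argument values for optimized builds, so a flagged frame is only a place to look, not proof of corruption.

```python
import re

# One gdb "bt" frame, e.g.
#   #3  0x0808e99d in __ast_context_destroy (con=0x0, ...) at pbx.c:4999
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+(?:0x[0-9a-fA-F]+ in )?(?P<func>\w+) \((?P<args>.*)\)"
    r"(?: at \S+| from \S+)?\s*$"
)
# Hex values assigned to an argument (gdb prints plain ints in decimal by default).
HEX_ARG_RE = re.compile(r"=(0x[0-9a-fA-F]+)")
# Assumption: no valid object lives in the first page of the address space.
LOW_ADDRESS_LIMIT = 0x1000


def suspicious_frames(backtrace):
    """Return (frame number, function, reasons) for frames with doubtful pointer arguments."""
    hits = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue  # not a frame line (prompt, blank line, etc.)
        args = match.group("args")
        reasons = []
        if "out of bounds" in args:
            reasons.append("gdb could not read a pointer target")
        for value in HEX_ARG_RE.findall(args):
            if int(value, 16) < LOW_ADDRESS_LIMIT:
                reasons.append("implausibly low address " + value)
        if reasons:
            hits.append((int(match.group("num")), match.group("func"), reasons))
    return hits
```

Fed the first few frames of the reload trace, this flags `ast_sched_del` (con=0x29), `cb_extensionstate` (exten=0x29), and `__ast_context_destroy` (con=0x0), and leaves `ast_remove_hint` alone.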

My chan_sip has a tweak to stop the overzealous ast_quiet_chan() behavior that kills musiconhold at inappropriate times, but I don't suspect it has any effect here since these dumps have nothing to do with transfers.

I also have a modified app_queue, so handing over the cores may not help anyone look at this.  I can give access to the machine upon request, <xkev> in irc or blackham@gmail.com.

I've pulled out all my hints for now.  Hopefully it'll save the phones for a while.


cvs head 3/11/2005, plus minor patching.
Comments:
By: k3v (k3v) 2005-03-22 17:45:46.000-0600

oh, and there is a tad of realtime in [anonymous]:

;; unknown outside callers go here (mostly SIP unknowns) ;;
switch => Realtime/@ext_realtime; email address SIP dialing
include => pbx_features;
include => pbx;
include => pbx_extens_error_mainmenu;
exten => i,1,Goto(main,1);
exten => t,1,Goto(main,1);
;;; something in hint stuff is causing segfaults ;;;
;;; comment this out for now ;;;
;#include "ext_presence.conf"
;^ yes, this goes here

;presence hints for extens that don't do SIMPLE
;the "asterisk+" tells the SER proxy to send the subscribe here
exten => asterisk+112,hint,SIP/112
exten => asterisk+112,1,NoOp(${HINT});
exten => asterisk+105,hint,SIP/105
exten => asterisk+105,1,NoOp(${HINT});
exten => asterisk+147,hint,SIP/147
exten => asterisk+147,1,NoOp(${HINT});
exten => asterisk+142H,hint,SIP/142H
exten => asterisk+142H,1,NoOp(${HINT});
exten => asterisk+1709,hint,Zap/105
exten => asterisk+1709,1,NoOp(${HINT});
exten => asterisk+1710,hint,Zap/106
exten => asterisk+1710,1,NoOp(${HINT});
exten => asterisk+1711,hint,Zap/107
exten => asterisk+1711,1,NoOp(${HINT});
exten => asterisk+1712,hint,Zap/108
exten => asterisk+1712,1,NoOp(${HINT});
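
The workaround mentioned in the description (pulling the hints) comes down to commenting out every `hint` priority like the ones listed above. A minimal Python sketch of that edit follows. The names `comment_out_hints` and `HINT_RE` are hypothetical, and the pattern only assumes the `exten => name,hint,device` layout shown here. It works on dialplan text, not on a running Asterisk.

```python
import re

# Matches lines such as "exten => asterisk+112,hint,SIP/112".
# Lines already commented out with ";" do not match, so the edit is idempotent.
HINT_RE = re.compile(r"^\s*exten\s*=>\s*[^,]+,\s*hint\s*,", re.IGNORECASE)


def comment_out_hints(dialplan):
    """Return the dialplan text with every hint priority prefixed by ';'."""
    out = []
    for line in dialplan.splitlines():
        if HINT_RE.match(line):
            out.append(";" + line)  # ';' starts a comment in Asterisk config files
        else:
            out.append(line)
    return "\n".join(out)
```

Only the `hint` lines are touched. The matching priority-1 `NoOp(${HINT})` lines are left as they are.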

By: Mark Spencer (markster) 2005-03-22 18:00:01.000-0600

Are you doing reloads and/or using SIP for realtime?

By: k3v (k3v) 2005-03-22 18:21:12.000-0600

I reload extensions periodically, and reload sip rarely.  I try to avoid full reloads, but I did do one today.  I do not have any realtime sipfriends, only some call logic in the dialplan I use app_realtime to retrieve, and the one switch in [anonymous].

By: Olle Johansson (oej) 2005-03-23 10:33:09.000-0600

I see two dates in the bug report - is this really the latest CVS head with anthm's realtime patch?

By: Olle Johansson (oej) 2005-03-23 10:37:59.000-0600

It is not latest cvs head (according to conversation on IRC). Please check with latest, since we don't want to re-fix things that already are fixed.

By: Olle Johansson (oej) 2005-03-24 17:54:51.000-0600

If you file a bug under crash, we really would like faster feedback so we can remove a potential crash from our code. If this is still a bug in latest CVS head, we really need to fix it!

By: Mark Spencer (markster) 2005-03-26 01:20:06.000-0600

Also, please use UNPATCHED cvs head.

By: k3v (k3v) 2005-04-03 11:30:32

Pulling the hints fixed this crash.  Other pressing issues and the current stability (10 days up, no problem) have led me to neglect this bug.  I'll see if I can find a way to reproduce this in staging.  After dumping all calls a half dozen times in a week in production, I can't play around there. :)

By: Clod Patry (junky) 2005-04-09 18:38:24

k3v: if that fixed the crash, can we close this bug, is that all right for you now?

By: k3v (k3v) 2005-04-10 16:08:59

yeah, that's fine.  I can't commit to when I can dedicate more time to working on this.  the workaround is acceptable for now.

By: nick (nick) 2005-04-10 17:03:18

Closed at poster's request. Please reopen if you're able to reproduce this behavior with an unpatched and current HEAD box.