Summary: ASTERISK-06482: crash in chan_sip using Monitor()
Reporter: Roy Sigurd Karlsbakk (rkarlsba)
Labels:
Date Opened: 2006-03-06 10:00:32.000-0600
Date Closed: 2006-05-16 09:39:49
Versions:
Frequency of
Environment:
Attachments: ( 0) anotherbt.txt

my * server just crashed and dumped its core. nothing particular was being done at the moment.

this is also patched with the generic jb patch from ASTERISK-3764



#0  0x00002aaaaaccc8d0 in pthread_mutex_lock () from /lib/libpthread.so.0
No symbol table info available.
#1  0x00002aaaabbaaaa0 in ast_mutex_lock (pmutex=0x60) at lock.h:592
No locals.
#2  0x00002aaaabbbc86e in expire_register (data=0x2aaab662b0bc) at chan_sip.c:5665
       newcount = 0
       peer = (struct sip_peer *) 0x0
       __PRETTY_FUNCTION__ = "expire_register"
#3  0x000000000041271e in ast_sched_runq (con=0x68e38c) at sched.c:373
       current = (struct sched *) 0x2aaab8b4155c
       tv = {tv_sec = 1141655956, tv_usec = 658760}
       x = 0
       res = 6904716
#4  0x00002aaaabbd5962 in do_monitor (data=0x0) at chan_sip.c:11325
       res = 1
       sip = (struct sip_pvt *) 0x0
       peer = (struct sip_peer *) 0x0
       t = 1141655956
       fastrestart = 0
       lastpeernum = -1
       curpeernum = 1986
       reloading = 0
       __PRETTY_FUNCTION__ = "do_monitor"
Comments:

By: Olle Johansson (oej) 2006-03-06 10:09:01.000-0600

Line 5665 is ASTOBJ_UNREF in his chan_sip with the patch. Realtime in use on this server.

By: Olle Johansson (oej) 2006-03-06 12:19:25.000-0600

Roy: Are *ALL* peers that register REALTIME peers?

By: Olle Johansson (oej) 2006-03-06 12:22:47.000-0600

Realtime caching?

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-03-06 13:45:20.000-0600

there are a handful of sip clients in sip.conf, such as the pstn gateways, but all clients are fetched from realtime. we use realtime caching, yes

By: Serge Vecher (serge-v) 2006-05-01 15:23:42

rkarlsba: are you still experiencing crashes here?

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-05-01 17:48:02

not recently
please keep this open

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-05-07 08:19:00

Got another one last friday. same stuff happens. see attached bt

By: Andrey S Pankov (casper) 2006-05-07 09:15:17

I am a bird, I can fly!
How is this related to the first crash as well as to the second?

If you are using gcc >= 4.0 please build your asterisk with 'make dont-optimize'
and then report. And please do not mix everything in one issue. This is not your
_personal_ support page, this is a bugtracker!!! Can't you see that it crashes in
_different_ functions, even though both are in chan_sip?

Can you confirm you are running the latest SVN 1.2 branch >= r25323?
What gcc version/distro/platform?

I'm sorry I am so blunt... but it should be clear enough that the asterisk bugtracker
is not a support forum for FC4+ based distros and their buggy compilers.

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-05-07 12:13:16

sipgw2:~# asterisk -V
Asterisk 1.2.6
sipgw2:~# asterisk -rx 'show version'
   -- Remote UNIX connection
Asterisk 1.2.6 built by root @ sipgw2 on a x86_64 running Linux on 2006-04-11 12:58:26 UTC
Verbosity is at least 3
sipgw2:~# gcc --version
gcc (GCC) 3.4.4 20050314 (prerelease) (Debian 3.4.3-13)
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
sipgw2:/usr/src/asterisk-1.2.6# grep -- -O Makefile

and this is debian sarge

AFAICS the problem is in app_monitor somewhere

I'm aware of this not being the latest version and so on, but do you know if there are any relevant fixes in recent versions in the 1.2 tree? I haven't come across any.......

Also, please note that the system is in production and can't be upgraded too often.


By: Andrey S Pankov (casper) 2006-05-07 12:24:41

Anyway, please at least build with dont-optimize (if you are unable to get a
fresh copy of the SVN 1.2 branch).

Maybe an x86_64 issue as well. I can't see how this can be related to app_monitor.

Are you using debian's bristuffed version? You need to compile asterisk from the
sources provided by asterisk.org/digium.com. If you are using a debian-specific
package, please file a bug report in their bugtracker. (Am I right? Please
correct me if not.)

By: Roy Sigurd Karlsbakk (rkarlsba) 2006-05-07 12:32:49

The distro is Debian Sarge, but asterisk is all compiled from the source.

Do you know of any relevant fixes in 1.2?

I'm RoyK on IRC at #asterisk


By: Serge Vecher (serge-v) 2006-05-07 12:48:19

>wait, what's app_monitor? I only see app_mixmonitor in the sources ...
ok, answered my own question, it's in res_monitor not app_monitor.

roy: could you please try to replace Monitor with MixMonitor in your dialplan? I remember reading that Monitor is going away in the next stable in favor of the latter.

By: Andrey S Pankov (casper) 2006-05-07 12:51:16

Lots of fixes there... and lots of them may be relevant...

By: Andrey S Pankov (casper) 2006-05-07 12:56:17

Your 2nd crash is easy to fix. As you can see, "retrans_pkt (data=0x0)".
Then you have "struct sip_pkt *pkt=data,". Obviously it will crash
at "ast_mutex_lock(&pkt->owner->lock);".

But we need to know _WHY_ data==NULL since it never should be.

By: Serge Vecher (serge-v) 2006-05-16 09:39:24

alright: I think this bug is ready for closure. Nobody else is experiencing this particular crash in chan_sip. Perhaps the root cause is the rtp-jitterbuffer patch, in which case the discussion needs to continue in that bug report.

roy: if you are able to reproduce this problem on an unpatched stable branch with rev > 27000, please reopen the bug with a non-optimized backtrace attached.