Summary: ASTERISK-16806: SIp channel stop work
Reporter: Matteo (mpiazzatnetbug)
Labels:
Date Opened: 2010-10-13 07:29:14
Date Closed: 2011-01-20 08:49:21.000-0600
Versions:Frequency of
Environment:
Attachments: ( 0) gdb_debug_asterisk_1.4.26.3.txt

It's a production environment.
We start Asterisk in safe mode.
Every minute we use sipsak to check whether the SIP channel replies to a keep-alive SIP message. If it doesn't reply, we kill the process with the -6 option (SIGABRT) and Asterisk restarts itself.
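
A minimal sketch of the watchdog described above, for illustration only. The probe URI, the pidfile path, and the function names `sip_alive` and `watchdog_once` are assumptions, not taken from the real setup; it also assumes that sipsak exits with status 0 when the keep-alive is answered, which should be verified against the installed version.

```shell
#!/bin/sh
# Hypothetical watchdog sketch. SIP_URI and PIDFILE are assumed values.
SIP_URI="${SIP_URI:-sip:keepalive@127.0.0.1}"
PIDFILE="${PIDFILE:-/var/run/asterisk.pid}"

# Probe the SIP stack. With only -s given, sipsak sends an OPTIONS request.
# Assumption: exit status 0 means a reply was received.
sip_alive() {
    sipsak -s "$SIP_URI" >/dev/null 2>&1
}

# One watchdog pass: print "ok" if the probe succeeds; otherwise print
# "abort" and send SIGABRT (kill -6) so the process can leave a core file.
watchdog_once() {
    if sip_alive; then
        echo ok
        return 0
    fi
    echo abort
    if [ -r "$PIDFILE" ]; then
        kill -6 "$(cat "$PIDFILE")"
    fi
}

# Run a pass only when explicitly requested, e.g. from a cron entry.
if [ "${WATCHDOG_RUN:-0}" = 1 ]; then
    watchdog_once
fi
```

Run once per minute (for example from cron with `WATCHDOG_RUN=1`), this reproduces the check-then-kill behaviour; it does not collect any lock or thread information before the process is killed.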

Attached you will find the gdb output of the core file produced by this operation.
I can't tell whether the gdb output is helpful or not. The problem happened after 4 weeks of uptime. We have 1020 peers registered on Asterisk and an average of 30 channels up during office hours.

I have some doubts, for example about lines such as:
orig_channame = "SIP/openser_comtn-b0bd5470\000?\000\000\000\000\020?DA\000\000\000\000%4D\000\000\000\000\000`\002\221??*\000\000\b?B\000\000\000\000\000`\002\221??*\000\000?>N\000\000\000\000"
Comments:

By: Paul Belanger (pabelanger) 2010-10-13 10:22:23

Why are you killing the process rather than using 'core stop now'? If you suspect Asterisk is locked, then debug that issue. Killing the process and moving on seems like the wrong approach.

I would also try to reproduce your issue on the latest 1.4 release.

By: Matteo (mpiazzatnetbug) 2010-10-13 10:32:31

The issue is that if I have a production machine with 1000 phones, some of them critical, and the SIP stack stops working, my first priority is not to debug the issue but to restart the service. I'm using the kill command because it produces a core file. If I use a command like 'stop now' I won't have a core dump file or any information about the problem.
The only possible choice I see is to automate the gdb debugging in some way before killing the process.

By: Paul Belanger (pabelanger) 2010-10-13 11:52:22

To triage the issue, we'll need the debug information before you restart your service. Automating it should not be a problem; be sure to check for locks before running gdb and restarting Asterisk.

However, we'll need you to try to reproduce this using the latest 1.4 release. Looking at your backtrace, I recall an issue with MoH from a few months ago.

Debugging deadlocks:

Please select DEBUG_THREADS and DONT_OPTIMIZE in the Compiler Flags section of menuselect. Recompile and install Asterisk (i.e. make install)

This will then give you the console command:

core show locks

When the symptoms of the deadlock present themselves again, please provide output of the deadlock via:

# asterisk -rx "core show locks" | tee /tmp/core-show-locks.txt

# gdb -se "asterisk" <pid of asterisk> | tee /tmp/backtrace.txt

gdb> bt
gdb> bt full
gdb> thread apply all bt

Then attach the core-show-locks.txt and backtrace.txt files to this issue. Thanks!
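
For an automated restart, the same steps can be run without an interactive gdb prompt. The following is only a sketch: the function name `collect_debug` and the `OUT_DIR` variable are hypothetical, and it assumes Asterisk was built with DEBUG_THREADS so that "core show locks" is available. gdb's `-batch`, `-ex` and `-p` options are used so that gdb attaches, runs the three backtrace commands, and exits on its own.

```shell
#!/bin/sh
# Sketch: gather lock and backtrace output before restarting Asterisk.
# OUT_DIR is an assumed output location.
OUT_DIR="${OUT_DIR:-/tmp}"

# Usage: collect_debug <pid of asterisk>
collect_debug() {
    pid="$1"

    # Lock information from the running Asterisk (needs DEBUG_THREADS).
    asterisk -rx "core show locks" > "$OUT_DIR/core-show-locks.txt" 2>&1

    # Attach to the process, run the backtrace commands, then detach.
    gdb -batch -ex "bt" -ex "bt full" -ex "thread apply all bt" \
        -p "$pid" > "$OUT_DIR/backtrace.txt" 2>&1
}
```

A watchdog could call `collect_debug "$(cat /var/run/asterisk.pid)"` (the pidfile path is an assumption) just before restarting the service, so the two files exist even when nobody is at the console.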

By: David Woolley (davidw) 2010-10-13 11:54:38

If asterisk generates a core file itself, it will often only dump the running thread.  gcore is a fast way of ensuring you get all the threads.
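
As a rough illustration of the gcore suggestion, the sketch below dumps a core from the live process before it is restarted. `dump_core` and `CORE_PREFIX` are hypothetical names; gcore's `-o` option sets the output prefix and the resulting file is named `<prefix>.<pid>`.

```shell
#!/bin/sh
# Sketch: take a core dump of a running process with gcore.
# CORE_PREFIX is an assumed location.
CORE_PREFIX="${CORE_PREFIX:-/tmp/asterisk-core}"

# Usage: dump_core <pid>
# Prints the path of the core file on success, returns 1 on failure.
dump_core() {
    pid="$1"
    # gcore leaves the process running after the dump is written.
    gcore -o "$CORE_PREFIX" "$pid" >/dev/null 2>&1 || return 1
    echo "$CORE_PREFIX.$pid"
}
```

The resulting file can then be opened with gdb in the same way as a core produced by kill -6.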

By: Leif Madsen (lmadsen) 2011-01-06 14:52:34.000-0600

Looks like we're still waiting for information here.

By: Matteo (mpiazzatnetbug) 2011-01-08 05:43:55.000-0600

Not able to reproduce it.