Summary: ASTERISK-09718: Asterisk hang/stop working on console suspend
Reporter: jeffery palmer (darren1713)
Labels:
Date Opened: 2007-06-20 10:25:58
Date Closed: 2011-06-07 14:02:57
Status: Closed/Complete
Components: . I did not set the category correctly.
Versions:
Frequency of
Environment:
Attachments: ( 0) logger_lockup_1_4_r103728_thread_bt_full.txt
Description:
Reproduce problem:
1) Start screen "screen"
2) Start asterisk "asterisk -ggvvvvvvc"
3) Press Ctrl-A then Escape to suspend console

This doesn't sound like a big problem, but it is, because the same situation happens when you ssh into a box and lose the connection because of an internet failure or for any other reason.

After 5-10 minutes of asterisk trying to write to the suspended console, everything hangs. Nothing crashes, but asterisk stops making calls, accepting calls, or doing any sort of processing.

My first evaluation of this is that there are blocking writes to the console, and there are no checks to see if the file descriptor can write or not.
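
To illustrate the kind of check described above, here is a minimal sketch of a console write that gives up instead of blocking. It is not Asterisk's code: `set_nonblocking` and `write_or_drop` are hypothetical helper names, and the timeout is an arbitrary assumption. It only uses the standard POSIX calls `fcntl`, `poll` and `write`.

```c
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not an Asterisk function): put fd into
 * non-blocking mode so that write() can never stall the caller. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: wait up to timeout_ms for fd to become writable,
 * then attempt a single write. Returns the number of bytes written, or
 * -1 if the output was dropped (timeout, poll error, or error/hangup
 * reported on the descriptor). */
static ssize_t write_or_drop(int fd, const char *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready <= 0) {
        return -1; /* timed out or poll() failed: skip this message */
    }
    if ((pfd.revents & (POLLERR | POLLHUP | POLLNVAL)) || !(pfd.revents & POLLOUT)) {
        return -1; /* descriptor is not usable for writing right now */
    }
    return write(fd, buf, len);
}
```

With a helper like this, a logger could treat a return value of -1 as "console unavailable" and carry on with call processing rather than waiting on the terminal.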
Comments:
By: Jason Parker (jparker) 2007-06-20 13:38:15

Closing, since this is a duplicate of ASTERISK-9710

By: jeffery palmer (darren1713) 2008-03-18 10:37:52

Sorry I didn't keep up with this bug report, but this problem is not a duplicate of bug 0010010. * does not consume 100% cpu when this happens; it just stops accepting calls, stops responding, and hangs.

This is still reproducible in SVN-trunk

I have not dug through the source on this, but since the logger is not vital for anything, if it can't write to the console or a file, it should just forget it and continue on, allowing * to continue.

Two other logger problems:
#1 rm /var/log/asterisk/full while * is running
* will not re-open the log file and continue

#2 rm /var/log/asterisk/full, then in * console, "logger reload"
* will still not re-open the log file and continue. If I open logger.conf, change the filename to "full2", and then run "logger reload", it will create and open /var/log/asterisk/full2 and continue logging.

This may seem irrelevant but considering we have a production system that sometimes needs to be debugged, the "everything" log can easily be 1GB per day, so this log file needs to be removed and restarted without killing * for us to debug with something manageable.
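
For illustration, one common way for a logger to notice that its file was removed is to compare the file it has open with whatever now sits at the configured path. The sketch below is an assumption about how such a check could look, not how Asterisk's logger works: `log_file_replaced` and `reopen_log_if_needed` are hypothetical names, and the 0644 mode is arbitrary.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if path no longer refers to the file
 * that is open on fd (for example after "rm" of the log), else 0. */
static int log_file_replaced(int fd, const char *path)
{
    struct stat open_st;
    struct stat path_st;

    if (fstat(fd, &open_st) != 0) {
        return 1; /* cannot inspect the open file: treat as stale */
    }
    if (stat(path, &path_st) != 0) {
        return 1; /* path is gone, e.g. ENOENT after the file was removed */
    }
    return open_st.st_dev != path_st.st_dev || open_st.st_ino != path_st.st_ino;
}

/* Hypothetical helper: returns a descriptor that refers to path,
 * reopening (and creating) the file if the old one was removed.
 * On failure to reopen, the old descriptor is kept and returned. */
static int reopen_log_if_needed(int fd, const char *path)
{
    int newfd;

    if (!log_file_replaced(fd, path)) {
        return fd;
    }
    newfd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (newfd < 0) {
        return fd;
    }
    close(fd);
    return newfd;
}
```

A check of this kind could run on a reload request or before each write, so that removing a large log file leads to a fresh one being created without a restart.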

By: Tzafrir Cohen (tzafrir) 2008-03-31 04:19:24

The logger normally runs in a separate thread. Perhaps the issue is that a certain lock is held by the thread writing to the console, and other threads are now waiting on that lock?

An obvious workaround for your issue is, well, not to use -c. Or use -F to force Asterisk to daemonize.

This avoids the need for screen. You can still tail -f log files as you please.

If you have any separate issues, please open separate bugs for them. At first glance it appears to be a valid issue (but I haven't tried to reproduce it yet).

By: Joshua C. Colp (jcolp) 2008-04-01 10:41:35

The logger under Asterisk 1.4 blocks upon ast_log calls until it is finished printing to the console. Higher versions operate in a separate thread.

By: Mark Michelson (mmichelson) 2008-04-07 14:10:09

Let's focus for now on the hanging that's occurring. In order to diagnose the problem, we'll need to see a "thread apply all bt full" from gdb when this problem occurs. Also, if it's possible to get any sort of console output, "core show locks" output would probably also be helpful.


By: jeffery palmer (darren1713) 2008-04-18 17:06:41

In what version of 1.4, or at what svn revision, did the logger get split into a separate thread? I'm running 1.4.19, which still exhibits the problem.

By: Mark Michelson (mmichelson) 2008-04-18 17:24:08

No version of 1.4 logs messages in a separate thread. Only trunk/1.6.0 log messages in a separate thread. If you're still experiencing the problem in 1.4.19, please upload the output requested. Thanks!

By: jeffery palmer (darren1713) 2008-04-18 17:49:01

Attached is the gdb backtrace. There is no "core show locks" command in my current 1.4.

By: Mark Michelson (mmichelson) 2008-04-21 08:57:28

If there's no "core show locks" command in your current 1.4 version, then that means you either

1) are using a very old version of 1.4 and need to upgrade or
2) have not compiled Asterisk with DEBUG_THREADS selected in Menuselect.

If 2) is the case, then recompile with DONT_OPTIMIZE and DEBUG_THREADS enabled in Menuselect. Then reproduce the bug if possible and upload the backtrace as you did before as well as the output from "core show locks" if possible. The backtrace you provided was created from an optimized build and therefore leaves out a lot of useful information.

If 1) is the case, then upgrade to the latest 1.4 version and then do everything specified above for the 2) case.


By: Tilghman Lesher (tilghman) 2008-04-29 21:27:30

No response from reporter.  Reopen if/when you are able to provide the requested information.