Summary: | ASTERISK-08748: chan_skinny randomly crashing server | ||
Reporter: | sbisker (sbisker) | Labels: | |
Date Opened: | 2007-02-07 10:24:33.000-0600 | Date Closed: | 2007-06-30 09:20:07 |
Priority: | Critical | Regression? | No |
Status: | Closed/Complete | Components: | Channels/chan_skinny |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) bt_full.txt ( 1) bt-full.txt ( 2) debug.txt ( 3) transmit.patch | |
Description: | With the release version of Asterisk, chan_skinny is randomly crashing Asterisk. An unoptimized backtrace is attached. | ||
Comments: |
By: dea (dea) 2007-02-07 11:36:04.000-0600
sbisker- Your backtrace shows why, without really showing why. Can you upload a log with both verbose and debug set to at least 3? Somehow this call managed to get started without a session (bad). The verbose/debug log should confirm this, and hopefully point to where additional checks for the existence of the session should be performed.

By: sbisker (sbisker) 2007-02-07 12:12:22.000-0600
I have set the options for asterisk to -vvvvvgdddddnp. When it crashes again, I will attach /var/log/asterisk/messages. -sb

By: Serge Vecher (serge-v) 2007-02-07 16:24:02.000-0600
It is best to set debug output for console in logger.conf and upload a console log instead.

By: Anthony LaMantia-2 (anthonyl) 2007-02-08 11:54:41.000-0600
It would seem the line s = d->session; is borked. I assume the problem comes from the fact that l->session is never checked after calling line_by_deviceid in handle_stimulus_message, or the sessions may just be being destroyed. It should be checked before calling transmit_response, or inside transmit_response for safety anyway, considering how this code is laid out.

By: Damien Wedhorn (wedhorn) 2007-02-08 18:22:58.000-0600
I agree, I think it is fairly easy to add a check in transmit_response. There are many functions calling transmit_response, so it would make sense to do the check in there. We may want to pass back an error so the calling function (skinny_new) is at least aware that the session has been dropped.

By: sbisker (sbisker) 2007-02-23 09:20:13.000-0600
I uploaded the debug trace prior to the system crashing. Any luck on providing the fix in transmit_response?

By: sbisker (sbisker) 2007-03-05 12:12:12.000-0600
Just checking status on this one.

By: Damien Wedhorn (wedhorn) 2007-03-05 14:17:38.000-0600
Added small patch. If there is no session it should log it ("transmit response: no session") and continue without transmitting the message. Compiled ok, not tested.
Seeing as this is so intermittent, can you test and monitor your logs for the above message? Just a general observation: we tend to have errors thrown that are not utilised. In transmit_response there are now a couple of situations that will return an error, and nothing is done with the error.

By: sbisker (sbisker) 2007-03-05 14:32:38.000-0600
Patched against 1.4.0. I will monitor the logs for the message.

By: sbisker (sbisker) 2007-03-06 08:45:28.000-0600
Same problem. The patch didn't stop asterisk from core dumping. Further, there were no messages in the logs that had "no session" in the entry.

By: dea (dea) 2007-03-06 11:30:04.000-0600
The test for (!s) and return needs to be above ast_mutex_lock(&s->lock); (line 1393) in transmit_response(), otherwise we attempt to lock a non-existent session, which is what the bt shows. sbisker, if you can move the five new lines in wedhorn's patch up above the ast_mutex_lock, it should work.

By: Russell Bryant (russell) 2007-03-06 12:03:58.000-0600
This crash should not happen anymore as of rev 58023 and 58025. However, this problem is surely indicative of a deeper problem. If you can isolate any situation where this occurs and you get a WARNING message, please let us know. |
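The ordering point raised in the thread (test the session pointer before taking its lock, and return an error the caller can act on) can be illustrated with a minimal sketch. This is not the code from transmit.patch or from rev 58023/58025. The struct layout, the `sent` counter, and the name `transmit_response_sketch` are hypothetical stand-ins, and a plain pthread mutex is assumed in place of `ast_mutex_lock`.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the chan_skinny session; fields are assumptions. */
struct session_sketch {
    pthread_mutex_t lock;
    int sent; /* counts messages "transmitted" in this sketch */
};

/*
 * Returns 0 on success, -1 when there is no session.
 * The NULL test comes BEFORE the lock: dereferencing s->lock on a
 * missing session is the crash described in the comments above.
 */
static int transmit_response_sketch(struct session_sketch *s, const char *msg)
{
    if (!s) {
        fprintf(stderr, "transmit response: no session\n");
        return -1;
    }

    pthread_mutex_lock(&s->lock);
    (void)msg;  /* a real implementation would write msg to the socket here */
    s->sent++;
    pthread_mutex_unlock(&s->lock);
    return 0;
}
```

A caller would check the return value rather than discard it, for example `if (transmit_response_sketch(d_session, "msg") < 0) { /* abandon the call setup */ }`, which addresses the observation that returned errors were going unused.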