Summary: | ASTERISK-15066: full system crash every other day | ||
Reporter: | moshe Teitelbaum (moshe) | Labels: | |
Date Opened: | 2009-11-03 00:19:56.000-0600 | Date Closed: | 2009-11-17 07:54:42.000-0600 |
Priority: | Critical | Regression? | No |
Status: | Closed/Complete | Components: | Channels/chan_local |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) backtrace.txt | |
Description: | We have a server running as MTE with 52 tenants and over 200 extensions. Recently Asterisk has been crashing every other day, and we have no way of knowing how to recreate the problem. It's possible that whatever causes the crash can be repeated with 100% crash-success, but we still can't figure out what specifically is causing it. Following is the last CLI output before the crash:

[Nov 2 15:15:42] ERROR[1928]: /usr/src/asterisk/1.4.26/asterisk-1.4.26.1/include/asterisk/lock.h431 __ast_pthread_mutex_lock: chan_local.c line 542 (local_hangup): Error obtaining mutex: Invalid argument
[Nov 2 15:15:42] ERROR[1928]: /usr/src/asterisk/1.4.26/asterisk-1.4.26.1/include/asterisk/lock.h514 __ast_pthread_mutex_unlock: chan_local.c line 597 (local_hangup): mutex '&p->lock' freed more times than we've locked!
[Nov 2 15:15:42] ERROR[1928]: /usr/src/asterisk/1.4.26/asterisk-1.4.26.1/include/asterisk/lock.h531 __ast_pthread_mutex_unlock: chan_local.c line 597 (local_hangup): Error releasing mutex: Invalid argument
[Nov 2 15:15:42] ERROR[1928]: /usr/src/asterisk/1.4.26/asterisk-1.4.26.1/include/asterisk/lock.h319 __ast_pthread_mutex_destroy: chan_local.c line 158 (local_pvt_destroy): Error: attempt to destroy invalid mutex '&pvt->lock'.

I can supply a backtrace. | ||
Comments: | By: moshe Teitelbaum (moshe) 2009-11-03 10:12:24.000-0600

I'm not sure if it is related, but every other call is getting the following warning:

[Nov 3 11:05:35] WARNING[12706]: app_dial.c:1275 dial_exec_full: Unable to create channel of type 'SIP' (cause 20 - Unknown)

By: moshe Teitelbaum (moshe) 2009-11-04 08:29:57.000-0600

Additional errors are coming up every now and then, and again I'm not sure if they are related:

[Nov 4 09:14:43] ERROR[27826]: utils.c:966 ast_carefulwrite: write() returned error: Connection reset by peer
[Nov 4 09:14:43] ERROR[27826]: utils.c:966 ast_carefulwrite: write() returned error: Broken pipe

as well as the following error, which is kind of new to me (I hadn't seen it until today):

[Nov 4 05:56:52] WARNING[1979]: chan_sip.c:7053 determine_firstline_parts: Bad request protocol OK

I would like to know how I could expedite things around here. Thanks.

By: Joshua C. Colp (jcolp) 2009-11-04 15:14:54.000-0600

Can you please try to reproduce this issue with 1.4 from SVN? It looks like something that has already been fixed. Thanks!

By: Erik Smith (eeman) 2009-11-05 11:34:17.000-0600

file, are you referring to a changelog remark in SVN regarding issue 16027? This is a heavy-use production box and he is wary of buying new problems with the SVN branch. If it's this particular issue's fix, can he just remove the one line in chan_sip.c detailed in the notes of the issue?

By: Leif Madsen (lmadsen) 2009-11-06 09:25:57.000-0600

Just assigned to file for comment back. Move back to appropriate status after commenting. Thanks!

By: Leif Madsen (lmadsen) 2009-11-13 08:48:25.000-0600

Do you happen to be using an AGI here? If so, this could possibly be related to a couple of other issues I've just found.

By: Erik Smith (eeman) 2009-11-13 09:00:58.000-0600

Negative, this is just a macro for a ring group that invokes a bunch of local/exten@context technologies.

By: Leif Madsen (lmadsen) 2009-11-13 10:36:36.000-0600

OK thanks, so this is a separate issue.

By: Leif Madsen (lmadsen) 2009-11-17 07:31:50.000-0600

Are you able to test on the latest release candidates? There is a feeling this may already be fixed. Thanks!

By: Erik Smith (eeman) 2009-11-17 07:38:01.000-0600

Well, what I did was backport the patch in 16027:

+ if (c) {
    ast_channel_unlock(c);
+ }

and recompile. I have found that sometimes upgrading causes one to buy a new problem in the trade-off. We haven't had a crash since, but it's only been 4 production days. However, it used to crash every 2 - 3 production days. If there is no crash by Friday, November 20th, I will assume it resolved the problem.

By: Leif Madsen (lmadsen) 2009-11-17 07:54:41.000-0600

I'm going to close this issue, as that is the one I figured had resolved this. If you're still having issues going forward, please open a new issue, but for now this one is resolved. Thanks for reporting back! |
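
The backported change discussed above wraps an unlock call in a NULL check. The thread does not show the surrounding code from issue 16027, so the following is only a minimal sketch of the general pattern, written with plain POSIX mutexes rather than Asterisk's lock wrappers. The names `demo_channel`, `demo_unlock_if_present`, and `demo_hangup` are hypothetical and do not exist in Asterisk.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a channel object; NOT Asterisk's struct ast_channel. */
struct demo_channel {
    pthread_mutex_t lock;
};

/*
 * Unlock the channel only when the pointer is non-NULL.
 * Returns 1 if an unlock was performed, 0 if it was skipped because
 * the pointer was NULL, and -1 if pthread_mutex_unlock reported an error.
 */
static int demo_unlock_if_present(struct demo_channel *c)
{
    if (c) {
        return pthread_mutex_unlock(&c->lock) == 0 ? 1 : -1;
    }
    return 0;
}

/*
 * Hypothetical hangup path: the channel pointer may already be gone
 * (NULL) by the time we get here, so both the lock and the unlock are
 * guarded, which keeps lock and unlock calls balanced.
 */
static int demo_hangup(struct demo_channel *c)
{
    if (c && pthread_mutex_lock(&c->lock) != 0) {
        return -1;
    }

    /* ... teardown work that tolerates c == NULL would go here ... */

    return demo_unlock_if_present(c);
}
```

Calling `demo_hangup(NULL)` skips both the lock and the unlock instead of touching a lock that is not there, which is the kind of unbalanced or invalid mutex operation the log messages in the description complain about. Whether this matches the exact code path fixed in 16027 is an assumption.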