Summary: | ASTERISK-18882: Asterisk lock during production | ||
Reporter: | Andrew Parisio (parisioa) | Labels: | |
Date Opened: | 2011-11-17 16:58:06.000-0600 | Date Closed: | 2011-12-20 14:18:21.000-0600 |
Priority: | Critical | Regression? | |
Status: | Closed/Complete | Components: | Channels/chan_local Core/PBX |
Versions: | 1.6.2.20 | Frequency of Occurrence | |
Related Issues: | |||
Environment: | Attachments: | ( 0) locks.txt ( 1) threads.txt | |
Description: | During production, Asterisk hangs and simply stops processing calls. Existing calls do not drop, and Asterisk will not restart or stop with a plain kill (it must be killed with kill -9). This has happened every other day or so for the last two weeks, except for today, when it has happened twice. I had DEBUG_THREADS and DONT_OPTIMIZE turned on to capture the attached core show threads and core show locks output. | ||
Comments: | By: Andrew Parisio (parisioa) 2011-11-17 16:59:28.508-0600
Leif: Please don't body slam me over this one.

By: Matt Jordan (mjordan) 2011-11-17 17:02:19.546-0600
Per the Asterisk maintenance timeline page at http://www.asterisk.org/asterisk-versions maintenance (bug) support for the 1.4 and 1.6.x branches has ended. For continued maintenance support please move to the 1.8 branch which is a long term support (LTS) branch. For more information about branch support, please see https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions. After testing with Asterisk 1.8, if you find this problem has not been resolved, please open a new issue against Asterisk 1.8.

By: Richard Mudgett (rmudgett) 2011-11-17 18:01:14.468-0600
I think you are in a deadlock avoidance loop in chan_local that can never be resolved because of other channel locks held by ast_do_masquerade(). Locking in this area of code is very different in v1.8.

By: Andrew Parisio (parisioa) 2011-11-17 18:18:46.902-0600
Having looked at the logs a little closer it appears to happen around the time a reload occurs (sip reload & dialplan reload), although it doesn't happen at every reload, just every once in a while, seemingly randomly. I'll test upgrading to 1.8 and see if it continues to happen.

By: Leif Madsen (lmadsen) 2011-11-18 07:55:14.432-0600
Andrew: I promise nothing!

By: Andrew Parisio (parisioa) 2011-11-22 12:10:14.602-0600
We upgraded to 1.8.7.1 in production today and are testing it out. Given the short week we have light call volume and may not trigger it anyway so it may not be confirmed fixed until next week (we didn't trigger it yesterday in 1.6.2.20).

By: Leif Madsen (lmadsen) 2011-12-20 08:52:25.339-0600
Ping?

By: Andrew Parisio (parisioa) 2011-12-20 12:20:45.174-0600
We haven't had a lock since so it appears as though the issue was resolved somewhere in 1.8. Thanks!

By: Matt Jordan (mjordan) 2011-12-20 14:18:21.350-0600
Per Andrew, this appears to be resolved in the 1.8 branch |
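For readers unfamiliar with the "deadlock avoidance loop" Richard Mudgett mentions, here is a minimal sketch of the general trylock-and-back-off pattern using plain POSIX mutexes. It is not the chan_local code: the function name `lock_with_avoidance` and the `max_attempts` bound are hypothetical, added only so the example terminates. A loop of this kind without a bound would retry indefinitely.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/*
 * Hypothetical helper, not Asterisk code.
 *
 * The caller already holds `held` and now wants `wanted`, while other
 * threads may take the two locks in the opposite order. Instead of
 * blocking on `wanted` (which could deadlock), try it, and on failure
 * briefly release `held` so the other thread can finish and let go.
 *
 * Returns 0 with both locks held, or -1 with only `held` held if
 * `wanted` never became free within max_attempts tries.
 */
int lock_with_avoidance(pthread_mutex_t *held, pthread_mutex_t *wanted,
                        int max_attempts)
{
    assert(max_attempts > 0);

    for (int i = 0; i < max_attempts; i++) {
        if (pthread_mutex_trylock(wanted) == 0) {
            return 0; /* both locks are now held */
        }
        /* Back off: drop our lock so the owner of `wanted` can progress. */
        pthread_mutex_unlock(held);
        sched_yield();
        pthread_mutex_lock(held);
    }
    return -1; /* `wanted` was never released */
}
```

The back-off only helps if releasing `held` is enough to unblock the owner of `wanted`. If that owner is instead waiting on a third lock that this thread still holds further up its call stack (in the report, the suggestion is channel locks held by ast_do_masquerade()), the owner never releases `wanted`, and the loop spins without making progress. That would match the reported symptom of a process that stops handling calls and has to be killed with kill -9.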