Summary: | ASTERISK-25435: Asterisk periodically hangs. UDP Recv-Q greatly exceeds zero.
Reporter: | Dmitriy Serov (Demon) | Labels: |
Date Opened: | 2015-09-30 02:24:05 | Date Closed: | 2015-10-08 13:14:06
Priority: | Major | Regression? | No
Status: | Closed/Complete | Components: |
Versions: | 13.5.0, 13.6.0 | Frequency of Occurrence: | Frequent
Related Issues: |
Environment: |
Attachments: |
( 0) 2015_09_29__21_42_01.full.tail.txt
( 1) 2015_09_29__21_42_01.netstat.txt
( 2) 2015_09_29__21_43_01.backtrace-threads.txt
( 3) 2015_09_29__21_43_01.full.tail.txt
( 4) 2015_09_29__21_43_01.locks.txt
( 5) 2015_09_29__21_43_01.netstat.txt
( 6) 2015_09_29__21_44_07.backtrace-threads.txt
( 7) 2015_09_29__21_44_07.full.tail.txt
Description: | Asterisk periodically hangs.
UDP Recv-Q greatly exceeds zero. No errors in the log (such as a DNS error, function getaddr). The system behavior is very similar to ASTERISK-25421. STUN is off. Backtraces attached.
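
A check for the Recv-Q symptom might look like the following sketch. It assumes the usual `netstat -anu` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address). The function name `busy_udp_sockets` and the threshold value are illustrative assumptions; the report does not say which limit the reporter's watchdog used.

```python
# Hypothetical threshold in bytes; the report does not state the real one.
RECV_Q_LIMIT = 10000


def busy_udp_sockets(netstat_text, limit=RECV_Q_LIMIT):
    """Return (local_address, recv_q) for UDP sockets whose Recv-Q exceeds limit.

    netstat_text is the captured text output of `netstat -anu`.
    """
    hits = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Data rows: Proto Recv-Q Send-Q Local-Address Foreign-Address [State]
        if len(fields) < 5 or not fields[0].startswith("udp"):
            continue
        # Skip anything whose Recv-Q column is not a plain number.
        if not fields[1].isdigit():
            continue
        recv_q = int(fields[1])
        if recv_q > limit:
            hits.append((fields[3], recv_q))
    return hits
```

A watchdog could run this once a minute on fresh netstat output and, when the result is non-empty, collect the log tail, thread backtraces and lock list before restarting Asterisk.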
Comments: |
By: Asterisk Team (asteriskteam) 2015-09-30 02:24:06.253-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Dmitriy Serov (Demon) 2015-09-30 02:28:12.364-0500

A watchdog monitors the netstat Recv-Q length.
21:42:01 - size exceeded (netstat result, full log tail attached)
21:43:01 - size still exceeded (netstat result, full log tail, backtrace, locks attached)
21:44:07 - size still exceeded (full log tail, backtrace attached). Asterisk was killed with kill -9.

The situation repeats at least once a day.

By: Mark Michelson (mmichelson) 2015-10-01 15:44:06.011-0500

It looks like the problem is that the send_request_wrapper structure in res_pjsip.c has its mutex created using pj_mutex_create_simple(). A thread then attempts to lock that mutex recursively (see thread 14 of 2015_09_29__21_43_01.backtrace-threads.txt), which results in the thread blocking forever.

There are two potential solutions here:
1) Declare the lock using pj_mutex_create_recursive() so that this will not cause a deadlock.
2) Use an ast_mutex_t, which is always created recursive. This would also allow the lock to show up in 'core show locks' output.

By: Richard Mudgett (rmudgett) 2015-10-07 12:42:27.872-0500

Patch up on gerrit to fix the deadlock identified by [~mmichelson]:
https://gerrit.asterisk.org/#/c/1412/ v13
https://gerrit.asterisk.org/#/c/1413/ master
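
The deadlock identified in the comments, a non-recursive mutex locked a second time by the thread that already holds it, can be reproduced in miniature. The sketch below is a Python analogy, not the res_pjsip.c code. `threading.Lock` stands in for a mutex from pj_mutex_create_simple() and `threading.RLock` for one from pj_mutex_create_recursive(). The helper name `relock_same_thread` and the timeout are illustrative assumptions; the timeout exists only so the demonstration returns instead of hanging.

```python
import threading


def relock_same_thread(lock, timeout=0.2):
    """Acquire `lock` twice from the same thread.

    Returns True if the second acquire succeeds, False if it times out.
    Without the timeout, a failed second acquire would block forever.
    """
    lock.acquire()
    try:
        got_it = lock.acquire(timeout=timeout)
        if got_it:
            lock.release()  # undo the nested acquire
        return got_it
    finally:
        lock.release()  # undo the outer acquire


# Non-recursive lock: the second acquire can never succeed in the same thread.
simple_ok = relock_same_thread(threading.Lock())
# Recursive lock: the owning thread may acquire it again.
recursive_ok = relock_same_thread(threading.RLock())

print(simple_ok, recursive_ok)  # False True
```

Both proposed fixes in the comments amount to moving from the first case to the second, so the nested lock attempt returns instead of blocking.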