Summary: | ASTERISK-26835: res_rtp_asterisk: Crash when freeing RTCP address string | ||||||
Reporter: | Niklas Larsson (pnlarsson) | Labels: | |||||
Date Opened: | 2017-03-03 02:02:06.000-0600 | Date Closed: | 2017-04-21 13:12:05 | ||||
Priority: | Major | Regression? | |||||
Status: | Closed/Complete | Components: | Resources/res_rtp_asterisk | ||||
Versions: | 13.14.0 | Frequency of Occurrence | Occasional | ||||
Related Issues: |
| ||||||
Environment: | Debian 8 | Attachments: | ( 0) 0001-res_rtp_asterisk-Set-rtp-rtcp-to-NULL-to-prevent-dou.patch ( 1) backtrace_20170306_clean.txt ( 2) backtrace_core.uc51-2017-03-17T08-34-30+0100.txt ( 3) backtrace_core.uc62-2017-03-02T11-47-33+0100.txt ( 4) backtrace-threads-clean.txt ( 5) core_show_locks.txt ( 6) core-asterisk-running-2017-03-31T09-33-43-0400-brief.txt ( 7) core-asterisk-running-2017-03-31T09-33-43-0400-full.txt ( 8) core-asterisk-running-2017-03-31T09-33-43-0400-locks.txt ( 9) core-asterisk-running-2017-03-31T09-33-43-0400-thread1.txt (10) core-asterisk-running-2017-04-03T09-07-25-0400-brief.txt (11) core-asterisk-running-2017-04-03T09-07-25-0400-full.txt (12) core-asterisk-running-2017-04-03T09-07-25-0400-locks.txt (13) core-asterisk-running-2017-04-03T09-07-25-0400-thread1.txt (14) example.mp3 | ||||
Description: | Now and then we get these segfaults, and they have been around for several versions (at least since 13.13, possibly earlier). | ||||||
Comments: | By: Asterisk Team (asteriskteam) 2017-03-03 02:02:06.861-0600 Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].
By: Ross Beer (rossbeer) 2017-03-06 06:33:08.874-0600 I am experiencing the same issue, please see attached backtrace.
By: Sean Bright (seanbright) 2017-03-08 15:52:46.529-0600 I've attached a total stab-in-the-dark patch. Could you give it a whirl and let me know?
By: Ross Beer (rossbeer) 2017-03-08 16:33:39.590-0600 I've applied the patch, I'll let you know the outcome. Thank you for your assistance.
By: Sean Bright (seanbright) 2017-03-08 16:36:01.483-0600 It's either going to have no effect whatsoever, or it may reduce the occurrence of the problem, but I don't believe it is a complete fix. If it reduces the occurrences, it's just turning a timing problem into a slightly less likely timing problem.
By: Ross Beer (rossbeer) 2017-03-13 12:26:55.520-0500 The patch is still in place with no further crashes. Could this patch be put on gerrit for inclusion in the releases?
By: Sean Bright (seanbright) 2017-03-13 14:02:12.811-0500 If it actually fixed the problem I would submit it for inclusion, but it doesn't. It just makes a crash less likely. Until there is an actual fix, I would suggest this issue remain open. In the meantime, you're obviously free to continue patching your local installation.
By: Niklas Larsson (pnlarsson) 2017-03-17 02:46:02.164-0500 With the stab in the dark patch applied
By: Richard Mudgett (rmudgett) 2017-03-27 17:22:56.596-0500 A patch is up for review at https://gerrit.asterisk.org/#/c/5341/ It needs some real-world testing. I have run it through the testsuite a couple of times and done some test calls.
By: Sebastian Gutierrez (sum) 2017-03-30 09:50:16.830-0500 Tested the patch in production to see if it resolves my DTLS timeout crash, but it was worse. I think a deadlock occurred and I had to go back to a previous version. It processed more than 1000 calls.
By: Richard Mudgett (rmudgett) 2017-03-30 15:41:13.584-0500 New patch up on gerrit. Still at https://gerrit.asterisk.org/#/c/5341/
By: Sebastian Gutierrez (sum) 2017-03-31 08:48:34.261-0500 Deadlocked again. This time we have all traces, with don't optimize and debug threads enabled, and if needed the core dump has malloc debug.
By: Sebastian Gutierrez (sum) 2017-04-03 09:09:02.592-0500 With the latest patch I had some RTP issues: the calls had cuts (seems like missing packets). I have attached all logs and an mp3. All calls behaved the same, and going back to a previous version solved the issue.
By: Richard Mudgett (rmudgett) 2017-04-03 12:59:55.357-0500 New patch to fix another race condition segfault exposed by the patch up on gerrit.
By: Richard Mudgett (rmudgett) 2017-04-04 16:53:54.920-0500 New patch to fix another race condition segfault exposed by the patch up on gerrit.
By: Richard Mudgett (rmudgett) 2017-04-06 13:17:25.959-0500 New patch up on gerrit. The new patch adds more protection from reinvites restarting ICE negotiations.
By: Sebastian Gutierrez (sum) 2017-04-10 14:24:57.027-0500 I had the same crash I was getting with patchset 5. I haven't tested patchset 6 (I will today).
{noformat}
#0 dtls_srtp_handle_timeout (instance=instance@entry=0x7f387c0131a0, rtcp=rtcp@entry=1) at res_rtp_asterisk.c:2050
#1 0x00007f38256f1408 in dtls_srtp_handle_rtcp_timeout (data=0x7f387c0131a0) at res_rtp_asterisk.c:2085
#2 0x00000000005b0fab in ast_sched_runq (con=0x1e89570) at sched.c:783
#3 0x00007f381ddfff8e in do_monitor (data=data@entry=0x0) at chan_sip.c:29615
#4 0x00000000005e817d in dummy_start (data=<optimized out>) at utils.c:1235
#5 0x00007f388aefb6ba in start_thread (arg=0x7f38156c4700) at pthread_create.c:333
#6 0x00007f388a4e482d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
{noformat}
I didn't get the full data because I think there are already several full logs of the issue. If needed I will take them.
By: Richard Mudgett (rmudgett) 2017-04-12 12:35:45.618-0500 Patch version 6 up on gerrit is expected to be merged in a few days (after a merge conflict is resolved). For those testing the patch, I haven't heard about how well the patch is working for you.
By: Ross Beer (rossbeer) 2017-04-12 12:41:53.849-0500 I've been testing this from around Patch V2 and it has resolved the crash and deadlocks I was getting. I don't use ICE and therefore didn't have any of the issues relating to that.
By: Ross Beer (rossbeer) 2017-04-14 02:31:12.025-0500 Bad news: when testing with Asterisk GIT-13-13.15.0-rc1-79-g5e2a8efM using patchset 7 there is a deadlock. However, using Asterisk GIT-13-13.15.0-rc1-64-ge851412M with patchset 6 there is no deadlock. This may mean the rebase has caused issues between patchset 6 and 7, or another piece of code has been committed that is causing a new deadlock.
By: Richard Mudgett (rmudgett) 2017-04-14 09:32:33.352-0500 [~rossbeer] You should know by now what kind of information we need to fix issues: See "Getting Information For A Deadlock" on this page https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information
By: Ross Beer (rossbeer) 2017-04-14 09:43:31.303-0500 @Richard Mudgett I did get a core dump. However, I ran the backtrace after downgrading to a previous version, so the stack was corrupt. I will install the latest GIT version plus patchset 7 and wait for a further deadlock. Once I have the required information I will update the ticket.
By: Ross Beer (rossbeer) 2017-04-14 17:13:46.272-0500 Richard, please find attached the backtrace and 'core show locks'
By: Richard Mudgett (rmudgett) 2017-04-14 17:28:17.577-0500 The patch for ASTERISK-26923 is causing the deadlock. Commit 3e7c396a51b240088c475dd53e7bac9869376129 Revert that commit and you shouldn't have a deadlock anymore.
By: Ross Beer (rossbeer) 2017-04-18 04:24:40.196-0500 I can confirm reverting the commit has resolved the deadlock. There have been no further issues with this patch since the 14th April.
By: Friendly Automation (friendly-automation) 2017-04-21 13:12:06.564-0500 Change 5342 merged by George Joseph: rtp_engine/res_rtp_asterisk: Fix RTP struct reentrancy crashes. [https://gerrit.asterisk.org/5342|https://gerrit.asterisk.org/5342]
By: Friendly Automation (friendly-automation) 2017-04-21 13:12:26.104-0500 Change 5341 merged by George Joseph: rtp_engine/res_rtp_asterisk: Fix RTP struct reentrancy crashes. [https://gerrit.asterisk.org/5341|https://gerrit.asterisk.org/5341]
By: Friendly Automation (friendly-automation) 2017-04-21 15:47:42.528-0500 Change 5343 merged by George Joseph: rtp_engine/res_rtp_asterisk: Fix RTP struct reentrancy crashes. [https://gerrit.asterisk.org/5343|https://gerrit.asterisk.org/5343] |
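Editorial illustration: the thread describes a first patch that sets rtp->rtcp to NULL and only narrows a timing window, followed by a merged change described as fixing "RTP struct reentrancy crashes". The sketch below is not the merged Asterisk code. It is a minimal, self-contained illustration of the general pattern, assuming the crash comes from two threads racing to release the same RTCP state. The types `rtp_state` and `rtcp_state` and every function name are hypothetical stand-ins, and the per-struct mutex is an assumption about how such a race is typically closed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, not the real res_rtp_asterisk structures. */
struct rtcp_state {
    char *local_addr_str; /* heap-allocated address string */
};

struct rtp_state {
    pthread_mutex_t lock;    /* guards the rtcp pointer */
    struct rtcp_state *rtcp; /* NULL once released */
};

/* Build an rtp_state that owns a private copy of addr; NULL on failure. */
static struct rtp_state *rtp_state_create(const char *addr)
{
    size_t len = strlen(addr) + 1;
    struct rtp_state *rtp = calloc(1, sizeof(*rtp));
    struct rtcp_state *rtcp = calloc(1, sizeof(*rtcp));
    char *copy = malloc(len);

    if (!rtp || !rtcp || !copy) {
        free(copy);
        free(rtcp);
        free(rtp);
        return NULL;
    }
    memcpy(copy, addr, len);
    rtcp->local_addr_str = copy;
    rtp->rtcp = rtcp;
    pthread_mutex_init(&rtp->lock, NULL);
    return rtp;
}

/*
 * Detach the rtcp pointer while holding the lock, then free it.
 * Returns 1 if this call released the state, 0 if it was already gone.
 */
static int rtp_state_release_rtcp(struct rtp_state *rtp)
{
    struct rtcp_state *rtcp;

    assert(rtp != NULL);
    pthread_mutex_lock(&rtp->lock);
    rtcp = rtp->rtcp;
    rtp->rtcp = NULL; /* later callers see NULL and do nothing */
    pthread_mutex_unlock(&rtp->lock);

    if (!rtcp) {
        return 0;
    }
    free(rtcp->local_addr_str);
    free(rtcp);
    return 1;
}

/* Tear down; the caller must ensure no other thread still uses rtp. */
static void rtp_state_destroy(struct rtp_state *rtp)
{
    rtp_state_release_rtcp(rtp);
    pthread_mutex_destroy(&rtp->lock);
    free(rtp);
}
```

Clearing the pointer without the lock would still let two threads read the same non-NULL value before either one clears it, which is consistent with the comment in the thread that the first patch only makes the timing problem less likely. Detaching the pointer under the lock means at most one caller ever frees the state.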