ASTERISK-24498: Segmentation fault in res_hep


Summary: ASTERISK-24498: Segmentation fault in res_hep_rtcp on attended transfer

Reporter: Beppo Mazzucato (beppo.it) Labels:

Date Opened: 2014-11-05 09:21:10.000-0600 Date Closed: 2014-11-12 18:02:04.000-0600

Priority: Major Regression? Yes

Status: Closed/Complete Components: Resources/res_hep_rtcp

Versions: 13.0.0 Frequency of Occurrence: Constant

Related Issues:
is a clone of ASTERISK-24508 pjsip - REFER request from SNOM is rejected with "400 bad request" - DEBUG shows "Received a REFER without a parseable Refer-To"

is duplicated by ASTERISK-24629 Asterisk crashing randomly, appears related to res_hep_rtcp

is duplicated by ASTERISK-24420 segfault when Zoiper joins ConfBridge

is related to ASTERISK-24489 Crash: Asterisk crashes when converting RTCP packet to JSON for res_hep_rtcp and report blocks are greater than 1

Environment: Asterisk 13.0.0 on CentOS 6.5
extension 601 - SNOM 710
extension 602 - yealink T46
extension 603 - Jitsi

Attachments: ( 0) ASTERISK-24498-13.diff
( 1) backtrace.txt
( 2) before_patch_ASTERISK-24498-13.txt
( 3) log.txt
( 4) log2.txt
( 5) snom.pcap

Description: Asterisk crashes when performing an attended transfer:
ext 602 calls ext 601
ext 601 puts the call on hold
ext 601 calls ext 603
when ext 603 answers, Asterisk crashes

An unattended transfer works properly.

If the attended transfer is made by the yealink phone (in other words, exchanging the roles of ext 601 and ext 602 above), it works properly.

The same scenario doesn't crash with Asterisk 11.13.1.

I'm attaching the log and backtrace.

Comments: By: Beppo Mazzucato (beppo.it) 2014-11-05 09:24:30.751-0600

Log and backtrace
By: Matt Jordan (mjordan) 2014-11-05 09:51:25.904-0600

Well, that's a bit odd. Somehow the RTCP payload looks to be 'bad', and when {{res_hep_rtcp}} went to decode it things went upside down:

{noformat}
#0 0x000000000058e51b in rtcp_report_to_json (msg=0x7fa410002658, sanitize=0x0) at rtp_engine.c:1913
1913 snprintf(str_lsr, sizeof(str_lsr), "%u", payload->report->report_block[i]->lsr);
#0 0x000000000058e51b in rtcp_report_to_json (msg=0x7fa410002658, sanitize=0x0) at rtp_engine.c:1913
json_report_block = 0x7fa400006128
str_lsr = "0\000蹣\177\000\000\024\270]1\377\177\000\000\260*蹣\177\000\000\000\000\000\000\000\000\000"
payload = 0x7fa410002468
json_rtcp_report = 0x0
json_rtcp_report_blocks = 0x7fa400002a68
json_rtcp_sender_info = 0x0
json_channel = 0x0
i = 1
{noformat}

I'm wondering what {{payload}} actually is at this point.

In the core file, can you use {{gdb}} to print the following:
{noformat}
# frame 0
# print *payload
# print payload->report
# print payload->report->report_block[0]
{noformat}
By: Beppo Mazzucato (beppo.it) 2014-11-05 10:25:45.902-0600

here you are
{noformat}
(gdb) frame 0
#0 0x000000000058e51b in rtcp_report_to_json (msg=0x7fa410002658, sanitize=0x0) at rtp_engine.c:1913
1913 snprintf(str_lsr, sizeof(str_lsr), "%u", payload->report->report_block[i]->lsr);
(gdb) print *payload
$1 = {snapshot = 0x7fa410005448, report = 0x7fa410003b88, blob = 0x7fa410001fb8}
(gdb) print payload->report
$2 = (struct ast_rtp_rtcp_report *) 0x7fa410003b88
(gdb) print payload->report->report_block[0]
$3 = (struct ast_rtp_rtcp_report_block *) 0x7fa4100025a0
(gdb)
{noformat}

By: Matt Jordan (mjordan) 2014-11-05 17:28:42.339-0600

Er... well that's odd :-) It actually has something valid in it at that point, which is not what I expected at all.

What is the value of {{payload->report->report_block[i]->lsr}}?

(Note: {{i}} should be equal to 0, since we only support a single report block - but if it isn't 0, then that would also explain the crash)
By: Beppo Mazzucato (beppo.it) 2014-11-06 01:46:12.590-0600

Well, {{i}} is equal to 1, and this is the cause of the crash. Sorry for not being able to help in more depth, but I'm not knowledgeable enough about this module. Let me know if there is anything else I can do to help.
{noformat}
(gdb) print i
$1 = 1
(gdb) print payload->report->report_block[i]
$2 = (struct ast_rtp_rtcp_report_block *) 0x0
(gdb) print payload->report->report_block[i]->lsr
Cannot access memory at address 0x14
{noformat}
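
For illustration only, the failure mode in the gdb output above can be sketched in a few lines of C. The structs are simplified stand-ins (their layout is an assumption, chosen so that {{lsr}} sits at offset 0x14, which matches the faulting address), and {{format_blocks()}} is a hypothetical helper. This is not the code from the attached patch, just one way such a guard might look.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for a report block; the field layout is an assumption. */
struct report_block {
    unsigned int source_ssrc;
    struct {
        unsigned short fraction;
        unsigned int packets;
    } lost_count;
    unsigned int highest_seq_no;
    unsigned int ia_jitter;
    unsigned int lsr;   /* offset 0x14 with 4-byte ints */
    unsigned int dlsr;
};

/* Hypothetical stand-in for the report: a claimed count plus block pointers. */
struct report {
    unsigned int reception_report_count;
    struct report_block *report_block[2];
};

/*
 * Format the LSR of every block that is actually present.
 * The loop is bounded by both the count claimed in the report and the
 * capacity of the pointer array, and NULL entries are skipped.
 * Returns the number of blocks that were formatted.
 */
static unsigned int format_blocks(const struct report *r, size_t capacity)
{
    unsigned int i, formatted = 0;
    char str_lsr[32];

    assert(r != NULL);

    for (i = 0; i < r->reception_report_count && i < capacity; i++) {
        if (r->report_block[i] == NULL) {
            /* Without this check, ->lsr would read from NULL + 0x14. */
            continue;
        }
        snprintf(str_lsr, sizeof(str_lsr), "%u", r->report_block[i]->lsr);
        formatted++;
    }
    return formatted;
}
```

With a report that claims two blocks but has only the first pointer populated (the situation shown in gdb), {{format_blocks()}} formats one block and skips the NULL entry instead of dereferencing it.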
By: Matt Jordan (mjordan) 2014-11-06 20:53:44.792-0600

Well, {{i}} really shouldn't be {{1}} unless something sent us more report blocks than we can handle... which is possible. We wouldn't know for sure without looking at a pcap of the RTCP message traffic.

I've attached a patch here that should prevent this from happening. Before applying it, can you verify one last value in {{gdb}}:

{noformat}
# print payload->report->reception_report_count
{noformat}

Thanks!
By: Beppo Mazzucato (beppo.it) 2014-11-07 07:26:45.129-0600

This is the requested output:
{noformat}
(gdb) print payload->report->reception_report_count
$1 = 2
(gdb)
{noformat}

I'm attaching the pcap capture for your further investigation.
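
As a reference for anyone reading the capture: per RFC 3550, the reception report count is carried in the low five bits of the first octet of an RTCP SR/RR packet. The sketch below decodes the fixed four-byte RTCP header. The struct and function names are hypothetical, and it is not taken from Asterisk's RTP code.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields of the fixed 4-byte RTCP header (RFC 3550, section 6.4). */
struct rtcp_header_info {
    unsigned int version;       /* 2 bits, should be 2 */
    unsigned int padding;       /* 1 bit */
    unsigned int report_count;  /* 5 bits: number of report blocks (RC) */
    unsigned int packet_type;   /* 200 = SR, 201 = RR */
    unsigned int length_words;  /* packet length in 32-bit words minus one */
};

/* Decode the header from buf; returns 0 on success, -1 if input is too short. */
static int parse_rtcp_header(const uint8_t *buf, size_t len,
                             struct rtcp_header_info *out)
{
    if (buf == NULL || out == NULL || len < 4) {
        return -1;
    }
    out->version = buf[0] >> 6;
    out->padding = (buf[0] >> 5) & 0x01;
    out->report_count = buf[0] & 0x1f;
    out->packet_type = buf[1];
    out->length_words = ((unsigned int)buf[2] << 8) | buf[3];
    return 0;
}
```

For example, a header of {{0x82 0xC9 0x00 0x0D}} (hand-written bytes, not taken from snom.pcap) decodes as version 2, a receiver report (201) carrying 2 report blocks, which is the kind of packet that would yield a {{reception_report_count}} of 2.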

I tested the patch. It doesn't crash anymore, but the transfer doesn't complete: Asterisk replies with a "400 Bad Request" to the REFER message from the snom phone. I'm attaching the log showing this; please let me know if you need the pcap.
By: Beppo Mazzucato (beppo.it) 2014-11-07 07:28:47.666-0600

pcap of the case causing the crash (before patching) and log of the case after patching.
By: Matt Jordan (mjordan) 2014-11-07 09:12:07.943-0600

So, this is now two separate issues. I'm going to create a different issue and link it to this one for the {{REFER}} request handling.

That being said, it is quite possible that the {{REFER}} request being sent to us is *not* good. We'll need a full DEBUG log for that - I'll comment on the new issue for that.
By: Gregory Malsack (gmalsack) 2014-11-11 08:57:21.211-0600

It would appear the patch ASTERISK-24498-13 has resolved this problem. I was not able to place a single call from windows zoiper prior to the patch, however after applying the patch I was able to make 2 successful calls back to back.

Thank You!
Greg