Summary: | ASTERISK-25823: SIGSEGV, Segmentation fault. - ../sysdeps/x86_64/strlen.S: No such file or directory. | ||||
Reporter: | Andreas Krüger (woopstar) | Labels: | |||
Date Opened: | 2016-03-01 06:35:22.000-0600 | Date Closed: | 2017-04-27 16:18:09 | ||
Priority: | Major | Regression? | |||
Status: | Closed/Complete | Components: | Resources/res_pjsip Resources/res_pjsip_caller_id | ||
Versions: | 13.7.2 | Frequency of Occurrence | Constant | ||
Related Issues: |
| ||||
Environment: | Ubuntu 14.04 - Asterisk 13.7.2 | Attachments: | |||
Description: | Asterisk is crashing with the following error, when we're trying to transfer a call. It seems it tries to call strlen which is either not available or the variable is null?
{code} #0 strlen () at ../sysdeps/x86_64/strlen.S:106 #1 0x00007fffaa40e64f in modify_id_header (pool=0x7fffec006ea0, id=id@entry=0x7fffb135fa90, id_hdr=0x7fffec007588) at res_pjsip_caller_id.c:415 #2 0x00007fffaa40ee6a in caller_id_outgoing_request (session=0x7fffec00d3c8, tdata=0x7fffec006f48) at res_pjsip_caller_id.c:683 #3 0x00007fffb2513e98 in handle_outgoing_request (session=0x7fffec00d3c8, tdata=0x7fffec006f48) at res_pjsip_session.c:2251 #4 0x00007fffb25157f3 in ast_sip_session_send_request_with_cb (session=0x7fffec00d3c8, tdata=0x7fffec006f48, on_response=<optimized out>) at res_pjsip_session.c:1089 #5 0x00007fff9d2a68d6 in call (data=0x7fffb800fe18) at chan_pjsip.c:1658 #6 0x00000000005e936c in ast_taskprocessor_execute (tps=0x7fffec00ddd8) at taskprocessor.c:784 #7 0x00000000005f22cb in execute_tasks (data=0x7fffec00ddd8) at threadpool.c:1320 #8 0x00000000005e936c in ast_taskprocessor_execute (tps=0xaaaa08) at taskprocessor.c:784 #9 0x00000000005f0412 in threadpool_execute (pool=0xaaac08) at threadpool.c:351 #10 0x00000000005f1be6 in worker_active (worker=0x7fffe0000f28) at threadpool.c:1103 #11 0x00000000005f19a3 in worker_start (arg=0x7fffe0000f28) at threadpool.c:1023 #12 0x00000000005fdf6e in dummy_start (data=0x7fffe0000eb0) at utils.c:1237 #13 0x00007ffff60580a5 in start_thread (arg=0x7fffb1360700) at pthread_create.c:309 #14 0x00007ffff563bcfd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 {code} And the full bt is here: {code} #0 strlen () at ../sysdeps/x86_64/strlen.S:106 No locals. #1 0x00007fffaa40e64f in modify_id_header (pool=0x7fffec006ea0, id=id@entry=0x7fffb135fa90, id_hdr=0x7fffec007588) at res_pjsip_caller_id.c:415 name_buf_len = <optimized out> name_buf = <optimized out> id_name_addr = 0x7fffec007610 #2 0x00007fffaa40ee6a in caller_id_outgoing_request (session=0x7fffec00d3c8, tdata=0x7fffec006f48) at res_pjsip_caller_id.c:683 from = 0x7fffec007588 dlg = 0x7fffec00e3c8 effective_id = {name = {str = 0x0, char_set = 1, presentation = 0, valid = 1 '\001'}, number = {str = 0x7fffb8024f70 "22343661", plan = 0, presentation = 0, valid = 1 '\001'}, subaddress = {str = 0x0, type = 0, odd_even_indicator = 0 '\000', valid = 0 '\000'}, tag = 0x0} connected_id = {name = {str = 0x0, char_set = 1, presentation = 0, valid = 1 '\001'}, number = {str = 0x7fffec0205a0 "22343661", plan = 0, presentation = 0, valid = 1 '\001'}, subaddress = {str = 0x0, type = 0, odd_even_indicator = 0 '\000', valid = 0 '\000'}, tag = 0x0} tdata = 0x7fffec006f48 session = 0x7fffec00d3c8 #3 0x00007fffb2513e98 in handle_outgoing_request (session=0x7fffec00d3c8, tdata=0x7fffec006f48) at res_pjsip_session.c:2251 supplement = 0x7fffec00e270 req = {method = {id = PJSIP_INVITE_METHOD, name = {ptr = 0x7fffb3bb819b "INVITE", slen = 6}}, uri = 0x7fffec007480} __PRETTY_FUNCTION__ = "handle_outgoing_request" #4 0x00007fffb25157f3 in ast_sip_session_send_request_with_cb (session=0x7fffec00d3c8, tdata=0x7fffec006f48, on_response=<optimized out>) at res_pjsip_session.c:1089 on_response = <optimized out> tdata = 0x7fffec006f48 session = 0x7fffec00d3c8 inv_session = <optimized out> #5 0x00007fff9d2a68d6 in call (data=0x7fffb800fe18) at chan_pjsip.c:1658 channel = 0x7fffb800fe18 session = 0x7fffec00d3c8 pvt = <optimized out> tdata = 0x7fffec006f48 res = 0 #6 0x00000000005e936c in ast_taskprocessor_execute (tps=0x7fffec00ddd8) at taskprocessor.c:784 local = {local_data = 0x0, data = 0x5fc633 <ast_threadstorage_set_ptr+60>} t = 0x7fffb80015f0 size = 1 __PRETTY_FUNCTION__ = "ast_taskprocessor_execute" #7 0x00000000005f22cb in execute_tasks (data=0x7fffec00ddd8) at threadpool.c:1320 tps = 0x7fffec00ddd8 #8 0x00000000005e936c in ast_taskprocessor_execute (tps=0xaaaa08) at taskprocessor.c:784 local = {local_data = 0x0, data = 0xaaabe8} t = 0x7fffb8024f00 size = 11185160 __PRETTY_FUNCTION__ = "ast_taskprocessor_execute" #9 0x00000000005f0412 in threadpool_execute (pool=0xaaac08) at threadpool.c:351 __PRETTY_FUNCTION__ = "threadpool_execute" #10 0x00000000005f1be6 in worker_active (worker=0x7fffe0000f28) at threadpool.c:1103 alive = 0 #11 0x00000000005f19a3 in worker_start (arg=0x7fffe0000f28) at threadpool.c:1023 worker = 0x7fffe0000f28 __PRETTY_FUNCTION__ = "worker_start" #12 0x00000000005fdf6e in dummy_start (data=0x7fffe0000eb0) at utils.c:1237 __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {140736166496000, -2886013801213841322, 1, 0, 140736166496704, 140736166496000, -2886013801239007146, 2886160201149067350}, __mask_was_saved = 0}}, __pad = {0x7fffb135fef0, 0x0, 0x0, 0x0}} __cancel_routine = 0x451320 <ast_unregister_thread> __cancel_arg = 0x7fffb1360700 __not_first_call = 0 ret = 0x0 a = {start_routine = 0x5f191c <worker_start>, data = 0x7fffe0000f28, name = 0x7fffe00008f0 "worker_start started at [ 1077] threadpool.c worker_thread_start()"} #13 0x00007ffff60580a5 in start_thread (arg=0x7fffb1360700) at pthread_create.c:309 __res = <optimized out> pd = 0x7fffb1360700 now = <optimized out> unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140736166496000, 2886160671884849238, 1, 0, 140736166496704, 140736166496000, -2886013801215938474, -2886139004426899370}, mask_was_saved = 0}}, priv = {pad = { 0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}} not_first_call = <optimized out> pagesize_m1 = <optimized out> sp = <optimized out> freesize = <optimized out> __PRETTY_FUNCTION__ = "start_thread" #14 0x00007ffff563bcfd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 {code} It seems it was trigged, if we did not set the name on the channel. When we configured the name property on the channel, everything worked. So i guess according to Asterisk, the problem is in the file {code}res_pjsip_caller_id.c{code} at line 415. The validation check {code}if (id->name.valid) {{code} is true, but {code}id->name.str{code} is 0x0 when name is not set. This causes strlen to fail in the line {code}int name_buf_len = strlen(id->name.str) * 2 + 1;{code} The following code was bogus: {code} // OutboundCallerID is fetched from mysql though ODBC, example: 22556644 if ("${OutboundCallerID}" != "") { Set(CALLERID(num)=${OutboundCallerID}); } Dial(PJSIP/${number}@${TrunkName},${DIALTIMEOUT},${DIALOPTIONS}U(onConnect,${CallInfoId})); {code} And the following works: {code} // OutboundCallerID is fetched from mysql though ODBC, example: 22556644 if ("${OutboundCallerID}" != "") { Set(CALLERID(num)=${OutboundCallerID}); Set(CALLERID(name)=${OutboundCallerID}); } Dial(PJSIP/${number}@${TrunkName},${DIALTIMEOUT},${DIALOPTIONS}U(onConnect,${CallInfoId})); {code} Regarding the "pjsip set logger on" - it never reaches to this. No SIP packages is set, so this is just empty. | ||||
Comments: | By: Asterisk Team (asteriskteam) 2016-03-01 06:35:23.622-0600 The severity of this issue has been automatically downgraded from "Blocker" to "Major". The "Blocker" severity is reserved for issues which have been determined to block the next release of Asterisk. This severity can only be set by privileged users. If this issue is deemed to block the next release it will be updated accordingly during the triage process. By: Asterisk Team (asteriskteam) 2016-03-01 06:35:24.031-0600 Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process]. By: Andreas Krüger (woopstar) 2016-03-01 06:36:58.557-0600 Forgot to add, that GDB said, that strlen.S was not found as file. {code} Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fffb1360700 (LWP 47935)] strlen () at ../sysdeps/x86_64/strlen.S:106 106 ../sysdeps/x86_64/strlen.S: No such file or directory. {code} By: Rusty Newton (rnewton) 2016-03-01 15:04:44.383-0600 Sounds like are able to reproduce the crash on a transfer. Can you provide the dialplan that is used for the transfer along with a debug trace of the call in progress? https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information Please follow those instructions exactly for the debug trace. Make sure to include the "pjsip set logger on" output and verify "DEBUG" messages are in the log. By: Andreas Krüger (woopstar) 2016-03-02 04:01:39.952-0600 Hi Rusty, You're right that this is caused on a transfer. <snip> [Edit by Rusty - Moved your comment up into the description since it helps a ton. Thanks!] By: Rusty Newton (rnewton) 2016-03-02 07:36:35.118-0600 Thanks! By: Andreas Krüger (woopstar) 2016-03-02 07:40:44.556-0600 Sure! :) Hope it will be fixed soon and released It prop. not that a common error though. By: George Joseph (gjoseph) 2016-03-14 19:50:41.194-0500 I fixed a similar issue last month. Can you check if that fix resolved your issue as well? Pull the current 13 git branch to check. By: Andreas Krüger (woopstar) 2016-03-15 03:31:23.503-0500 Do you remember the commit id? Would else have to check a lot :( By: George Joseph (gjoseph) 2016-03-15 08:39:36.661-0500 34c64707d1aa346fb0e9c7f97e375d22dedf67d9 res_pjsip_caller_id: Fix segfault when replacing rpid or pai header By: Andreas Krüger (woopstar) 2016-03-29 08:56:41.436-0500 Problem still exists. And now it has been moved to: https://github.com/asterisk/asterisk/commit/f0799da3ac7e13638838063fe3cf0d041daea520#diff-dfef39ef55b6e289b29d9093cee50662R3917 The strlen function must validate, that it actually contains a string to do strlen on. By: Kevin Harwell (kharwell) 2017-04-26 13:11:53.845-0500 I have been unable to duplicate this problem. [~woopstar] Do you know if this still happens for you in the latest release of Asterisk 13 (13.7.2 is a few versions back now and some work has been done in the pjsip callerid area since then)? Also, I've been looking through the code to see how the caller id name string could be NULL, but considered valid. One way is if in the dialplan it was specified as such. Do you see or have anywhere in your dialplan that sets the valid flag? Something similar to the following: {noformat} Set(CALLERID(name-valid)=1 {noformat} By: Kevin Harwell (kharwell) 2017-04-26 13:53:01.996-0500 I take back some of the previous comment. There does appear to be other checks against caller id being NULL strewn throughout the code even when marked valid. So doing the same in this instance would not be unprecedented. That is also the quick and easy fix. It'd be nice to know the root cause of how it came to be in that state, but that may not be possible. So barring that and/or more details I'll put a NULL check in. Which no matter the root cause is probably the right fix anyway given that via dialplan one can specify the caller id name to be valid when it is not. By: Andreas Krüger (woopstar) 2017-04-26 13:55:53.783-0500 The problem was us not setting the name right. I believe you still need to do the strlen check and null check, but after we were setting the name the problem was never happening again. Does that answer suffice ? By: Kevin Harwell (kharwell) 2017-04-26 14:28:10.372-0500 Yeah that's fine. I have the info I need. Patch is up for review: https://gerrit.asterisk.org/#/c/5541 By: Friendly Automation (friendly-automation) 2017-04-27 16:18:09.934-0500 Change 5541 merged by Jenkins2: res_pjsip/res_pjsip_callerid: NULL check on caller id name string [https://gerrit.asterisk.org/5541|https://gerrit.asterisk.org/5541] By: Friendly Automation (friendly-automation) 2017-04-27 16:43:27.004-0500 Change 5542 merged by Jenkins2: res_pjsip/res_pjsip_callerid: NULL check on caller id name string [https://gerrit.asterisk.org/5542|https://gerrit.asterisk.org/5542] By: Friendly Automation (friendly-automation) 2017-04-27 16:48:56.659-0500 Change 5543 merged by Jenkins2: res_pjsip/res_pjsip_callerid: NULL check on caller id name string [https://gerrit.asterisk.org/5543|https://gerrit.asterisk.org/5543] |