Summary: | ASTERISK-25183: PJSIP: Crash on NULL channel in chan_pjsip_incoming_response despite previous checks for NULL channel | ||||
Reporter: | Matt Jordan (mjordan) | Labels: | |||
Date Opened: | 2015-06-22 09:39:08 | Date Closed: | 2015-07-27 11:32:41 | ||
Priority: | Major | Regression? | |||
Status: | Closed/Complete | Components: | Channels/chan_pjsip | ||
Versions: | Frequency of Occurrence | ||||
Related Issues: |
| ||||
Environment: | Attachments: | ( 0) backtrace_2003.txt ( 1) full.txt ( 2) messages.txt | |||
Description: | Note that this was caught by the {{channels/pjsip/basic_calls/outgoing/off-nominal/bob_incompatible_codecs}} test in the Test Suite.
A crash occurred in the previously mentioned test due to the channel being NULL and its name being retrieved:
{code}
#0  0x000000000054494c in ast_channel_name (chan=0x0) at channel_internal_api.c:476
476	DEFINE_STRINGFIELD_GETTER_FOR(name);
#0  0x000000000054494c in ast_channel_name (chan=0x0) at channel_internal_api.c:476
No locals.
#1  0x00007f42987ebce1 in chan_pjsip_incoming_response (session=0x7f42d000d3b8, rdata=0x7f4308022d98) at chan_pjsip.c:2224
        status = {code = 200, reason = {ptr = 0x7f4308024100 "OK", slen = 2}}
        cause_code = 0x7f42da36d670
        data_size = 102
        __PRETTY_FUNCTION__ = "chan_pjsip_incoming_response"
#2  0x00007f42de1a9078 in handle_incoming_response (session=0x7f42d000d3b8, rdata=0x7f4308022d98, type=PJSIP_EVENT_TSX_STATE, response_priority=AST_SIP_SESSION_AFTER_MEDIA) at res_pjsip_session.c:2187
        supplement = 0x7f42d000f7c0
        status = {code = 200, reason = {ptr = 0x7f4308024100 "OK", slen = 2}}
        __PRETTY_FUNCTION__ = "handle_incoming_response"
#3  0x00007f42de1a923f in handle_incoming (session=0x7f42d000d3b8, rdata=0x7f4308022d98, type=PJSIP_EVENT_TSX_STATE, response_priority=AST_SIP_SESSION_AFTER_MEDIA) at res_pjsip_session.c:2201
        __PRETTY_FUNCTION__ = "handle_incoming"
{code}
In Asterisk 13, this corresponds to this line of code:
{code}
/* Build and send the tech-specific cause information */
/* size of the string making up the cause code is "SIP " number + " " + reason length */
data_size += 4 + 4 + pj_strlen(&status.reason);
cause_code = ast_alloca(data_size);
memset(cause_code, 0, data_size);
ast_copy_string(cause_code->chan_name, ast_channel_name(session->channel), AST_CHANNEL_NAME); // THIS LINE HERE
{code}
However, we previously explicitly check that the channel is non-NULL before proceeding in this function:
{code}
if (!session->channel) {
	return;
}
{code}
Which ... doesn't make much sense. Even if we had a reference counting issue, this should have pointed to garbage.
However, we can see that we are hanging up a channel at this moment in time:
{code}
Thread 70 (Thread 0x7f42da3ea700 (LWP 7686)):
#0  0x00000000005fee68 in __ast_pthread_mutex_lock (filename=0x7fa55b "astmm.c", lineno=360, func=0x7fb2a7 "region_free", mutex_name=0x7fa5cb "&reglock", t=0xadfb40) at lock.c:313
#1  0x000000000047bd8e in region_free (freed=0xb17040, reg=0x7f42f800a580) at astmm.c:360
#2  0x000000000047c4e3 in __ast_free_region (ptr=0x7f42f800a610, file=0x7fb5ab "astobj2.c", lineno=461, func=0x7fb840 "internal_ao2_ref") at astmm.c:479
#3  0x000000000047c81e in __ast_free (ptr=0x7f42f800a610, file=0x7fb5ab "astobj2.c", lineno=461, func=0x7fb840 "internal_ao2_ref") at astmm.c:532
#4  0x000000000048142d in internal_ao2_ref (user_data=0x7f42f800a668, delta=-1, file=0x7fb5ab "astobj2.c", line=516, func=0x7fb823 "__ao2_ref") at astobj2.c:461
#5  0x0000000000481969 in __ao2_ref (user_data=0x7f42f800a668, delta=-1) at astobj2.c:516
#6  0x0000000000481a4a in __ao2_cleanup (obj=0x7f42f800a668) at astobj2.c:529
#7  0x00007f42987e9ced in hangup (data=0x7f4314006578) at chan_pjsip.c:1744
#8  0x000000000072a453 in ast_taskprocessor_execute (tps=0x7f42d000e698) at taskprocessor.c:768
#9  0x000000000073dba0 in execute_tasks (data=0x7f42d000e698) at threadpool.c:1157
#10 0x000000000072a453 in ast_taskprocessor_execute (tps=0x1484fa8) at taskprocessor.c:768
#11 0x000000000073b0c5 in threadpool_execute (pool=0x1653518) at threadpool.c:351
#12 0x000000000073d677 in worker_active (worker=0x7f42cc001cc8) at threadpool.c:1075
#13 0x000000000073d2c2 in worker_start (arg=0x7f42cc001cc8) at threadpool.c:995
#14 0x0000000000750540 in dummy_start (data=0x7f42cc001f10) at utils.c:1237
#15 0x00000034ac6079d1 in start_thread () from /lib64/libpthread.so.0
#16 0x00000034ac2e89dd in clone () from /lib64/libc.so.6
{code}
Which means that we can probably still skip past the first check on line {{2195}}, and have the {{hangup}} callback nuke out the {{session->channel}} pointer. Egads.
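To illustrate the suspected window between the NULL check and the later use of {{session->channel}}, here is a minimal, self-contained sketch of one common way to close it: do the check and the read inside a single critical section that the hangup path also takes. This is not the Asterisk code and not necessarily the fix that was applied. {{demo_channel}}, {{demo_session}}, {{demo_session_init}}, {{demo_hangup}} and {{demo_copy_channel_name}} are hypothetical names invented for this example, and the plain pthread mutex is an assumption standing in for whatever serialization the real code uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_NAME_LEN 80

/* Hypothetical stand-ins for the real channel and session structures. */
struct demo_channel {
    char name[DEMO_NAME_LEN];
};

struct demo_session {
    pthread_mutex_t lock;          /* guards the channel pointer */
    struct demo_channel *channel;  /* set to NULL on hangup */
};

/* Initialize a session that points at an existing channel. */
static void demo_session_init(struct demo_session *session,
                              struct demo_channel *channel)
{
    pthread_mutex_init(&session->lock, NULL);
    session->channel = channel;
}

/* Hangup path: detach the channel while holding the lock. */
static void demo_hangup(struct demo_session *session)
{
    pthread_mutex_lock(&session->lock);
    session->channel = NULL;
    pthread_mutex_unlock(&session->lock);
}

/*
 * Response path: check and use the pointer in one critical section, so a
 * concurrent demo_hangup() cannot clear it between the check and the read.
 * Returns 0 and fills buf on success, -1 if the channel is already gone.
 */
static int demo_copy_channel_name(struct demo_session *session,
                                  char *buf, size_t len)
{
    int res = -1;

    assert(buf != NULL && len > 0);

    pthread_mutex_lock(&session->lock);
    if (session->channel) {
        snprintf(buf, len, "%s", session->channel->name);
        res = 0;
    }
    pthread_mutex_unlock(&session->lock);

    return res;
}
```

With this shape, a caller that loses the race gets {{-1}} back and can skip building the cause code instead of dereferencing a NULL channel. In this sketch the copy is taken while the lock is held, so the caller never keeps a pointer that the hangup path could invalidate afterwards.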
Logs and backtrace attached. | ||||
Comments: | By: Asterisk Team (asteriskteam) 2015-07-26 11:53:12.558-0500 This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable. |