Summary:                  | ASTERISK-25323: Asterisk: ongoing segfaults uncovered by CHAOS_DEBUG
Reporter:                 | Scott Griepentrog (sgriepentrog)
Labels:                   |
Date Opened:              | 2015-08-14 08:18:19
Date Closed:              |
Priority:                 | Minor
Regression?               |
Status:                   | Open/New
Components:               | General
Versions:                 | 13.0.0
Frequency of Occurrence:  |
Related Issues:           |
Environment:              |
Attachments:              | (0) backtrace-core.11062.txt
                          | (1) backtrace-core.12412.txt
                          | (2) backtrace-core.12729.txt
                          | (3) backtrace-core.29894.txt
                          | (4) full-log-core.12729.txt
Description:

Ongoing use of CHAOS_DEBUG, which randomly simulates failed allocations, is uncovering extremely unlikely scenarios on a test server that can result in a segfault.

Rather than create a new issue for each case, I'm grouping them under this single issue as they are found.
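For context, allocation-failure injection of this kind can be sketched roughly as follows. This is a hypothetical wrapper for illustration only, not the actual CHAOS_DEBUG implementation:

{noformat}
/* Hypothetical illustration of random allocation-failure injection,
 * in the spirit of CHAOS_DEBUG; not the actual Asterisk code. */
#include <stdlib.h>

static void *chaos_malloc(size_t len)
{
	/* Fail roughly 1 in 100 allocations on purpose so that
	 * rarely exercised error-handling paths get tested. */
	if (rand() % 100 == 0) {
		return NULL;
	}
	return malloc(len);
}
{noformat}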
Comments:

By: Scott Griepentrog (sgriepentrog) 2015-08-17 10:20:42.618-0500

Instance #1: crash in ast_str_hash due to null str

{noformat}
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `asterisk -fvvvvvdddddgn'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fe9dc111f4a in ast_str_hash (str=0x0) at /root/13-c57b78d4c94e592d069521123a24fcb80524a893/include/asterisk/strings.h:1180
1180            while (*str)
#0  0x00007fe9dc111f4a in ast_str_hash (str=0x0) at /root/13-c57b78d4c94e592d069521123a24fcb80524a893/include/asterisk/strings.h:1180
        hash = 5381
#1  0x00007fe9dc111f99 in sorcery_memory_hash (obj=0x0, flags=64) at res_sorcery_memory.c:88
        id = 0x0
#2  0x000000000045ffb6 in hash_ao2_find_first (self=0x1ed5568, flags=65, arg=0x0, state=0x7fe9d270f5e0) at astobj2_hash.c:388
        node = 0x1ed5510
        bucket_cur = 0
        cmp = 7185311
#3  0x000000000045e34f in internal_ao2_traverse (self=0x1ed5568, flags=65, cb_fn=0x7fe9dc111f9b <sorcery_memory_cmp>, arg=0x0, data=0x0, type=AO2_CALLBACK_DEFAULT, tag=0x0, file=0x0, line=0, func=0x0) at astobj2_container.c:341
        ret = 0x0
        cb_default = 0x7fe9dc111f9b <sorcery_memory_cmp>
        cb_withdata = 0x0
        node = 0x7fe9d270f7b0
        traversal_state = 0x7fe9d270f5e0
        orig_lock = AO2_LOCK_REQ_MUTEX
        multi_container = 0x0
        multi_iterator = 0x0
        __PRETTY_FUNCTION__ = "internal_ao2_traverse"
#4  0x000000000045e6bb in __ao2_callback (c=0x1ed5568, flags=65, cb_fn=0x7fe9dc111f9b <sorcery_memory_cmp>, arg=0x0) at astobj2_container.c:452
No locals.
#5  0x000000000045e86e in __ao2_find (c=0x1ed5568, arg=0x0, flags=65) at astobj2_container.c:493
        arged = 0x0
        __PRETTY_FUNCTION__ = "__ao2_find"
#6  0x00007fe9dc11240a in sorcery_memory_update (sorcery=0x1e77d28, data=0x1ed5568, object=0x7fe9f4003ca0) at res_sorcery_memory.c:193
        existing = 0x0
        __PRETTY_FUNCTION__ = "sorcery_memory_update"
#7  0x00000000005c16e7 in sorcery_wizard_update (obj=0x1ed4ae8, arg=0x7fe9d270f880, flags=0) at sorcery.c:2045
        object_wizard = 0x1ed4ae8
        details = 0x7fe9d270f880
        __PRETTY_FUNCTION__ = "sorcery_wizard_update"
#8  0x00000000005c1829 in ast_sorcery_update (sorcery=0x1e77d28, object=0x7fe9f4003ca0) at sorcery.c:2068
        details = 0x7fe9f4003ca0
        object_type = 0x1ed4b68
        object_wizard = 0x0
        found_wizard = 0x1ed4ae8
        i = 0
        sdetails = {sorcery = 0x1e77d28, obj = 0x7fe9f4003ca0}
        __PRETTY_FUNCTION__ = "ast_sorcery_update"
#9  0x00007fe9d32cd802 in update_contact_status (contact=0x1fae9b0, value=value@entry=AVAILABLE) at res_pjsip/pjsip_options.c:162
        status = 0x7fe9f4001dd0
        update = 0x7fe9f4003ca0
        __PRETTY_FUNCTION__ = "update_contact_status"
#10 0x00007fe9d32cdd62 in qualify_contact_cb (token=0x1fae9b0, e=<optimized out>) at res_pjsip/pjsip_options.c:303
        contact = 0x1fae9b0
        __PRETTY_FUNCTION__ = "qualify_contact_cb"
#11 0x00007fe9d32c8e00 in send_request_cb (token=0x7fe9f4000c90, e=0x7fe9d270f9c0) at res_pjsip.c:3224
        req_data = 0x7fe9f4000c90
        tsx = 0x1c24e78
        challenge = 0x7fe9e0051828
        tdata = 0x7fe9f4000cc0
        supplement = <optimized out>
        endpoint = <optimized out>
        res = <optimized out>
        __PRETTY_FUNCTION__ = "send_request_cb"
#12 0x00007fe9d32c8657 in endpt_send_request_cb (token=0x7fe9f4000ce0, e=0x7fe9d270f9c0) at res_pjsip.c:3005
        req_wrapper = 0x7fe9f4000ce0
        __PRETTY_FUNCTION__ = "endpt_send_request_cb"
#13 0x00007fe9df28697a in tsx_set_state () from /lib64/libpjsip.so.2
No symbol table info available.
#14 0x00007fe9df2885f6 in tsx_on_state_proceeding_uac () from /lib64/libpjsip.so.2
No symbol table info available.
#15 0x00007fe9df28883d in tsx_on_state_calling () from /lib64/libpjsip.so.2
No symbol table info available.
#16 0x00007fe9df289d0f in pjsip_tsx_recv_msg () from /lib64/libpjsip.so.2
No symbol table info available.
#17 0x00007fe9df289db5 in mod_tsx_layer_on_rx_response () from /lib64/libpjsip.so.2
No symbol table info available.
#18 0x00007fe9df27445f in pjsip_endpt_process_rx_data () from /lib64/libpjsip.so.2
No symbol table info available.
#19 0x00007fe9d32d27b9 in distribute (data=0x7fe9e0051828) at res_pjsip/pjsip_distributor.c:439
        param = {start_prio = 0, start_mod = 0x7fe9d34f0760 <distributor_mod>, idx_after_start = 1, silent = 0}
        handled = 0
        rdata = 0x7fe9e0051828
        is_request = 0
        is_ack = 0
        endpoint = <optimized out>
#20 0x00000000005e0e92 in ast_taskprocessor_execute (tps=0x1e762b8) at taskprocessor.c:768
        local = {local_data = 0x7fe9d27109c0, data = 0x5f6c79 <ast_threadstorage_set_ptr+60>}
        t = 0x7fe9e04f25d0
        size = 0
        __PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#21 0x00000000005ecccd in execute_tasks (data=0x1e762b8) at threadpool.c:1269
        tps = 0x1e762b8
#22 0x00000000005e0e92 in ast_taskprocessor_execute (tps=0x1e765b8) at taskprocessor.c:768
        local = {local_data = 0x55cc0c76, data = 0x1e60a80}
        t = 0x7fe9e04b7e20
        size = 0
        __PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#23 0x00000000005eb144 in threadpool_execute (pool=0x1e60ad8) at threadpool.c:351
        __PRETTY_FUNCTION__ = "threadpool_execute"
#24 0x00000000005ec662 in worker_active (worker=0x7fe9e8006bc8) at threadpool.c:1075
        alive = 0
#25 0x00000000005ec41f in worker_start (arg=0x7fe9e8006bc8) at threadpool.c:995
        worker = 0x7fe9e8006bc8
        __PRETTY_FUNCTION__ = "worker_start"
#26 0x00000000005f855b in dummy_start (data=0x7fe9e801c030) at utils.c:1237
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {0, 7449420058881167766, 0, 140642234730944, 140642234730240, 20, 7449420058906333590, -7443873705584488042}, __mask_was_saved = 0}}, __pad = {0x7fe9d270fdf0, 0x0, 0x0, 0x7fe9ff6162e8 <__pthread_keys+8>}}
        __cancel_routine = 0x4510a7 <ast_unregister_thread>
        __cancel_arg = 0x7fe9d2710700
        __not_first_call = 0
        ret = 0x7fe9fe9a7860 <internal_trans_names.8316>
        a = {start_routine = 0x5ec398 <worker_start>, data = 0x7fe9e8006bc8, name = 0x7fe9e8002fc0 "worker_start started at [ 1049] threadpool.c worker_thread_start()"}
#27 0x00007fe9ff406df5 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#28 0x00007fe9fe6e71ad in clone () from /lib64/libc.so.6
{noformat}

By: Richard Mudgett (rmudgett) 2015-08-17 10:39:51.825-0500

[~sgriepentrog] The crash is caused because {{ast_sorcery_alloc()}} does not check for failure of the {{ast_strdup()}} call that assigns the sorcery id member string.
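The defensive pattern that diagnosis implies, verifying the duplicated id before it can ever reach the hash function, can be sketched in isolation. This is a simplified stand-in using plain libc calls, not the actual {{ast_sorcery_alloc()}} code:

{noformat}
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the sorcery object; the real structure
 * and allocation logic live in main/sorcery.c. */
struct example_object {
	char *id;
};

static struct example_object *example_alloc(const char *id)
{
	struct example_object *obj = calloc(1, sizeof(*obj));

	if (!obj) {
		return NULL;
	}

	/* The crash in instance #1 happened because a strdup()-style
	 * result was stored without a check like this one, leaving a
	 * NULL id to be dereferenced later by the hash function. */
	obj->id = strdup(id);
	if (!obj->id) {
		free(obj);
		return NULL;
	}

	return obj;
}
{noformat}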
By: Scott Griepentrog (sgriepentrog) 2015-08-18 17:26:17.805-0500

Instance #2: assert from base_process_dial_end [^backtrace-core.12412.txt]

This may be considered less a fixable bug than an intended crash, as it is an assert that does not normally fire unless DO_CRASH is enabled.

By: Scott Griepentrog (sgriepentrog) 2015-08-18 17:50:46.391-0500

Instance #3: crash on null contact_hdr uri [^backtrace-core.29894.txt]

By: Scott Griepentrog (sgriepentrog) 2015-08-26 14:15:52.596-0500

Instance #4: reference to pvt when channel is NULL [^backtrace-core.12729.txt] [^full-log-core.12729.txt]

By: Scott Griepentrog (sgriepentrog) 2015-08-26 15:25:18.138-0500

Instance #5: strndup alloc failed in get_media_encryption_type [^backtrace-core.11062.txt]

By: Scott Griepentrog (sgriepentrog) 2015-09-08 10:46:14.236-0500

I'm including malloc-failure-related crashes on this issue even where they were not triggered by CHAOS_DEBUG but by an honest malloc failure, detected via a crash on a RAM-limited (no swap) test machine under stress.
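One common way to reproduce such honest allocation failures without a dedicated RAM-limited machine is to cap the process address space so that malloc() genuinely starts failing under load. This is a hypothetical helper for illustration, not part of the test setup described above:

{noformat}
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: cap the process address space so that
 * malloc() fails for real under stress, similar in effect to
 * the RAM-limited (no swap) test machine described above. */
static int limit_address_space(rlim_t bytes)
{
	struct rlimit rl = { .rlim_cur = bytes, .rlim_max = bytes };

	if (setrlimit(RLIMIT_AS, &rl) != 0) {
		perror("setrlimit");
		return -1;
	}
	return 0;
}
{noformat}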