Summary: | ASTERISK-26679: Crash on invalid contact domain (pjsip aor) | ||||
Reporter: | Dmitriy (kasper) | Labels: | |||
Date Opened: | 2016-12-27 13:02:36.000-0600 | Date Closed: | 2017-02-02 09:36:28.000-0600 | ||
Priority: | Major | Regression? | |||
Status: | Closed/Complete | Components: | Resources/res_pjsip | ||
Versions: | 13.13.1 | Frequency of Occurrence | Frequent | ||
Related Issues: |
| ||||
Environment: | debian 8.1 | Attachments: | ( 0) backtrace.txt | ||
Description: | contact=sip:sipuser
{quote} [Dec 27 21:38:38] WARNING[24817]: pjproject:0 <?>: tsx0x7f5494015 Failed to send Request msg OPTIONS/cseq=20579 (tdta0x7f549401d650)! err=320053 (DNS "Name Error" (PJLIB_UTIL_EDNS_NXDOMAIN)) [Dec 27 21:39:38] WARNING[24818]: pjproject:0 <?>: tsx0x7f5494015 .Failed to send Request msg OPTIONS/cseq=59404 (tdta0x7f549401d650)! err=320053 (DNS "Name Error" (PJLIB_UTIL_EDNS_NXDOMAIN)) [Dec 27 21:39:38] ERROR[24818]: res_pjsip.c:3591 endpt_send_request: Error 320053 'DNS "Name Error" (PJLIB_UTIL_EDNS_NXDOMAIN)' sending OPTIONS request to endpoint extline-42 [Dec 27 21:39:38] ERROR[24818]: astobj2.c:131 INTERNAL_OBJ: FRACK!, Failed assertion bad magic number 0x0 for object 0x7f549401d5b0 (0) Got 19 backtrace records #0: [0x5f9365] main/utils.c:2459 __ast_assert_failed() (0x5f92d8+8D) #1: [0x45c06c] main/astobj2.c:132 INTERNAL_OBJ() #2: [0x45c0c7] main/astobj2.c:152 __ao2_lock() (0x45c09c+2B) #3: [0x7f547f66cb6a] res/res_pjsip.c:3600 endpt_send_request() #4: [0x7f547f66d022] res/res_pjsip.c:3720 ast_sip_send_out_of_dialog_request() (0x7f547f66ce71+1B1) #5: [0x7f547f670ffb] res_pjsip/pjsip_options.c:444 qualify_contact() #6: [0x7f547f671155] res_pjsip/pjsip_options.c:519 qualify_contact_task() #7: [0x5e0e83] main/taskprocessor.c:969 ast_taskprocessor_execute() (0x5e0d73+110) #8: [0x5e9942] main/threadpool.c:1322 execute_tasks() #9: [0x5e0e83] main/taskprocessor.c:969 ast_taskprocessor_execute() (0x5e0d73+110) #10: [0x5e7c6f] main/threadpool.c:351 threadpool_execute() #11: [0x5e92c6] main/threadpool.c:1105 worker_active() #12: [0x5e907f] main/threadpool.c:1025 worker_start() #13: [0x5f6477] main/utils.c:1235 dummy_start() Pbx*CLI> Disconnected from Asterisk server Asterisk cleanly ending (0). Executing last minute cleanups root@Pbx:/tmp# {quote} | ||||
Comments: | By: Asterisk Team (asteriskteam) 2016-12-27 13:02:37.155-0600 Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process]. By: Mark Michelson (mmichelson) 2016-12-28 12:49:21.521-0600 Linking this with ASTERISK-26675 and ASTERISK-26669 since they may originate from the same issue. By: Mark Michelson (mmichelson) 2016-12-28 13:04:29.112-0600 We require additional debug to continue with triage of your issue. Please follow the instructions on the wiki [1] for how to collect debugging information from Asterisk. For expediency, where possible, attach the debug with a '.txt' file extension so that the debug will be usable for further analysis. Thanks! [1] https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information The backtrace is incredibly helpful, but a full debug will provide us with more context for the crash, and we might be able to reproduce or solve the issue better. By: Jacek Konieczny (jkonieczny) 2017-01-10 02:47:26.872-0600 I have experienced similar crashes due to a misconfigured contact hostname or broken nameservers. The last one: {noformat} # coredumpctl gdb -1 PID: 8722 (asterisk) UID: 188 (asterisk) GID: 188 (asterisk) Signal: 11 (SEGV) Timestamp: Tue 2017-01-10 01:29:19 CET (8h ago) Command Line: /usr/sbin/asterisk -U asterisk -G asterisk -g -f Executable: /usr/sbin/asterisk Control Group: / Slice: -.slice Boot ID: ec363aa2f81a46d998f3dec80b9b10f3 Machine ID: 1b7ba458f42347e897c9190248b35978 Hostname: x12c-4puntnuldemo Coredump: /var/lib/systemd/coredump/core.asterisk.188.ec363aa2f81a46d998f3dec80b9b10f3.8722.1484008159000000.lz4 Message: Process 8722 (asterisk) of user 188 dumped core. Stack trace of thread 8754: #0 0x00000000f63b06e1 cancel (libpj.so.2) #1 0x00000000f63b0d2b cancel_timer (libpj.so.2) #2 0x00000000f635db90 endpt_send_request (res_pjsip.so) #3 0x00000000f6360a43 ast_sip_send_out_of_dialog_request (res_pjsip.so) #4 0x00000000f636460c qualify_contact (res_pjsip.so) #5 0x00000000f63648eb qualify_contact_task (res_pjsip.so) #6 0x00000000081d741c ast_taskprocessor_execute (asterisk) #7 0x00000000081de52f execute_tasks (asterisk) #8 0x00000000081d741c ast_taskprocessor_execute (asterisk) #9 0x00000000081dde9f threadpool_execute (asterisk) #10 0x00000000081e6910 dummy_start (asterisk) #11 0x00000000f70e5220 start_thread (libpthread.so.0) #12 0x00000000f6f885ee __clone (libc.so.6) [...] Program terminated with signal SIGSEGV, Segmentation fault. #0 cancel (ht=ht@entry=0x866280c, entry=entry@entry=0x6478302b, flags=flags@entry=7) at ../src/pj/timer.c:331 331 if (entry->_timer_id < 0 || (pj_size_t)entry->_timer_id > ht->max_size) { $1 = (pj_timer_entry *) 0x6478302b (gdb) print entry->_timer_id Cannot access memory at address 0x64783037 (gdb) up #1 0xf63b0d2b in cancel_timer (id_val=id_val@entry=0, flags=6, entry=0x6478302b, ht=0x866280c) at ../src/pj/timer.c:581 581 count = cancel(ht, entry, flags | F_DONT_CALL); (gdb) print *entry Cannot access memory at address 0x6478302b (gdb) up #2 pj_timer_heap_cancel_if_active (ht=0x866280c, entry=entry@entry=0x6478302b, id_val=id_val@entry=0) at ../src/pj/timer.c:605 605 return cancel_timer(ht, entry, F_SET_ID | F_DONT_ASSERT, id_val); (gdb) up #3 0xf635db90 in endpt_send_request (endpoint=endpoint@entry=0x89bc080, tdata=tdata@entry=0xf3b3e91c, timeout=timeout@entry=3000, token=token@entry=0x8979600, cb=0xf635dc09 <send_request_cb>) at res_pjsip.c:3600 3600 timers_cancelled = pj_timer_heap_cancel_if_active( (gdb) print req_wrapper $2 = (struct send_request_wrapper *) 0x89a64f0 (gdb) print *req_wrapper $3 = { token = 0x726f7373, callback = 0x6578655f, cb_called = 1702131043, timeout_timer = 0x6478302b, timeout = 1528834402, tdata = 0x31387830 } (gdb) print *req_wrapper->timeout_timer Cannot access memory at address 0x6478302b {noformat} And the log fragment: {noformat} Jan 10 01:29:19 x12c-4puntnuldemo asterisk[8722]: ERROR[8754]: res_pjsip.c:3591 in endpt_send_request: Error 320053 'DNS "Name Error" (PJLIB_UTIL_EDNS_NXDOMAIN)' sending OPTIONS request to endpoint GSMINBOUND Jan 10 01:29:19 x12c-4puntnuldemo asterisk[8722]: [Jan 10 01:29:19] ERROR[8754]: res_pjsip.c:3591 endpt_send_request: Error 320053 'DNS "Name Error" (PJLIB_UTIL_EDNS_NXDOMAIN)' sending OPTIONS request to endpoint GSMINBOUND Jan 10 01:29:19 x12c-4puntnuldemo asterisk[8722]: FRACK!, Failed assertion bad magic number 0x0 for object 0x89a64f0 (0) at line 131 in INTERNAL_OBJ of astobj2.c Jan 10 01:29:19 x12c-4puntnuldemo asterisk[8722]: ERROR[8754]: astobj2.c:131 in INTERNAL_OBJ: FRACK!, Failed assertion bad magic number 0x0 for object 0x89a64f0 (0) Jan 10 01:29:19 x12c-4puntnuldemo asterisk[8722]: [Jan 10 01:29:19] ERROR[8754]: astobj2.c:131 INTERNAL_OBJ: FRACK!, Failed assertion bad magic number 0x0 for object 0x89a64f0 (0) {noformat} There are more 'DNS "Name Error"' and 'FRACK!' in the logs, it does not end with an Asterisk crash every time. By: George Joseph (gjoseph) 2017-01-31 10:24:45.336-0600 I can reproduce this at will. Simply create an aor with a bad hostname and either wait for it to qualify or do a 'pjsip qualify' from the CLI. May take 2 tries. An outbound call to the aor seems to fail gracefully so it may have to do with an out-of-dialog request. . By: George Joseph (gjoseph) 2017-01-31 11:52:21.521-0600 patch up courtesy of [~jcolp] https://gerrit.asterisk.org/4846 By: Friendly Automation (friendly-automation) 2017-02-02 09:36:28.722-0600 Change 4847 merged by Joshua Colp: res_pjsip: Handle invocation of callback on outgoing request when error occurs. [https://gerrit.asterisk.org/4847|https://gerrit.asterisk.org/4847] By: Friendly Automation (friendly-automation) 2017-02-02 10:45:55.948-0600 Change 4848 merged by zuul: res_pjsip: Handle invocation of callback on outgoing request when error occurs. [https://gerrit.asterisk.org/4848|https://gerrit.asterisk.org/4848] By: Friendly Automation (friendly-automation) 2017-02-02 10:50:55.463-0600 Change 4846 merged by Joshua Colp: res_pjsip: Handle invocation of callback on outgoing request when error occurs. [https://gerrit.asterisk.org/4846|https://gerrit.asterisk.org/4846] By: Friendly Automation (friendly-automation) 2017-02-02 11:52:20.447-0600 Change 4868 merged by zuul: res_pjsip: Handle invocation of callback on outgoing request when error occurs. [https://gerrit.asterisk.org/4868|https://gerrit.asterisk.org/4868] |