Summary: | ASTERISK-26699: res_pjsip: Assertion when sending OPTIONS request to endpoint | ||||||
Reporter: | Ross Beer (rossbeer) | Labels: | |||||
Date Opened: | 2017-01-06 04:11:14.000-0600 | Date Closed: | 2017-02-02 09:36:26.000-0600 | ||||
Priority: | Major | Regression? | |||||
Status: | Closed/Complete | Components: | Resources/res_pjsip | ||||
Versions: | 13.13.1 | Frequency of Occurrence | Occasional | ||||
Related Issues: |
| ||||||
Environment: | Fedora 23, PJSIP Trunk 2.5.5 (Commit 5510) or Bundled PJSIP | Attachments: | ( 0) assertions-1.txt ( 1) assertions-2.txt ( 2) backtrace_20170126_clean.txt ( 3) backtrace_20170131_clean.txt ( 4) hpbx06_backtrace_options_clean.txt ( 5) syslog.txt | ||||
Description: | Every so often an asserting happens when sending OPTIONS pings to endpoints.
The follwoing snippit shows the asterisk log output: {noformat} [2017-01-06 08:44:47] ERROR[12271]: res_pjsip.c:3613 endpt_send_request: Error 120001 'Operation not permitted' sending OPTIONS request to endpoint offnetVo xboneIA02 [2017-01-06 08:44:47] ERROR[12271]: astobj2.c:131 INTERNAL_OBJ: FRACK!, Failed assertion bad magic number 0x0 for object 0x7f2f48fbe190 (0) Got 19 backtrace records #0: [0x606b37] main/utils.c:2471 __ast_assert_failed() (0x606aaa+8D) #1: [0x45dea2] main/astobj2.c:132 INTERNAL_OBJ() #2: [0x45defd] main/astobj2.c:152 __ao2_lock() (0x45ded2+2B) #3: [0x7f2f20319189] res/res_pjsip.c:3622 endpt_send_request() #4: [0x7f2f2031b838] res/res_pjsip.c:3742 ast_sip_send_out_of_dialog_request() (0x7f2f2031b6d0+168) #5: [0x7f2f2031eb5a] res_pjsip/pjsip_options.c:444 qualify_contact() {noformat} | ||||||
Comments: | By: Asterisk Team (asteriskteam) 2017-01-06 04:11:16.142-0600 Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process]. By: Rusty Newton (rnewton) 2017-01-06 20:00:00.562-0600 If we can get some more of the debug log including the SIP trace of the OPTIONS that would help us out. Thanks! By: Ross Beer (rossbeer) 2017-01-09 05:21:56.154-0600 Unfortunately, I can't reproduce on the fly, the issue happens randomly so I can't keep a SIP trace running due to the vast amount of traffic By: Rusty Newton (rnewton) 2017-01-10 14:01:38.817-0600 Okay, opening this one up as triage from [~jcolp] determined we may be able to investigate with what we have here. Obviously any further data, debug or SIP trace would be helpful if it turns out you can get it. The more the better, the faster a resolution will be found. Thanks! By: Ross Beer (rossbeer) 2017-01-13 10:15:00.452-0600 Just had another crash, the following log was shown on the CLI: {noformat} [2017-01-12 15:39:28] ERROR[182933]: res_pjsip.c:3614 endpt_send_request: Error 120001 'Operation not permitted' sending OPTIONS request to endpoint [2017-01-12 15:39:28] ERROR[182933]: astobj2.c:131 INTERNAL_OBJ: FRACK!, Failed assertion bad magic number 0x0 for object 0x7f0ac407fc00 (0) Got 19 backtrace records #0: [0x606dba] main/utils.c:2471 __ast_assert_failed() (0x606d2d+8D) #1: [0x45e0be] main/astobj2.c:132 INTERNAL_OBJ() #2: [0x45e119] main/astobj2.c:152 __ao2_lock() (0x45e0ee+2B) #3: [0x7f0af4ea7209] res/res_pjsip.c:3623 endpt_send_request() {noformat} By: Ross Beer (rossbeer) 2017-01-13 10:39:08.209-0600 This issue is happening across multiple asterisk instances. Back trace shows: {noformat} Program terminated with signal SIGSEGV, Segmentation fault. #0 pj_atomic_dec_and_get (atomic_var=0x30) at ../src/pj/os_core_unix.c:962 962 pj_mutex_lock( atomic_var->mutex ); [Current thread is 1 (Thread 0x7f0a797e0700 (LWP 182933))] #0 0x00007f0b1cab0279 in pj_atomic_dec_and_get (atomic_var=0x30) at ../src/pj/os_core_unix.c:962 new_value = <optimized out> #1 0x00007f0b1cab583d in pj_grp_lock_dec_ref (glock=0x7f0ac4753d00) at ../src/pj/lock.c:554 cnt = <optimized out> #2 0x00007f0b1cab583d in pj_grp_lock_dec_ref (glock=0x7f0ac4753d00) at ../src/pj/lock.c:631 #3 0x00007f0b1cabf790 in cancel_timer (ht=0x155e500, entry=0x7f0ac4000078, flags=flags@entry=6, id_val=id_val@entry=0) at ../src/pj/timer.c:588 grp_lock = <optimized out> #4 0x00007f0b1cabfa3c in pj_timer_heap_cancel_if_active (id_val=id_val@entry=0, flags=6, entry=entry@entry=0x7f0ac4000078, ht=<optimized out>) at ../src/pj/timer.c:606 #5 0x00007f0b1cabfa3c in pj_timer_heap_cancel_if_active (ht=<optimized out>, entry=entry@entry=0x7f0ac4000078, id_val=id_val@entry=0) at ../src/pj/timer.c:605 #6 0x00007f0af4ea7222 in endpt_send_request (endpoint=endpoint@entry=0x23fddc8, tdata=tdata@entry=0x7f0aac3d78e8, timeout=timeout@entry=3000, token=token@entry=0x7f0ac430c978, cb=0x7f0af4ea7290 <send_request_cb>) at res_pjsip.c:3623 timers_cancelled = <optimized out> errmsg = "Operation not permitted\000hK2\002\000\000\000\000\210\206=\254\n\177\000\000@x=\254\n\177\000\000\004\000\000\000\005\000\000\000PK2\002\000\000\000\000\360\372}y\n\177\000\000r\352E\000\000\000\000" req_wrapper = 0x7f0ac407fc00 ret_val = 120001 endpt = 0x155e218 selector = {type = PJSIP_TPSELECTOR_TRANSPORT, u = {transport = 0x2373548, listener = 0x2373548, ptr = 0x2373548}} #7 0x00007f0af4ea98c8 in ast_sip_send_out_of_dialog_request (tdata=0x7f0aac3d78e8, endpoint=endpoint@entry=0x23fddc8, timeout=3000, token=token@entry=0x2324b68, callback=callback@entry=0x7f0af4eacec0 <qualify_contact_cb>) at res_pjsip.c:3743 supplement = 0x0 contact = 0x2324b68 __PRETTY_FUNCTION__ = "ast_sip_send_out_of_dialog_request" #8 0x00007f0af4eacbea in qualify_contact (endpoint=endpoint@entry=0x0, contact=contact@entry=0x2324b68) at res_pjsip/pjsip_options.c:444 tdata = 0x7f0aac3d78e8 endpoint_local = 0x23fddc8 __PRETTY_FUNCTION__ = "qualify_contact" #9 0x00007f0af4eace63 in qualify_contact_task (obj=0x2324b68) at res_pjsip/pjsip_options.c:519 contact = 0x2324b68 res = <optimized out> #10 0x00000000005ee6fc in ast_taskprocessor_execute (tps=0x155cfb0) at taskprocessor.c:967 local = {local_data = 0x7f0a797e0700, data = 0x5f5e7f <ast_threadstorage_set_ptr+60>} t = 0x7f0aa80126b0 size = 22400944 __PRETTY_FUNCTION__ = "ast_taskprocessor_execute" #11 0x00000000005f82b1 in execute_tasks (data=0x155cfb0) at threadpool.c:1322 tps = 0x155cfb0 #12 0x00000000005ee6fc in ast_taskprocessor_execute (tps=0x155c3c0) at taskprocessor.c:967 local = {local_data = 0x7f0a797dfc80, data = 0x155aa90} t = 0x7f0aa8005f80 size = 22391496 __PRETTY_FUNCTION__ = "ast_taskprocessor_execute" #13 0x00000000005f6566 in threadpool_execute (pool=0x155aae0) at threadpool.c:351 __PRETTY_FUNCTION__ = "threadpool_execute" #14 0x00000000005f7c1d in worker_active (worker=0x7f0ae40061d0) at threadpool.c:1105 alive = 0 #15 0x00000000005f79d5 in worker_start (arg=0x7f0ae40061d0) at threadpool.c:1024 worker = 0x7f0ae40061d0 saved_state = (DEAD | unknown: 32520) __PRETTY_FUNCTION__ = "worker_start" #16 0x0000000000603e98 in dummy_start (data=0x7f0ae40064d0) at utils.c:1235 __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {0, -2806034405047432239, 139686362798719, 139682964702976, 507904, 507904, -2806034405072598063, 2817569640175733713}, __mask_was_saved = 0}}, __pad = {0x7f0a797dfdf0, 0x0, 0x1, 0x7f0bf3c326e8 <__pthread_keys+1032>}} __cancel_routine = 0x4527db <ast_unregister_thread> __cancel_arg = 0x7f0a797e0700 __not_first_call = 0 ret = 0x7f0bf30118d8 a = {start_routine = 0x5f794e <worker_start>, data = 0x7f0ae40061d0, name = 0x7f0ae40020e0 "worker_start started at [ 1079] threadpool.c worker_thread_start()"} #17 0x00007f0bf3a2061a in start_thread () at /lib64/libpthread.so.0 #18 0x00007f0bf2d5c5fd in clone () at /lib64/libc.so.6 {noformat} By: Dmitriy Serov (Demon) 2017-01-18 13:57:56.551-0600 Installing asterisk 13.13.1 on ubuntu server with bundled pjproject. Asterisk segfaults a lot of time (attachment syslog.txt) syslog.txt - grep from syslog assertions-1.txt, assertions-2.txt - cut from asterisk logs. FRACK was after each "res_pjsip.c: Error 320047 'No answer record in the DNS response (PJLIB_UTIL_EDNSNOANSWERREC)' sending OPTIONS request" (why it's ERROR?) but does not after: res_pjsip.c: Error 171005 'Missing route set (for tel: URI) (PJSIP_ENOROUTESET)' sending OPTIONS request to endpoint I tried compile and run without optimizations and better trace... * dosn't start at all. Hangs on startup. Systemd restarts it. By: Ross Beer (rossbeer) 2017-01-26 10:23:59.179-0600 Just had another crash, using: Asterisk GIT-13-13.12.2-250-gd30bef1 PJSUIP Trunk 2.5.5 (Commit 5510) Please see attached backtrace. By: Ross Beer (rossbeer) 2017-01-31 08:10:22.263-0600 This issue is still occurring, the version in use now is: Asterisk GIT-13-13.12.2-311-g56e925f PJSIP Bundled Please see attached latest backtrace. By: Friendly Automation (friendly-automation) 2017-02-02 09:36:27.036-0600 Change 4847 merged by Joshua Colp: res_pjsip: Handle invocation of callback on outgoing request when error occurs. [https://gerrit.asterisk.org/4847|https://gerrit.asterisk.org/4847] By: Friendly Automation (friendly-automation) 2017-02-02 10:45:54.438-0600 Change 4848 merged by zuul: res_pjsip: Handle invocation of callback on outgoing request when error occurs. [https://gerrit.asterisk.org/4848|https://gerrit.asterisk.org/4848] By: Friendly Automation (friendly-automation) 2017-02-02 10:50:53.733-0600 Change 4846 merged by Joshua Colp: res_pjsip: Handle invocation of callback on outgoing request when error occurs. [https://gerrit.asterisk.org/4846|https://gerrit.asterisk.org/4846] By: Friendly Automation (friendly-automation) 2017-02-02 11:52:18.685-0600 Change 4868 merged by zuul: res_pjsip: Handle invocation of callback on outgoing request when error occurs. [https://gerrit.asterisk.org/4868|https://gerrit.asterisk.org/4868] |