Summary: | ASTERISK-30055: res_rtp_asterisk: High CPU load when TURN server unreachable | ||
Reporter: | Vitezslav Novy (vnovy) | Labels: | webrtc |
Date Opened: | 2022-05-11 02:04:19 | Date Closed: | |
Priority: | Minor | Regression? | |
Status: | Open/New | Components: | Resources/res_rtp_asterisk |
Versions: | 18.10.1 | Frequency of Occurrence | Constant |
Related Issues: | |||
Environment: | Linux, Debian 11 bullseye | Attachments: | |
Description: | We see high CPU load caused by Asterisk when the TURN server is unreachable and a WebRTC phone tries to make a call through chan_sip.
I have identified the thread causing the high CPU load:
{noformat}
#0 futex_abstimed_wait_cancelable (private=0x0, abstime=0x7f414c7ad880, expected=0x0, futex_word=0x7f41502a99cc) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1 __pthread_cond_wait_common (abstime=0x7f414c7ad880, mutex=0x7f41502a3470, cond=0x7f41502a99a0) at pthread_cond_wait.c:539
#2 __pthread_cond_timedwait (cond=cond@entry=0x7f41502a99a0, mutex=mutex@entry=0x7f41502a3470, abstime=abstime@entry=0x7f414c7ad880) at pthread_cond_wait.c:667
#3 0x0000556f29295da9 in __ast_cond_timedwait (filename=filename@entry=0x7f417c0c4010 "res_rtp_asterisk.c", lineno=lineno@entry=0x696, func=func@entry=0x7f417c0c8930 <__PRETTY_FUNCTION__.41691> "ast_rtp_ice_turn_request", cond_name=cond_name@entry=0x7f417c0c4075 "&rtp->cond", mutex_name=mutex_name@entry=0x7f417c0c5ca8 "ao2_object_get_lockaddr(instance)", cond=cond@entry=0x7f41502a99a0, t=0x7f41502a3470, abstime=abstime@entry=0x7f414c7ad880) at lock.c:653
#4 0x00007f417c0b21de in ast_rtp_ice_turn_request (instance=instance@entry=0x7f41502a34b0, component=component@entry=AST_RTP_ICE_COMPONENT_RTCP, transport=transport@entry=AST_TRANSPORT_TCP, server=0x556f2b8cbab0 "10.144.31.1", port=0xd96, username=0x556f2b8cbac0 "<redacted>", password=0x556f2b8cbad0 "<redacted>") at res_rtp_asterisk.c:1686
#5 0x00007f417c0b4e81 in rtp_add_candidates_to_ice (instance=instance@entry=0x7f41502a34b0, rtp=rtp@entry=0x7f41502a7550, addr=<optimized out>, port=<optimized out>, component=component@entry=0x2, transport=transport@entry=0x1) at ./third-party/pjproject/source/pjlib/include/pj/string.h:284
#6 0x00007f417c0b584a in ast_rtp_prop_set (instance=0x7f41502a34b0, property=<optimized out>, value=<optimized out>) at res_rtp_asterisk.c:8265
#7 0x0000556f292d32e0 in ast_rtp_instance_set_prop (instance=0x7f41502a34b0, property=property@entry=AST_RTP_PROPERTY_RTCP, value=value@entry=0x1) at rtp_engine.c:711
#8 0x00007f417da8f6d8 in dialog_initialize_rtp (dialog=dialog@entry=0x7f415000a240) at chan_sip.c:6086
#9 0x00007f417daedd91 in dialog_initialize_rtp (dialog=0x7f415000a240) at chan_sip.c:19464
#10 check_peer_ok (p=p@entry=0x7f415000a240, of=<optimized out>, req=req@entry=0x7f414c7b0280, sipmethod=sipmethod@entry=0x5, addr=addr@entry=0x7f414c7b01f0, authpeer=authpeer@entry=0x7f414c7ae528, reliable=<optimized out>, uri2=<optimized out>, uri2@entry=0x7f414c7ae2f0 "sip:*55@default", calleridname=0x7f414c7ae370 "XiVO Assistant") at chan_sip.c:19464
#11 0x00007f417daef628 in check_user_full (p=p@entry=0x7f415000a240, req=req@entry=0x7f414c7b0280, sipmethod=sipmethod@entry=0x5, uri=uri@entry=0x7f41500091ff "sip:*55@default", reliable=reliable@entry=XMIT_RELIABLE, addr=addr@entry=0x7f414c7b01f0, authpeer=<optimized out>, authpeer@entry=0x7f414c7ae528) at chan_sip.c:19587
#12 0x00007f417daf1b91 in handle_request_invite (p=p@entry=0x7f415000a240, req=req@entry=0x7f414c7b0280, addr=addr@entry=0x7f414c7b01f0, seqno=<optimized out>, recount=recount@entry=0x7f414c7b01a0, e=e@entry=0x7f41500091ff "sip:*55@default", nounlock=<optimized out>, nounlock@entry=0x7f414c7b01a4) at chan_sip.c:26700
#13 0x00007f417daf902a in handle_incoming (p=0x7f415000a240, req=0x7f414c7b0280, addr=<optimized out>, recount=<optimized out>, nounlock=<optimized out>) at chan_sip.c:29267
#14 0x00007f417dafb468 in handle_request_do (req=req@entry=0x7f414c7b0280, addr=addr@entry=0x7f414c7b01f0) at chan_sip.c:29475
#15 0x00007f417dafcd5a in sipsock_read (id=<optimized out>, fd=<optimized out>, events=<optimized out>, ignore=<optimized out>) at chan_sip.c:29406
#16 0x0000556f2928aa48 in ast_io_wait (ioc=0x556f2bbe5040, howlong=<optimized out>) at io.c:297
#17 0x00007f417dad2aeb in do_monitor (data=data@entry=0x0) at chan_sip.c:30053
#18 0x0000556f2933a485 in dummy_start (data=<optimized out>) at utils.c:1299
#19 0x00007f419e89ffa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#20 0x00007f419e476eff in inotify_add_watch () at ../sysdeps/unix/syscall-template.S:78
#21 0x0000000000000000 in ?? ()
{noformat}
It seems the timespec parameter of ast_cond_timedwait is used incorrectly: it is the absolute time at which to wake up, so it must be updated on every loop iteration. Currently it is not updated, so after the first iteration the timespec is in the past and ast_cond_timedwait no longer sleeps.

Moreover, once I fixed the timespec parameter and the thread no longer used all the CPU, I realized that waiting for TURN completely blocks SIP traffic processing. This is true for chan_sip; I do not know about chan_pjsip. As an acceptable workaround I decided to wait only 2 seconds for the TURN server, so as not to block the chan_sip thread, but this may not be the correct solution.

The same incorrect timespec usage with ast_cond_timedwait occurs several times in res/res_rtp_asterisk.c. | ||
Comments: | By: Asterisk Team (asteriskteam) 2022-05-11 02:04:20.143-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

By: Joshua C. Colp (jcolp) 2022-05-11 03:55:58.398-0500

Yes, there does seem to be an issue. In general both STUN and TURN support are blocking operations, so if STUN is down it will also block for a period of time. Is there a particular reason you require TURN support in the first place? Does Asterisk not have its ports forwarded?

By: Jirka Hlavacek (jirka) 2022-05-11 06:41:31.142-0500

Thanks for your answer. We force TURN because we want to sanitize the RTP flow before it reaches Asterisk. Asterisk ports are not forwarded, and Asterisk is not reachable from the outside world.

By: Vitezslav Novy (vnovy) 2022-05-11 09:19:19.714-0500

Our TURN usage is described in Jirka Hlavacek's comment above. |