Summary: ASTERISK-30025: PJSIP affected by delays in DNS resolution
Reporter: Nicholas Carr (connectsmart)
Labels:
Date Opened: 2022-04-25 00:10:35
Date Closed: 2022-04-26 04:34:34
Priority: Minor
Regression?
Status: Closed/Complete
Components: pjproject/pjsip
Versions: 16.15.0 16.22.0
Frequency of Occurrence:
Related Issues:
Environment: Hosted - Microsoft Azure (DS2 v2 - 2vCPU, 8GB, STD SSD) CentOS 7.9
Attachments:
Description: Multiple PJSIP trunks become unreachable at random times. Each trunk returns after a random period of time.
While the issue first appeared on 16.22.0, we have been able to reproduce it on 16.15.0 as well.
We tested IP-based and outbound-auth-based trunks; both are affected.
We tested multiple carriers; all appear to be affected.
The PJSIP logger showed registrations and SIP OPTIONS being sent and received without problems. After debugging, we found that the issue appears when qualifying the AOR: the SIP OPTIONS is sent AFTER the qualify has expired. The SIP OPTIONS receives its valid response almost instantly, but still several seconds too late.
As per the community discussion - https://community.asterisk.org/t/pjsip-is-now-unreachable-realtime/92392/5, it was recommended to raise a bug for further investigation.
The current workaround is simply to increase qualify_timeout from its default of 3 to 5 or greater, depending on the delay.
We are unsure whether the delay is in the Asterisk -> DNS server query or in the DNS server -> remote domain DNS query. Happy to help.
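
The timing described above can be sketched as simple arithmetic. This is only an illustrative model, not Asterisk code: the function name `qualify_outcome` and the sample delay values are assumptions, and the only figures taken from the report are the qualify_timeout values of 3 and 5 seconds.

```python
def qualify_outcome(dns_delay_s, options_rtt_s, qualify_timeout_s=3.0):
    """Toy model: the qualify succeeds only if the DNS lookup plus the
    OPTIONS round trip both fit inside the qualify timeout."""
    total_s = dns_delay_s + options_rtt_s
    return "reachable" if total_s <= qualify_timeout_s else "unreachable"


# Hypothetical numbers: a 4 s DNS stall followed by a 50 ms OPTIONS round trip.
print(qualify_outcome(4.0, 0.05))                         # timeout of 3
print(qualify_outcome(4.0, 0.05, qualify_timeout_s=5.0))  # timeout raised to 5
```

Under this model the response arrives quickly once the request is sent, but the lookup has already used up the budget, which would match the observation that the reply is valid yet several seconds too late.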
Comments:

By: Asterisk Team (asteriskteam) 2022-04-25 00:10:37.077-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

By: Joshua C. Colp (jcolp) 2022-04-25 04:30:56.084-0500

Since the issue appears to be DNS you'd need to examine how DNS works on the local system. Is there a local caching server? How is it configured? What upstream DNS servers is it configured to use? Are they responding slowly? Is one down while another works? You could also examine the DNS requests/responses by capturing the packets using tcpdump, for example.

I'm not sure overall there's anything we can do or would want to do in regards to DNS and OPTIONS, because DNS resolution time is included in all the timers across the PJSIP implementation. We'd either have to fundamentally alter that and introduce another timer, or OPTIONS would become special and then may not reflect reality or how other things work. That's not something I think we should do.
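
One way to check whether the OS resolver is the slow part, alongside a packet capture, is to time lookups directly on the Asterisk host. The following is a rough sketch under assumptions: the helper names `time_resolution` and `slow_lookups` are made up for illustration, port 5060 and the 3-second threshold are placeholders, and `socket.getaddrinfo` only approximates whatever lookups Asterisk actually performs (it does not issue SRV or NAPTR queries, for instance).

```python
import socket
import time


def time_resolution(hostname, port=5060, resolver=socket.getaddrinfo):
    """Resolve hostname through the OS resolver; return (succeeded, seconds)."""
    start = time.monotonic()
    try:
        resolver(hostname, port)
        ok = True
    except socket.gaierror:
        ok = False
    return ok, time.monotonic() - start


def slow_lookups(hostname, attempts=5, threshold_s=3.0,
                 resolver=socket.getaddrinfo):
    """Repeat the lookup and collect attempts that failed or took too long."""
    slow = []
    for attempt in range(attempts):
        ok, elapsed = time_resolution(hostname, resolver=resolver)
        if not ok or elapsed >= threshold_s:
            slow.append((attempt, ok, elapsed))
    return slow
```

For example, calling `slow_lookups("sip.example.com")` (a placeholder hostname) from a cron job or a loop and logging any non-empty result would show whether lookups on that host ever approach the qualify timeout. A local caching resolver can hide intermittent upstream delays, so an empty result is suggestive rather than conclusive.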

By: Nicholas Carr (connectsmart) 2022-04-25 23:03:31.456-0500

Thanks Joshua,

That is fair enough and I am happy to review the DNS configuration we have if that provides any helpful insight, but it sounds like it might not be.

If pjsip hands off to the OS for DNS resolution and handling, then I agree it's outside the scope of pjsip anyway.

Should I close this bug off?

By: Joshua C. Colp (jcolp) 2022-04-26 04:34:35.002-0500

The implementation you are using does hand off to the OS for DNS resolution. Since this does seem to be DNS, I'm closing this out.