| Summary: | ASTERISK-29839: res_pjsip: Failover occurs after CANCEL | ||
| Reporter: | Jonas Swiatek (jonasswiatek) | Labels: | |
| Date Opened: | 2022-01-06 07:08:12.000-0600 | Date Closed: | |
| Priority: | Minor | Regression? | |
| Status: | Open/New | Components: | Resources/res_pjsip | 
| Versions: | 16.24.0 | Frequency of Occurrence | Constant | 
| Related Issues: | |||
| Environment: | Amazon Linux 2 | Attachments: | (0) bla.pcap (1) debug_log_dnssrv | 
| Description: | Asterisk is configured with an AOR like so:
[registrar]
type=aor
contact=somesrvrecord.example.com

somesrvrecord.example.com resolves to multiple IP addresses. The issue I'm seeing is that Asterisk will perform a failover even after it has cancelled an outbound call. For instance:
1. Asterisk sends a SIP INVITE.
2. Asterisk sends a SIP CANCEL to cancel that INVITE.
3. If Asterisk receives anything but a 200 response to the CANCEL followed by a 487 Request Terminated, it will STILL fail over by sending the same INVITE to the next IP address in the SRV record set.

This obviously creates a ghost call, and if that new INVITE is answered, Asterisk immediately generates a BYE to end that call, which I suppose is fine, but it shouldn't be failing over to the next record under these conditions at all. This also happens if it doesn't receive the 487 Request Terminated within 10 seconds.

Generally, Asterisk is pretty aggressive with its failovers here. Almost anything but a 404, 486 or 480 will cause a failover. That could be a bug, but at the very least it would be nice if Asterisk could be configured to fail over only on a 500 response or a timeout on the initial SIP INVITE. | ||
| Comments: | By: Asterisk Team (asteriskteam) 2022-01-06 07:08:13.685-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

By: Joshua C. Colp (jcolp) 2022-01-06 07:15:23.923-0600

Thanks for the report and debug. However, we also need protocol-specific debug captured at the time of the issue. Please include the following:
* Asterisk log files generated using the instructions on the Asterisk wiki [1], with the appropriate protocol debug options enabled, e.g. 'pjsip set logger on' if the issue involves the chan_pjsip channel driver.
* Configuration information for the relevant channel driver, e.g. pjsip.conf.
* A Wireshark-compatible packet capture, captured at the same time as the Asterisk log output.

[1] https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information

By: Joshua C. Colp (jcolp) 2022-01-06 07:17:29.953-0600

Also, from a failover perspective, our code restricts it to 408 and 503 [1][2].

[1] https://github.com/asterisk/asterisk/blob/master/res/res_pjsip.c#L4952
[2] https://github.com/asterisk/asterisk/blob/master/res/res_pjsip_session.c#L4577

By: Jonas Swiatek (jonasswiatek) 2022-01-06 07:39:59.243-0600

Gotcha, I'll get those generated. I know this is an annoying ask, but can you accept them from Asterisk version 16.11.1? I swear to a variety of deities that this also happens with the latest version. I tested it a few weeks ago, but tore down that instance. I can set up a new one if it's necessary, but I'd just ask in the interest of time and effort ;)

By: Joshua C. Colp (jcolp) 2022-01-06 07:44:35.036-0600

Ideally the logs would be from the latest supported version, as debug output may have changed to include additional details (it actually has in some areas).

By: Jonas Swiatek (jonasswiatek) 2022-01-07 06:11:33.273-0600

Relevant parties in this pcap:
Asterisk Server: 172.28.32.24
Kamailio Server 1: 172.28.21.119
Kamailio Server 2: 172.28.20.196

The relevant messages are the INVITEs going from Asterisk to the two Kamailio servers. The first INVITE, from Asterisk to Kamailio 2, is fine, but as seen in the trace, Kamailio doesn't return its 487 Request Terminated response immediately, and 10 seconds in, AFTER Asterisk has cancelled, it sends the same INVITE to Kamailio 1, which isn't supposed to happen.

By: Jonas Swiatek (jonasswiatek) 2022-01-07 06:12:01.522-0600

Debug output from Asterisk

By: Jonas Swiatek (jonasswiatek) 2022-01-07 07:19:13.416-0600

A feature request: add a configuration option that causes PJSIP to never fail over once it has received any provisional response in the 100-199 range. In our setup, we have a set of servers further downstream (Kamailio) that handle failover. What I really want is for Asterisk to fail over only if it completely fails to establish any sort of dialog with the first server downstream of it. | ||
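
The behaviour requested in this issue can be summarised as a small decision function. The sketch below is illustrative only and is not Asterisk or PJSIP code: `InviteState`, `should_failover` and the `no_failover_after_provisional` option are hypothetical names invented for this example. It assumes the 408/503 restriction mentioned in the comments, assumes a local transaction timeout is reported as a 408, and adds the two checks the reporter asks for (no failover after a CANCEL, and optionally no failover after any 1xx response).

```python
from dataclasses import dataclass

# Status codes that may trigger failover, per the maintainer's comment (408 and 503).
FAILOVER_CODES = frozenset({408, 503})


@dataclass
class InviteState:
    """Hypothetical per-INVITE state; not an Asterisk or PJSIP structure."""
    cancelled: bool = False        # a CANCEL has already been sent for this INVITE
    got_provisional: bool = False  # at least one 1xx response was received
    remaining_targets: int = 0     # SRV targets that have not been tried yet


def should_failover(state, final_status, no_failover_after_provisional=False):
    """Return True if the INVITE should be retried on the next SRV target.

    final_status is the final response code for the INVITE; a local
    transaction timeout is assumed to be reported as 408.
    """
    if state.remaining_targets <= 0:
        return False  # nothing left to fail over to
    if state.cancelled:
        return False  # the caller gave up; retrying would create a ghost call
    if no_failover_after_provisional and state.got_provisional:
        return False  # the first server answered; leave failover to it
    return final_status in FAILOVER_CODES
```

With these rules, the scenario from the packet capture (CANCEL sent, no 487 within the timeout) does not produce a second INVITE, because the `cancelled` check runs before the status code is considered.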