Summary: | ASTERISK-30454: res_pjsip_refer: NOTIFY sipfrag may terminate early depending on conditions | ||
Reporter: | David Middleton (dave-midd) | Labels: | |
Date Opened: | 2023-03-06 11:54:45.000-0600 | Date Closed: | |
Priority: | Minor | Regression? | |
Status: | Open/New | Components: | Resources/res_pjsip_refer |
Versions: | 18.14.0 | Frequency of Occurrence | Constant |
Related Issues: | |||
Environment: | Attachments: | ( 0) 2023030602_ASTERISK-30454_anon.pcap ( 1) 2023030602_debug_log_ASTERISK-30454.txt ( 2) 20230314_debug_log_ASTERISK-30454.txt ( 3) 20230314_pstn-teams_at_teams-500ms_delay_ok_digium_ANON_ASTERISK-30454.pcap ( 4) msteams_attended-transfer_teams-teams_ASTERISK-30454.png ( 5) msteams_attended-transfer_teams-teams_notify_delay_ASTERISK-30454.png | |
Description: | Party A calls Party B (and call is answered).
Party B puts Party A on hold and blind transfers to Party C. Asterisk sends a SIP NOTIFY with a sipfrag 200 OK (after various NOTIFY with 1xx progress sipfrags) before the transferred call (to Party C) has been answered. The expected behaviour is for the all the NOTIFYs to be sent with sipfrags reflecting the upstream progress of the transferred call. | ||
Comments: | By: Asterisk Team (asteriskteam) 2023-03-06 11:54:48.797-0600 Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process]. Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur. Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/]. By: Joshua C. Colp (jcolp) 2023-03-06 12:02:15.734-0600 We require additional debug to continue with triage of your issue. Please follow the instructions on the wiki [1] for how to collect debugging information from Asterisk. For expediency, where possible, attach the debug with a '.txt' file extension so that the debug will be usable for further analysis. Thanks! [1] https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information Additionally this behavior is not guaranteed, because it can depend on the dialplan as well. By: David Middleton (dave-midd) 2023-03-06 12:02:47.285-0600 PSTN (+447700900000) calling DDI (+441632960000) for exten (319-astpjsip). PJSIP/proxy-00000070 - PSTN -> asterisk PJSIP/proxy-00000071 - asterisk -> exten Exten (319-astpjsip) puts call on hold and blind transfers to PSTN +441632960001 PJSIP/proxy-00000072 - asterisk -> PSTN Frame 16 shows the REFER arriving at asterisk. Before the INVITE to the transferred party is sent upstream, asterisk has already sent: Frame 18 - NOTIFY with Sipfrag SIP/2.0 100 Trying Frame 19 - NOTIFY with Sipfrag SIP/2.0 183 Session Progress Frame 21 - NOTIFY with Sipfrag SIP/2.0 180 Ringing Then asterisk sends the INVITE upstream Frame 22 - INVITE to the transferred destination But after only a '100 Trying' has been received, asterisk sends Frame 25 - NOTIFY with Sipfrag SIP/2.0 200 OK By: David Middleton (dave-midd) 2023-03-06 12:08:33.828-0600 I wasn't sure whether to raise this as a separate issue (please advise), but asterisk is also sending those initial NOTIFYs so close together (e.g frame 18 and 19 are 0.3mS apart) that they often arrive at the destination SIP server out of order and so rejections are frequently seen (e.g. frame 20 SIP/2.0 500 Server Internal Error). It's also not clear to me what/why the NOTIFY with the sipfrags 100, 183 and 180 are being generated, before anything has happened with the upstream INVITE. Thanks. By: Joshua C. Colp (jcolp) 2023-03-06 12:19:02.007-0600 The res_pjsip_refer module attempts to deduce what is happening based on various things happening to the channel. There is no explicit Asterisk interface to receive notifications or to know what is really going on, and things can get even more complicated because your call may not even end up at another device - it may just end up in the dialplan. It's not an exact science, so sometimes it gets it wrong or not as one would expect. I'll leave this issue open in case there are any enhancements to improve it. By: David Middleton (dave-midd) 2023-03-15 12:01:21.268-0500 Hi Joshua, I wanted to provide a bit more context and detail around the problems that the 'early' ('early' as in before the transferee call has been answered) NOTIFY with sigrag 200 OK is causing. The problem is particularly acute when interworking with MS Teams and exagerated by the way that MS handles attended transfers. In fact, they never signal an attended transfer; they always use a blind transfer, but rely on all call legs being routed through their infrastructure and then use INVITEs with replaces to complete the call setup on any external call legs. I've attached a couple of example call flows to help try and visulaise the problem. In 'msteams_attended-transfer_teams-teams_ASTERISK-30454.png' we have a call from asterisk into Teams Exten A. Teams Exten A puts call on hold and initiates an attended transfer to Teams Exten B. Teams Exten A makes a consult call directly to Teams Exten B (and so we don't see the signalling as it's Teams to Teams, dialled using the user name in the Teams address book popup). Once the call is established, Teams exten A hits 'Transfer'. We receive a blind transfer with the refer-to header containing a URI parameter with the Microsoft GUID of the transferee (as well as the referred-by header containing a URI parameter of the transferor). Asterisk sends a new INVITE to Teams Exten B but very shortly after also sends a NOTIFY with a sipgrag 200 OK back to Teams. (I've omitted the NOTIFYs with sipgrags 100, 183 and 180 for brevity.) The problem is that MS assume (due to the sipfrag 200 OK) that the call to the transferee has been answered and start tearing down the original call leg from asterisk to the transferor as well as the consult leg from transferor to transferee. This means that when the INVITE that was initiated by asterisk due to the REFER (and has a replaces header added by MS to replace the consult leg) get's processed, the consult leg has gone and so they reject it with the 404 Not Found (and a reason of 'Replacement call was not found'). (I appreciate that you don't see these INVITEs with replaces in this example, but we do see them if the transfer was to PSTN; MS dictate that all transferred calls go back through them and so we see the INVITE to PSTN forwarded back out to us as a new call, but with a replaces header to replace the consul call that's already set up.) We carried out some tests, first by blocking the NOTIFY from reaching MS. With this set up, the call to the transferee was not rejected and rang the target correctly. We then added a 500mS delay in res_rjsip_refer for any NOTIFY with a sipgrag 200 OK. Transfers were also completed successfully. I've added an example sip trace (20230314_pstn-teams_at_teams-500ms_delay_ok_digium_ANON_ASTERISK-30454.pcap) and debug log (20230314_debug_log_ASTERISK-30454.txt). PSTN (+447700900000) calling DDI (+441632960000) for Teams exten A (319-astpjsip). PJSIP/proxy-00000010 - PSTN -> asterisk PJSIP/proxy-00000011 - asterisk -> Teams Exten A PJSIP/proxy-00000012 - asterisk -> Teams Exten B I've also attached a simplified flow diagram for when the delay was added (msteams_attended-transfer_teams-teams_notify_delay_ASTERISK-30454.png). Obviously w'd rather not run our production asterisk servers with this fixed delay in res_rjsip_refer and it would be far better if the sending of the sipfrag 200 NOTIFY was driven by a more appropiate event, such as the SIP 200 OK response to the transferred call leg. (One other interesting side effect of adding the delay, even a much smaller 20mS one, was that asterisk stopped sending the NOTIFYs with the 183 and 180 sipfrags.) Thanks, David Edit: MS document their behaviour on receipt of the NOTIFY with sipfrag 200 OK - https://learn.microsoft.com/en-us/microsoftteams/troubleshoot/phone-system/direct-routing/issues-with-call-transfers#calls-drop-before-the-transfer-is-completed |