Summary: | ASTERISK-13679: [patch] Asterisk does not stop retransmission | ||
Reporter: | klaus3000 (klaus3000) | Labels: | |
Date Opened: | 2009-03-02 09:22:45.000-0600 | Date Closed: | 2009-06-24 13:57:13 |
Priority: | Minor | Regression? | No |
Status: | Closed/Complete | Components: | Channels/chan_sip/Interoperability |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) 14584_v2.patch ( 1) 14584.patch ( 2) debugging.patch.1.4.23.txt ( 3) full-trace.txt | |
Description: | Scenario: <pre> client Asterisk ------INV,407,ACK--------> ------INV----------------> -----INV----> <---183------ <-------183--------------- <---180------ <-------180--------------- --------CANCEL-----------> X----487, 200-(loss)---- ---CANCEL retransmission-> <----487, 200------------- ---ACK-------------------> <----487------------------ <----487------------------ <----487------------------ <----487------------------ <----487------------------ </pre> ****** ADDITIONAL INFORMATION ****** I reproduced the scenario using sipp. Further I found out that the problem must be related to the retransmitted CANCEL, as the following scenario is fine: <pre> client Asterisk ------INV,407,ACK--------> ------INV----------------> -----INV----> <---183------ <-------183--------------- <---180------ <-------180--------------- --------CANCEL-----------> X----487, 200-(loss)--- --CANCEL retr. (loss)-X <----487------------------ ---ACK-------------------> </pre> | ||
Comments: | By: klaus3000 (klaus3000) 2009-03-02 10:25:52.000-0600 I think I found the problem: The retransmitted CANCEL triggers again 487. But this is not an immediate replacement of the scheduled 487, but a new one. Thus, after the second CANCEL there are two 487 replies scheduled for retransmissions, and the ACK stops only one of these two. (take a look at the retransmission ID) btw: attached patch helped me find the problem By: Leif Madsen (lmadsen) 2009-06-10 12:11:02 Hey Mark! I know I'm not supposed to assign any issues to you right now, but I thought this one was important enough, and potentially quick enough, to assign to you to review when you have a chance. Thanks! By: Mark Michelson (mmichelson) 2009-06-11 10:47:12 I updated the diagrams inside the description and additional information fields to be wrapped in "pre" tags so that the original spacing is preserved. This makes them more readable. I'm about to take a look at the patch. edit: need to escape angle brackets so that this note makes more sense. edit2: ok, forget the brackets, I'll just use "pre" and hopefully that makes sense. By: Mark Michelson (mmichelson) 2009-06-11 13:51:42 Give 14584.patch a try. Thanks for doing the legwork to find the source of the issue. By: klaus3000 (klaus3000) 2009-06-18 09:33:29 Hi! With this patch Asterisk 1.4.24 crashes on reception of the retransmitted CANCEL: <--- Transmitting (no NAT) to 83.136.32.132:5061 ---> SIP/2.0 180 Ringing Via: SIP/2.0/UDP 83.136.32.132:5061;branch=z9hG4bK-10506-1-4;received=83.136.32.132 From: <sip:klaus@83.136.32.165>;tag=1 To: <sip:01505641636@83.136.32.165:5060>;tag=as5434e245 Call-ID: 1-10506@83.136.32.132 CSeq: 2 INVITE User-Agent: Asterisk PBX Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY Supported: replaces Contact: <sip:01505641636@83.136.32.165> Content-Length: 0 <------------> <--- SIP read from 83.136.32.132:5061 ---> CANCEL sip:01505641636@83.136.32.165:5060 SIP/2.0 Via: SIP/2.0/UDP 83.136.32.132:5061;branch=z9hG4bK-10506-1-8 From: <sip:klaus@83.136.32.165>;tag=1 To: <sip:01505641636@83.136.32.165:5060> Call-ID: 1-10506@83.136.32.132 CSeq: 2 CANCEL Max-Forwards: 70 <-------------> --- (7 headers 0 lines) --- Sending to 83.136.32.132 : 5061 (no NAT) <--- Reliably Transmitting (no NAT) to 83.136.32.132:5061 ---> SIP/2.0 487 Request Terminated Via: SIP/2.0/UDP 83.136.32.132:5061;branch=z9hG4bK-10506-1-4;received=83.136.32.132 From: <sip:klaus@83.136.32.165>;tag=1 To: <sip:01505641636@83.136.32.165:5060>;tag=as5434e245 Call-ID: 1-10506@83.136.32.132 CSeq: 2 INVITE User-Agent: Asterisk PBX Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY Supported: replaces Content-Length: 0 <------------> [Jun 18 16:32:23] NOTICE[23306]: chan_sip.c:2067 __sip_reliable_xmit: Hey, I just set the response code for this packet to 487 <--- Transmitting (no NAT) to 83.136.32.132:5061 ---> SIP/2.0 200 OK Via: SIP/2.0/UDP 83.136.32.132:5061;branch=z9hG4bK-10506-1-8;received=83.136.32.132 From: <sip:klaus@83.136.32.165>;tag=1 To: <sip:01505641636@83.136.32.165:5060>;tag=as5434e245 Call-ID: 1-10506@83.136.32.132 CSeq: 2 CANCEL User-Agent: Asterisk PBX Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY Supported: replaces Content-Length: 0 <------------> -- Hungup 'DAHDI/1-1' == Spawn extension (direct, 01505641636, 9) exited non-zero on 'SIP/klaus-08fb17f8' <--- SIP read from 83.136.32.132:5061 ---> CANCEL sip:01505641636@83.136.32.165:5060 SIP/2.0 Via: SIP/2.0/UDP 83.136.32.132:5061;branch=z9hG4bK-10506-1-12 From: <sip:klaus@83.136.32.165>;tag=1 To: <sip:01505641636@83.136.32.165:5060> Call-ID: 1-10506@83.136.32.132 CSeq: 2 CANCEL Max-Forwards: 70 <-------------> --- (7 headers 0 lines) --- Sending to 83.136.32.132 : 5061 (no NAT) Scheduling destruction of SIP dialog '1-10506@83.136.32.132' in 32000 ms (Method: CANCEL) Segmentation fault (core dumped) #0 0xb61b7cde in handle_request_cancel (p=0x8fb17f8, req=0xb6170050) at chan_sip.c:15366 15366 prev_pkt->next = pkt->next; (gdb) bt #0 0xb61b7cde in handle_request_cancel (p=0x8fb17f8, req=0xb6170050) at chan_sip.c:15366 #1 0xb61bb2e2 in handle_request (p=0x8fb17f8, req=0xb6170050, sin=0xb6170040, recount=0xb6170034, nounlock=0xb6170038) at chan_sip.c:16050 #2 0xb61bc10f in sipsock_read (id=0x8f7d7a0, fd=47, events=1, ignore=0x0) at chan_sip.c:16293 #3 0x080b0b54 in ast_io_wait (ioc=0x8f70d00, howlong=999) at io.c:279 #4 0xb61bcadc in do_monitor (data=0x0) at chan_sip.c:16512 ASTERISK-1 0x0810588d in dummy_start (data=0x8f7d690) at utils.c:856 ASTERISK-2 0xb7f1e4c0 in start_thread () from /lib/i686/cmov/libpthread.so.0 ASTERISK-3 0xb7e306de in clone () from /lib/i686/cmov/libc.so.6 By: Mark Michelson (mmichelson) 2009-06-18 09:37:40 Ah, thanks. I'll work out a new version really soon. It looks like I made a simple mistake here. By: Mark Michelson (mmichelson) 2009-06-18 09:41:41 New patch uploaded (14584_v2.patch). This should correct the problem you had with the previous patch. By: klaus3000 (klaus3000) 2009-06-19 06:54:21 Hi Michael! I tested the new patch in several scenarios and it seems to work fine now. Thanks. By: Digium Subversion (svnbot) 2009-06-22 09:42:56 Repository: asterisk Revision: 202341 U branches/1.4/channels/chan_sip.c ------------------------------------------------------------------------ r202341 | mmichelson | 2009-06-22 09:42:55 -0500 (Mon, 22 Jun 2009) | 26 lines Fix a situation in which Asterisk would not stop retransmitting 487s. If a CANCEL were received by Asterisk, we would send a 487 in response to the original INVITE and a 200 OK for the CANCEL. If there were a network hiccup which caused the 200 OK and the 487 to be lost, then the UA communicating with Asterisk may try to retransmit its CANCEL. Asterisk's response to this used to be to try sending another 487 to the canceled INVITE and another 200 OK to the CANCEL. The problem here is that the originally-sent 487 was sent "reliably" meaning that it will be retransmitted until it is received properly. So when we receive the second CANCEL it is likely that the first batch of 487s we sent is still going strong and reaches the UA. The result was that the second set of 487s would be retransmitted constantly until the maximum number of retries had been reached. The fix for this is that if we receive a second CANCEL for an INVITE, then we cancel the retransmission of the first set of 487s and start a second set. This causes the dialog to be terminated reasonably. (closes issue ASTERISK-13679) Reported by: klaus3000 Patches: 14584_v2.patch uploaded by mmichelson (license 60) Tested by: klaus3000 ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=202341 By: Digium Subversion (svnbot) 2009-06-22 09:58:25 Repository: asterisk Revision: 202343 _U trunk/ U trunk/channels/chan_sip.c ------------------------------------------------------------------------ r202343 | mmichelson | 2009-06-22 09:58:24 -0500 (Mon, 22 Jun 2009) | 36 lines Merged revisions 202341-202342 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r202341 | mmichelson | 2009-06-22 09:42:55 -0500 (Mon, 22 Jun 2009) | 26 lines Fix a situation in which Asterisk would not stop retransmitting 487s. If a CANCEL were received by Asterisk, we would send a 487 in response to the original INVITE and a 200 OK for the CANCEL. If there were a network hiccup which caused the 200 OK and the 487 to be lost, then the UA communicating with Asterisk may try to retransmit its CANCEL. Asterisk's response to this used to be to try sending another 487 to the canceled INVITE and another 200 OK to the CANCEL. The problem here is that the originally-sent 487 was sent "reliably" meaning that it will be retransmitted until it is received properly. So when we receive the second CANCEL it is likely that the first batch of 487s we sent is still going strong and reaches the UA. The result was that the second set of 487s would be retransmitted constantly until the maximum number of retries had been reached. The fix for this is that if we receive a second CANCEL for an INVITE, then we cancel the retransmission of the first set of 487s and start a second set. This causes the dialog to be terminated reasonably. (closes issue ASTERISK-13679) Reported by: klaus3000 Patches: 14584_v2.patch uploaded by mmichelson (license 60) Tested by: klaus3000 ........ r202342 | mmichelson | 2009-06-22 09:44:58 -0500 (Mon, 22 Jun 2009) | 3 lines Remove an extra debug line left from previous commit. ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=202343 By: Digium Subversion (svnbot) 2009-06-22 10:05:01 Repository: asterisk Revision: 202344 _U branches/1.6.0/ U branches/1.6.0/channels/chan_sip.c ------------------------------------------------------------------------ r202344 | mmichelson | 2009-06-22 10:05:01 -0500 (Mon, 22 Jun 2009) | 43 lines Merged revisions 202343 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ................ r202343 | mmichelson | 2009-06-22 09:58:24 -0500 (Mon, 22 Jun 2009) | 36 lines Merged revisions 202341-202342 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r202341 | mmichelson | 2009-06-22 09:42:55 -0500 (Mon, 22 Jun 2009) | 26 lines Fix a situation in which Asterisk would not stop retransmitting 487s. If a CANCEL were received by Asterisk, we would send a 487 in response to the original INVITE and a 200 OK for the CANCEL. If there were a network hiccup which caused the 200 OK and the 487 to be lost, then the UA communicating with Asterisk may try to retransmit its CANCEL. Asterisk's response to this used to be to try sending another 487 to the canceled INVITE and another 200 OK to the CANCEL. The problem here is that the originally-sent 487 was sent "reliably" meaning that it will be retransmitted until it is received properly. So when we receive the second CANCEL it is likely that the first batch of 487s we sent is still going strong and reaches the UA. The result was that the second set of 487s would be retransmitted constantly until the maximum number of retries had been reached. The fix for this is that if we receive a second CANCEL for an INVITE, then we cancel the retransmission of the first set of 487s and start a second set. This causes the dialog to be terminated reasonably. (closes issue ASTERISK-13679) Reported by: klaus3000 Patches: 14584_v2.patch uploaded by mmichelson (license 60) Tested by: klaus3000 ........ r202342 | mmichelson | 2009-06-22 09:44:58 -0500 (Mon, 22 Jun 2009) | 3 lines Remove an extra debug line left from previous commit. ........ ................ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=202344 By: Digium Subversion (svnbot) 2009-06-22 10:10:53 Repository: asterisk Revision: 202345 _U branches/1.6.1/ U branches/1.6.1/channels/chan_sip.c ------------------------------------------------------------------------ r202345 | mmichelson | 2009-06-22 10:10:53 -0500 (Mon, 22 Jun 2009) | 43 lines Merged revisions 202343 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ................ r202343 | mmichelson | 2009-06-22 09:58:24 -0500 (Mon, 22 Jun 2009) | 36 lines Merged revisions 202341-202342 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r202341 | mmichelson | 2009-06-22 09:42:55 -0500 (Mon, 22 Jun 2009) | 26 lines Fix a situation in which Asterisk would not stop retransmitting 487s. If a CANCEL were received by Asterisk, we would send a 487 in response to the original INVITE and a 200 OK for the CANCEL. If there were a network hiccup which caused the 200 OK and the 487 to be lost, then the UA communicating with Asterisk may try to retransmit its CANCEL. Asterisk's response to this used to be to try sending another 487 to the canceled INVITE and another 200 OK to the CANCEL. The problem here is that the originally-sent 487 was sent "reliably" meaning that it will be retransmitted until it is received properly. So when we receive the second CANCEL it is likely that the first batch of 487s we sent is still going strong and reaches the UA. The result was that the second set of 487s would be retransmitted constantly until the maximum number of retries had been reached. The fix for this is that if we receive a second CANCEL for an INVITE, then we cancel the retransmission of the first set of 487s and start a second set. This causes the dialog to be terminated reasonably. (closes issue ASTERISK-13679) Reported by: klaus3000 Patches: 14584_v2.patch uploaded by mmichelson (license 60) Tested by: klaus3000 ........ r202342 | mmichelson | 2009-06-22 09:44:58 -0500 (Mon, 22 Jun 2009) | 3 lines Remove an extra debug line left from previous commit. ........ ................ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=202345 By: Digium Subversion (svnbot) 2009-06-22 10:17:02 Repository: asterisk Revision: 202346 _U branches/1.6.2/ U branches/1.6.2/channels/chan_sip.c ------------------------------------------------------------------------ r202346 | mmichelson | 2009-06-22 10:17:02 -0500 (Mon, 22 Jun 2009) | 43 lines Merged revisions 202343 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ................ r202343 | mmichelson | 2009-06-22 09:58:24 -0500 (Mon, 22 Jun 2009) | 36 lines Merged revisions 202341-202342 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r202341 | mmichelson | 2009-06-22 09:42:55 -0500 (Mon, 22 Jun 2009) | 26 lines Fix a situation in which Asterisk would not stop retransmitting 487s. If a CANCEL were received by Asterisk, we would send a 487 in response to the original INVITE and a 200 OK for the CANCEL. If there were a network hiccup which caused the 200 OK and the 487 to be lost, then the UA communicating with Asterisk may try to retransmit its CANCEL. Asterisk's response to this used to be to try sending another 487 to the canceled INVITE and another 200 OK to the CANCEL. The problem here is that the originally-sent 487 was sent "reliably" meaning that it will be retransmitted until it is received properly. So when we receive the second CANCEL it is likely that the first batch of 487s we sent is still going strong and reaches the UA. The result was that the second set of 487s would be retransmitted constantly until the maximum number of retries had been reached. The fix for this is that if we receive a second CANCEL for an INVITE, then we cancel the retransmission of the first set of 487s and start a second set. This causes the dialog to be terminated reasonably. (closes issue ASTERISK-13679) Reported by: klaus3000 Patches: 14584_v2.patch uploaded by mmichelson (license 60) Tested by: klaus3000 ........ r202342 | mmichelson | 2009-06-22 09:44:58 -0500 (Mon, 22 Jun 2009) | 3 lines Remove an extra debug line left from previous commit. ........ ................ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=202346 |