Summary: ASTERISK-05782: SIP CANCELs don't seem to be retransmitted per RFC2543
Reporter: Brett Nemeroff (brettnem)
Labels:
Date Opened: 2005-12-05 12:00:13.000-0600
Date Closed: 2008-01-15 17:22:03.000-0600
Versions:
Frequency of
Environment:
Attachments:
( 0) sip-cancel-trace.txt
( 1) sip-cancel-trace2.txt
( 2) sip-cancel-trace3.txt
Description: I'm dialing two phones (receptionist phones) for an office. Both phones ring fine. __Occasionally__, after one receptionist answers their phone, the other phone continues to ring. With a SIP trace I have verified that the original request was CANCELed, but Asterisk never received a 200 OK or a 487 back from the device. No retransmission is performed and the phone continues to ring.

Please see attached trace.

Also, I tried a plain call to a single Polycom phone. I unplugged the Ethernet connection to the phone while it was ringing, then the originator (while listening to ringing) hung up. Naturally the phone continued to ring. A SIP trace shows only ONE CANCEL ever going out.


Per RFC 2543:
10.4 Reliability for BYE, CANCEL, OPTIONS, REGISTER Requests

10.4.1 UDP

  A SIP client using UDP SHOULD retransmit a BYE, CANCEL, OPTIONS, or
  REGISTER request with an exponential backoff, starting at a T1 second
  interval, doubling the interval for each packet, and capping off at a
  T2 second interval.
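
For illustration, the schedule described in the quoted section can be sketched in Python as below. This is a minimal sketch, not Asterisk code: the function name `retransmit_intervals`, the `t1`/`t2` values, and the `max_packets` limit are all assumptions chosen for the example, so check the RFC for the actual timer values.

```python
def retransmit_intervals(t1=0.5, t2=4.0, max_packets=6):
    """Return the waits (in seconds) between successive transmissions.

    t1, t2 and max_packets are illustrative assumptions, not values
    taken from this bug report.
    """
    intervals = []
    interval = t1
    # The first packet goes out immediately; each later packet waits
    # for the current interval, which doubles until it is capped at t2.
    for _ in range(max_packets - 1):
        intervals.append(interval)
        interval = min(interval * 2, t2)
    return intervals


if __name__ == "__main__":
    print(retransmit_intervals())
```

With these assumed values the waits double from T1 and then stay at T2, which is the behaviour the trace should show instead of a single CANCEL.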

Comments:By: Olle Johansson (oej) 2005-12-05 15:00:36.000-0600

As per the bug guidelines I need a SIP debug of the whole transaction, with a debug setting of 4 and verbose setting of 4. Thank you!

By: Brett Nemeroff (brettnem) 2005-12-05 15:41:22.000-0600

Uploaded new trace in format per bug guidelines.
sip-cancel-trace.txt - SIP Scenario Decoded Trace
sip-cancel-trace2.txt - In proper format (sip debug verbose, debug level 4)

Just included the cancel portion since this is a production server and there was a whole lot of data produced. Hope that is ok.

By: Olle Johansson (oej) 2005-12-05 15:43:16.000-0600

This small fragment is really not enough, sorry. Can you provide us with more?

By: Brett Nemeroff (brettnem) 2005-12-05 15:45:12.000-0600

I'm going to need to do this after hours to limit the amount of excessive traffic in the debug. Thanks for your patience.

By: Brett Nemeroff (brettnem) 2005-12-06 08:43:31.000-0600

Ok just uploaded:

This is a FULL trace of the call with debug 4, verbose 4, sip debug. Sorry, there is some subscription/failed registration traffic in there..

Hope this is what you need..

By: alexb (alexb) 2005-12-08 04:40:23.000-0600

Maybe this issue is related to ASTERISK-5493? BTW, we do not use SJphone anymore, however sometimes we still have the same problem with eyeBeam.

By: alexb (alexb) 2005-12-08 04:41:28.000-0600

FYI, we have Asterisk 1.2.1

By: Brett Nemeroff (brettnem) 2005-12-08 08:20:53.000-0600

I think it's similar. Your ACK is probably not making it to Asterisk for some reason and Asterisk isn't bothering to retransmit the CANCEL. Just a thought..

We've replaced the entire network segment to those phones and even replaced the phones (twice) and still get similar problems. There most likely is a small layer 1 problem, but in ANY case, a retransmission should be occurring.

NOTE: my traces are from the Asterisk server itself. So not seeing the retransmission couldn't really be a layer 1 problem.

Also, not that it makes a difference, but there is a dedicated connection (2 bonded T1s) between the LAN with the Asterisk server and the LAN with the phones.

By: xinu (xinu) 2006-01-04 16:44:46.000-0600

Hi, I have been using Asterisk for over a year with Polycom phones and no problems. I just upgraded my system to 1.2.1 and I'm now having the same problem you are describing. Any new status on this?

By: Serge Vecher (serge-v) 2006-01-04 22:20:21.000-0600

I have observed a similar situation when upgrading Cisco IP phones to SIP firmware 7.5. The bug was reported at http://bugs.digium.com/view.php?id=5336. With help of MikeJ and Joshnet, we figured out it was the Cisco phone at fault. It basically never bothered to ACK a 487. Might be a thought for you -- try downgrading the firmware on the phone to the last known working version. Before, I was upgrading Asterisk and SIP firmware on the phones in the same cycle -- now I know better.

By: xinu (xinu) 2006-01-05 06:16:19.000-0600

The phones were not upgraded, just *. I'm thinking about trying the latest development branch to see if the issue is still there; if it is, I will most likely have to downgrade to the last version of * that worked for me. We have 2 people who answer phones at the front of our store, and they are not real happy with me right now :(

By: Brett Nemeroff (brettnem) 2006-01-05 08:32:02.000-0600

It appears that there is no pending_cancel state in Asterisk. Asterisk simply sends out the CANCEL and considers the call dead regardless of whether the endpoint received the CANCEL or not. It is a bug in Asterisk. In fact, I don't think Asterisk even cares if it ever gets any kind of response to a CANCEL at all (I don't think it's looking for a 487).

If you look in the code, it uses the function call for the reliable transmit, which would lead you to believe that it would retransmit if the CANCEL didn't get a response. However, since Asterisk doesn't maintain a pending_cancel state and simply destroys the call once the CANCEL goes out, it won't try to retransmit a message on a call that is already destroyed.

Which is why if one phone doesn't get the CANCEL (for whatever reason, be it firmware, network problems, congestion, solar eclipse, etc) it will continue to ring (annoying).
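
To make the idea concrete, here is a rough Python sketch of what a pending_cancel state could look like. It is only a toy model under my own assumptions: the class `CancelTransaction`, its method names, and the `max_sends` limit are hypothetical and do not correspond to anything in chan_sip.c.

```python
from dataclasses import dataclass, field


@dataclass
class CancelTransaction:
    """Hypothetical model of a call leg that remembers an unanswered CANCEL."""
    max_sends: int = 4              # assumed retry limit, for illustration only
    state: str = "ringing"
    sends: int = 0
    log: list = field(default_factory=list)

    def send_cancel(self):
        # Keep the call around in pending_cancel instead of destroying it.
        self.state = "pending_cancel"
        self.sends += 1
        self.log.append("CANCEL")

    def on_timer(self):
        # Retransmit timer fired with no response seen yet.
        if self.state != "pending_cancel":
            return
        if self.sends >= self.max_sends:
            self.state = "destroyed"    # give up after the last attempt
        else:
            self.sends += 1
            self.log.append("CANCEL")

    def on_response(self, code):
        # Any final response (200 and above) ends the pending state.
        if self.state == "pending_cancel" and code >= 200:
            self.state = "destroyed"
```

The point of the sketch is only that the call is destroyed after a final response or after the retries run out, not at the moment the first CANCEL is sent.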

I'm not much of a coder, but that's my interpretation.

By: Olle Johansson (oej) 2006-01-05 09:38:26.000-0600

Yes, the whole code for handling of hangups is wrong. I am looking into this issue. We destroy everything too fast to remember to re-transmit.

By: Olle Johansson (oej) 2006-01-26 02:17:56.000-0600

"Your bug report is on hold, the approximate waiting time is..." :-)

Still an open issue that needs to be resolved, even though it is not a simple hack.

By: Olle Johansson (oej) 2006-03-10 06:18:18.000-0600

Fixed in Asterisk 1.2 svn, revision 12495.
svn trunk revision 12496.

Thank you for reporting this!

By: Digium Subversion (svnbot) 2008-01-15 17:22:01.000-0600

Repository: asterisk
Revision: 12495

U   branches/1.2/channels/chan_sip.c

r12495 | oej | 2008-01-15 17:22:01 -0600 (Tue, 15 Jan 2008) | 2 lines

Issue ASTERISK-5782 - Make sure SIP CANCEL's are re-transmitted



By: Digium Subversion (svnbot) 2008-01-15 17:22:03.000-0600

Repository: asterisk
Revision: 12496

_U  trunk/
U   trunk/channels/chan_sip.c

r12496 | oej | 2008-01-15 17:22:02 -0600 (Tue, 15 Jan 2008) | 3 lines

Issue ASTERISK-5782 - Make sure that SIP CANCEL's are retransmitted properly
Importing revision 12495 from 1.2 with changes for svn trunk