|Summary:||ASTERISK-00204: sip session timer support/chan_sip continues to stream data even after SIP clients have terminated or crashed|
|Date Opened:||2003-09-01 18:42:09||Date Closed:||2011-06-07 14:10:13|
|Description:||Using latest CVS (Aug 27,28,29) with kphone as a client (tried 3.0.0 and 3.1.1), if kphone hangs up, or crashes there is a > 50% chance * will continue to keep the channel open, and stream data to the client, even though the application is long since closed. I have to either soft hangup, or restart * to get it to stop sending data.|
I'm using the MeetMe application for all calls, with ztdummy for timing.
|Comments:||By: Mark Spencer (markster) 2003-09-03 23:26:48|
If your client crashes, we have no way of knowing that it died and thus no way to know to stop streaming data to it. There is a request for support of special INVITEs that carry an expiration, but not all clients support this anyway.
Unfortunately this is basically a limitation of the SIP protocol.
By: John Todd (jtodd) 2003-09-08 22:06:19
See http://bugs.digium.com/bug_view_page.php?bug_id=0000025 for details on the timers Mark was talking about. He is correct that this is a SIP protocol bug, but there may be another option...
Since we know that _most_ RTP streams are bi-directional, in that one side will transmit every once in a while (every X minutes) then it could be possible to hang up a channel that is completely one-way if no packets are being received from the opposite direction after some timeout. This doesn't seem particularly hard to detect, but I have no idea how that would feed back out of the RTP session into the channel.
The only problem I could see with this would be when someone had VAD turned on and was on a conference call that they had muted on their phone. That would result in no packets transmitted from their phone for an extended period of time. However, allowances can be made for that kind of thing if perhaps this would be a minute-based configurable setting per SIP peer (silencedetecthangup=15)
This would neatly sidestep those problems with SIP where a network disconnection led to an infinite call. Of course, all calls should have an AbsoluteTimeout, but why wait that long when with a bit of smarts you can tell that the other side is no longer there?
By: kbantoft (kbantoft) 2003-09-14 19:46:15
So SIP is seriously flawed - that's disappointing.
It's quite simple to DoS anyone who accepts inbound SIP calls (be it Asterisk or other vendor gear) - open connection, "crash" client, repeat until target is out of bandwidth. This assumes you have more than they do, since it will all be streaming back to you, but still quite possible.
This is actually how I ran into it - I DoS'd my friend's DSL line accidentally for over an hour before he SMS'd me to tell me what was going on.
Reading id 0000025 and the draft, it seems like it will alleviate this, so I suppose it's a waiting game, or at least until someone releases a tool to provoke this behaviour.
By: John Todd (jtodd) 2003-09-14 20:59:34
With some fairly trivial programming, one could limit the number of inbound SIP connections from a particular client or group. And there is even now an "outgoinglimit=" option in sip.conf (see http://bugs.digium.com/bug_view_page.php?bug_id=0000207) which might help a bit in those circumstances. Additionally, AbsoluteTimeout might help to limit infinite length attacks.
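The per-client limit mentioned above could be sketched roughly as follows. This is only an illustration, not Asterisk code: the class `InboundCallLimiter` and its methods are hypothetical names, and it assumes calls are keyed by source address.

```python
from collections import defaultdict


class InboundCallLimiter:
    """Caps concurrent inbound calls per source address (hypothetical helper)."""

    def __init__(self, max_calls_per_source: int) -> None:
        self.max_calls_per_source = max_calls_per_source
        self._active = defaultdict(int)  # source address -> calls in progress

    def try_accept(self, source: str) -> bool:
        """Return True and count the call if the source is under its cap."""
        if self._active[source] >= self.max_calls_per_source:
            return False  # a real server would reject the INVITE here
        self._active[source] += 1
        return True

    def release(self, source: str) -> None:
        """Call when a dialog ends (BYE, timeout, or forced hangup)."""
        if self._active[source] > 0:
            self._active[source] -= 1
        if self._active[source] == 0:
            del self._active[source]  # keep the table from growing forever


limiter = InboundCallLimiter(max_calls_per_source=2)
print(limiter.try_accept("192.168.0.10"))  # first call from this source
print(limiter.try_accept("192.168.0.10"))  # second call, still under the cap
print(limiter.try_accept("192.168.0.10"))  # third call is refused
```

A phantom call that is never released keeps occupying a slot, so under this sketch a crashed or hostile client can tie up at most `max_calls_per_source` streams rather than an unbounded number.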
However, you're right - it is a potential DoS attack if you have your SIP server open to any connections that come inbound. The method I suggest (if a conversation is one-way) to hang up SIP users handles that somewhat elegantly. I will modify my comments and say that instead of minute-based, it should be a seconds-based timer.
Note that the method I suggest only works on RTP sessions that originate/terminate on the Asterisk server; it is the case that remote SIP clients that do re-INVITEs will not be able to have this trick applied to their sessions, since Asterisk does not see the RTP stream.
The more I think about it, though, the more I do think this "silence detection" mode would be a good idea for Asterisk to implement to avoid such phantom calls from going on forever. This would be a protective measure for the bandwidth on the Asterisk server.
kbantoft: do you have any programming ability with C?
By: zoa (zoa) 2003-09-21 12:53:50
I am having the same problem with xlite & xpro:
after a while i have a zillion connections like these.
192.168.0.10 (None) E16F5495-15 00101/17040 00000ms 0000ms UNKN
192.168.0.10 (None) BD9ADC17-3E 00101/18721 00000ms 0000ms UNKN
192.168.0.10 (None) 4B640ECB-C9 00101/03000 00000ms 0000ms UNKN
If i wait long enough they disappear again.
How would this affect cdr logs and thus billing ?
By: zoa (zoa) 2003-09-21 13:08:23
looks like this is the same as feature request 0000025.
Meanwhile i also noticed that i only have the problem with private ip's.
(If in x-lite the option 'automatically detect ip' is disabled i have the impression the problem is not occurring).
My grandstream phones also have the same problem (on an 192 range).
None of the public ip clients i connected had the problem.
By: Brian West (bkw918) 2003-11-22 16:03:57.000-0600
Hasn't this been sorta resolved?
By: zoa (zoa) 2004-01-09 19:33:08.000-0600
i think this is resolved, as i've never seen the problem again...
By: scaredycat (scaredycat) 2004-01-11 09:53:38.000-0600
Still got this problem :(
Calls are staying open after client crashes....
Could we hijack the qualify= data and if there was no response after a certain number of qualifies kill the stream and close the call? or alternatively have a timeout= to do the same....
edited on: 01-11-04 09:44
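The qualify-based idea above amounts to counting consecutive unanswered keepalive probes while a call is up. A minimal sketch, assuming the probe results are fed in by the caller; `QualifyMonitor` and `max_missed` are hypothetical names, not existing Asterisk settings:

```python
class QualifyMonitor:
    """Tracks consecutive unanswered qualify probes for one peer (sketch)."""

    def __init__(self, max_missed: int) -> None:
        self.max_missed = max_missed
        self.missed = 0

    def record_reply(self) -> None:
        """The peer answered a probe, so it is considered alive again."""
        self.missed = 0

    def record_timeout(self) -> bool:
        """A probe went unanswered; return True if the call should be closed."""
        self.missed += 1
        return self.missed >= self.max_missed


monitor = QualifyMonitor(max_missed=3)
monitor.record_timeout()
monitor.record_reply()           # one lost packet does not kill the call
for _ in range(3):
    hang_up = monitor.record_timeout()
print(hang_up)                   # three misses in a row reach the threshold
```

Resetting the counter on every reply is a deliberate choice here, so that occasional packet loss on the signalling path is not mistaken for a dead client.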
By: jrollyson (jrollyson) 2004-01-12 00:19:56.000-0600
This requires implementation of session timers to fix
By: scaredycat (scaredycat) 2004-01-14 18:45:45.000-0600
or prack ....
By: timecop (timecop) 2004-01-14 23:36:51.000-0600
since it seems at least some equipment already uses session timers (like my isp), that should probably be hacked in first :)
By: John Todd (jtodd) 2004-01-20 11:00:14.000-0600
In the cases where the audio stream is going through Asterisk (in all events where you have a Zap card, as an example that is typical) then this can be resolved with the feature items that I suggest, which is to look for RTP streams that are one-way for some selectable period of time. This is not the "optimal" solution, but it solves the problem for people who have a high cost associated with failures of this type: namely, people sending calls out PRI trunks that may have very high costs associated with failures.
The "silencedetecthangup" would create a timer, in seconds, that would start ticking as soon as some period of one-way audio was encountered while * talks to this peer. If the peer has not transmitted any audio to us in N seconds, terminate both legs of the call.
One method of doing this could be accomplished by setting a variable to be the epoch time (in seconds) every time a packet was received. If last_transmit_time > last_receive_time+N then hangup. This is my crude method of doing it, and I am not a programmer, so other options are welcome.
This should be selectable on a per-peer basis, to allow for some phones to opt out of the timers, since it may be the case that some people with VAD-capable phones get on conference calls, which would create a false timeout situation.
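The timer described in this comment could look roughly like the sketch below. It follows the stated comparison (hang up when last_transmit_time > last_receive_time + N) and the per-peer opt-out. The class `RtpWatchdog` and the parameter `hangup_after` (standing in for the proposed silencedetecthangup value) are assumptions for illustration, and timestamps are passed in explicitly rather than read from the clock.

```python
from typing import Optional


class RtpWatchdog:
    """Detects one-way RTP on a single call leg (illustrative sketch)."""

    def __init__(self, started_at: float, hangup_after: Optional[float]) -> None:
        # hangup_after is N in seconds; None means this peer opted out,
        # e.g. a VAD phone that sits muted on conference calls.
        self.hangup_after = hangup_after
        self.last_receive_time = started_at
        self.last_transmit_time = started_at

    def on_receive(self, now: float) -> None:
        """Record the arrival of an RTP packet from the peer."""
        self.last_receive_time = now

    def on_transmit(self, now: float) -> None:
        """Record that we sent an RTP packet to the peer."""
        self.last_transmit_time = now

    def should_hangup(self) -> bool:
        """True once we have been sending for more than N seconds unanswered."""
        if self.hangup_after is None:
            return False
        return self.last_transmit_time > self.last_receive_time + self.hangup_after


watchdog = RtpWatchdog(started_at=0.0, hangup_after=15.0)
watchdog.on_receive(10.0)        # last packet heard from the peer
watchdog.on_transmit(20.0)
print(watchdog.should_hangup())  # only 10 s of one-way audio so far
watchdog.on_transmit(26.0)
print(watchdog.should_hangup())  # 16 s without inbound audio exceeds N
```

As noted in the thread, this only helps when the media stream passes through the server, since otherwise there are no packets to timestamp.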
By: Olle Johansson (oej) 2004-01-23 02:06:13.000-0600
I would suggest selectable per peer as well as a global setting.
By: scaredycat (scaredycat) 2004-01-23 02:45:49.000-0600
Jtodd's idea would make 'listen only' conferences break....
By: Brian West (bkw918) 2004-01-24 12:27:42.000-0600
NO we need to do the proper RFC described session timers. Nothing more, nothing less. It's the right way to do it.
By: scaredycat (scaredycat) 2004-01-24 16:07:13.000-0600
I was being subtle.... :)
is it planned for both to be implemented?
By: John Todd (jtodd) 2004-01-26 16:50:17.000-0600
Well, there may be subtle and blunt answers here, but that does not solve the problem because not all equipment will support the modifications for timers. I have calls, right now, today, that get hung up and dump off into space until the AbsoluteTimeout routines kick in. Even if we had the SIP timers enabled, it will be some time before that's supported by the gear that we all use on the desktop. We have the ability to see if _many_ SIP calls are failing; at least those whose media stream goes through Asterisk. Let's use what we have and be pragmatic about it. There is no reason that both methods cannot be used.
Secondarily, no, it does not break "listen-only" conference calls, because that's only a problem if you have VAD, and as I mentioned, it would be selectable on a per-peer basis. If you read closely, you'll note that I mention that exact circumstance in my last bugnote. A less optimal idea would be to make it Yet Another Dial Modifier.
By: John Todd (jtodd) 2004-01-26 20:05:54.000-0600
After some additional research, a "cleaner" way of making my suggestion work would be to use the RTCP extensions, but that would imply that Asterisk would then also have an RTCP capable RTP stack, and also that the SIP clients were RTCP capable. I include Cisco's note on a similar feature very close to what I describe above that they have worked into recent versions of IOS:
By: Brian West (bkw918) 2004-01-26 21:03:42.000-0600
Actually the RFC for timers only states that ONE and only one device/endpoint has to understand timers. The devices that don't understand it will just ignore it. Asterisk will be the one in the loop that needs to understand it.
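For reference, the session timer mechanism discussed here was later published as RFC 4028. The sketch below shows only the timing it recommends, as far as I understand it: the refresher re-sends a re-INVITE or UPDATE once half the session interval has elapsed, and the non-refreshing side sends BYE shortly before expiry, by a margin of the smaller of 32 seconds and one third of the interval. The function names are hypothetical, and this is not how any particular Asterisk version implements it.

```python
def refresh_due_after(session_expires: float) -> float:
    """Seconds after the last refresh at which the refresher should refresh."""
    return session_expires / 2


def bye_due_after(session_expires: float) -> float:
    """Seconds after the last refresh at which the other side gives up."""
    margin = min(32.0, session_expires / 3)
    return session_expires - margin


# With a Session-Expires value of 1800 seconds:
print(refresh_due_after(1800))  # refresher acts at the halfway point
print(bye_due_after(1800))      # non-refresher sends BYE 32 s before expiry
```

If the side that understands timers takes the refresher role, a crashed client simply fails to answer the refresh request, which gives the server a signalling-level reason to tear the call down.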
By: scaredycat (scaredycat) 2004-01-29 05:45:18.000-0600
All the kit i have supports session timers, one or two support prack - I haven't seen a piece of SIP kit that doesn't (point me at a device that doesn't and maybe we can apply some pressure to get the manuf. to implement).
I totally agree, we have a similar situation where phones crash and the session continues... it's the phones that are to blame but I'd like * to be able to detect this somehow.. My initial idea was to use the qualify to determine client availability but I've changed my mind, I'd just prefer something that was an rfc (Actually i was hoping for a 'quick fix' using that method). As bkw says only one end needs to support session timers.. imho following the rfc's is the way to go.
By: chrisorme (chrisorme) 2004-03-21 15:52:36.000-0600
I think this is related to what is happening to me with a Draytek 2600V.
If the person called with the Draytek SIP unit is on an analogue PSTN line, the call continues after the Draytek hangs up (even though the Draytek monitoring screen thinks it has hung up the call). I need to do 'restart now' with * or wait between 3 and 17 minutes for the called person to get their line back and be able to use it again.
I'm not sure that the Draytek has crashed though - I just don't think it sends a proper BYE ???? My bug report is 0001239 but seems related to this in that the RTP stream seems to continue until the restart of * or some timeout.
If anyone can look at the debug on my bug report and say whether what is happening to me with the Draytek is the same as what you are seeing, or whether it is the manufacturer's fault, that would help. Either way, some form of workaround in * would be amazing, because then it's not just a kphone problem.
By: Malcolm Davenport (mdavenport) 2004-05-26 15:45:28
Closing. If someone wants to do something with this, please do. Find a bug marshal on IRC if you need it reopened.