[Home]

Summary:ASTERISK-11205: sip hung channels and UDP ports
Reporter:mvf (mvf)Labels:
Date Opened:2008-01-10 15:16:02.000-0600Date Closed:2008-01-29 17:37:50.000-0600
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Channels/chan_sip/General
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) bug11736.txt
( 1) bug11736-2.diff
( 2) show_channels-show_sip_channels.txt
( 3) show_channels-sip_show_channels-postpatch.txt
( 4) show_channels-sip_show_channels-prepatch.txt
Description: I've found that some sip channels of some SIP-SIP calls are not being released by asterisk. Issuing the command "show channels" display the exact information for active calls but if I use the "sip show channels" command it shows the active calls plus a lot of hung channels (check attach).

I´ve checked the "sip show history" of these hung channels and found that sometimes due to network problems (or redundant termination) asterisk could receive a "183 Session Progress" message from termination peer after a CANCEL message has been received from calling party. This problem causes that * CancelDestroy a non existing call resulting in a hung sip channel and UDP ports related to it taken. After some time of operation asterisk stop processing new calls. Please check the sip show history for a hung peer (originating peer):

 * SIP Call
1. NewChan         Channel SIP/SIP_LCR-097a7500 - from 4e061fc824c28fdb500a7d6e748
2. TxReqRel        INVITE / 102 INVITE - -UNKNOWN-
3. Rx              SIP/2.0 / 102 INVITE / 100 trying
4. Cancel          Cause Normal Clearing
5. SchedDestroy    32000 ms
6. TxReqRel        CANCEL / 102 CANCEL - -UNKNOWN-
7. SchedDestroy    32000 ms
8. Rx              SIP/2.0 / 102 CANCEL / 200 ok
9. Rx              SIP/2.0 / 102 INVITE / 183 Session Progress
10. CancelDestroy  
11. Rx              SIP/2.0 / 102 INVITE / 404 Not Found
12. TxReq           ACK / 102 ACK - -UNKNOWN-

You can see how after the cancelling exchange a late 183 arrive from termination peer and confuze * making it to execute a CancelDestroy over a no more existing peer.


****** ADDITIONAL INFORMATION ******

There is a previuos bug report (http://bugs.digium.com/view.php?id=9235) for this issue, I can check that the problem is still in the last release (1.4.17). It seems that the solution was applied to the channels hung on INVITE state but not to the channels hung on ACK state.
I manually applied the patch posted in that report on a 1.4.11 * and checked that under the same delayed 183 message the channel is properly release.
Comments:By: Olle Johansson (oej) 2008-01-22 09:15:32.000-0600

Please try this patch.
Thanks,
/O

By: mvf (mvf) 2008-01-22 14:40:49.000-0600

Hi oej, thanks for the patch. I´ve just installed it on my 1.4.17 asterisk, it will be running with moderate to high traffic (aprox. 30 simultaneous calls) from now on so I will see soon if hung sip channels are still there. I´ll be reporting tomorrow.

Just for the record, I attached a core channels and sip channels report of asterisk 1.4.17 before applying the patch (show_channels-sip_show_channels-prepatch.txt).

By: daniel g (revolution) 2008-01-22 20:46:11.000-0600

oej -- thanks.

installed onto 1.4.16.2 though I havent compiled, etc. yet (waiting for traffic to die down)...

...would've tested w/ svn but 1.4.17 was causing deadlocks whereas 1.4.16.2 was not and i've not had time to debug...

anything I should know about having applied to 16.2 instead of svn?

the box this will be installed onto hit a peak of 82 simul. calls recently... so obviously quite a few channels have hung... we'll see how this works.

will post back on thursday.

By: daniel g (revolution) 2008-01-23 09:02:58.000-0600

ok --- so im a liar... still seeing channels hung on invites...

1. NewChan         Channel SIP/NNNN-08278d88 - from 769b918e55c2fa1f228e98c27
2. TxReqRel        INVITE / 102 INVITE - -UNKNOWN-
3. Rx              SIP/2.0 / 102 INVITE / 100 Trying
4. Rx              SIP/2.0 / 102 INVITE / 180 Ringing
5. Rx              SIP/2.0 / 102 INVITE / 200 OK
6. TxReq           ACK / 102 ACK - -UNKNOWN-
7. Rx              INVITE / 35707 INVITE / sip:+1NPANXXXXXX@aaa.bbb.ccc.ddd
8. TxResp          SIP/2.0 / 35707 INVITE - 488 Not acceptable here
9. Rx              ACK / 35707 ACK / sip:+1NPANXXXXXX@aaa.bbb.ccc.ddd
10. Rx              INVITE / 35708 INVITE / sip:+1NPANXXXXXX@aaa.bbb.ccc.ddd
11. ReInv           Re-invite received
12. TxResp          SIP/2.0 / 35708 INVITE - 100 Trying
13. TxResp          SIP/2.0 / 35708 INVITE - 200 OK
14. Rx              ACK / 35708 ACK / sip:+1NPANXXXXXX@aaa.bbb.ccc.ddd
15. Rx              INVITE / 35709 INVITE / sip:+1NPANXXXXXX@aaa.bbb.ccc.ddd
16. TxResp          SIP/2.0 / 35709 INVITE - 488 Not acceptable here
17. Rx              ACK / 35708 ACK / sip:+1NPANXXXXXX@aaa.bbb.ccc.ddd
18. Rx              INVITE / 35710 INVITE / sip:+1NPANXXXXXX@aaa.bbb.ccc.ddd
19. TxResp          SIP/2.0 / 35710 INVITE - 491 Request Pending
20. Hangup          Cause Normal Clearing
21. SchedDestroy    6400 ms
22. CancelDestroy


It seems like Invites are just sent over and over and over... (though I have reinvite disabled as this particular peer is behind a NAT)... until finally Asterisk gives up and replies with a Request Pending... and the channel hangs.

By: mvf (mvf) 2008-01-23 14:42:15.000-0600

Hi oej, nice patch. My asterisk has been running for a day now under moderate traffic (about 40 concurrent calls) and there is definitely an improvement. I don't see any channel hang because of a late session progress as before. The only one channels hang shows that the call receives two 183 in a row, maybe the second is skiping the patch (there isn't a CancelDestroy after the first one).

*CLI> sip show history 4928bd92396
*CLI>
 * SIP Call
1. NewChan         Channel SIP/SIP_LCR-0a1416a0 - from 4928bd923962249b0f64aec0477
2. TxReqRel        INVITE / 102 INVITE - -UNKNOWN-
3. Rx              SIP/2.0 / 102 INVITE / 100 trying
4. Cancel          Cause Normal Clearing
5. SchedDestroy    32000 ms
6. TxReqRel        CANCEL / 102 CANCEL - -UNKNOWN-
7. SchedDestroy    32000 ms
8. Rx              SIP/2.0 / 102 CANCEL / 200 ok
9. Rx              SIP/2.0 / 102 INVITE / 183 Session Progress
10. Rx              SIP/2.0 / 102 INVITE / 183 Session Progress
11. CancelDestroy  
12. Rx              SIP/2.0 / 102 INVITE / 404 Not Found
13. TxReq           ACK / 102 ACK - -UNKNOWN-

Besides that everything is going perfect by now. THX
I will post another note tomorrow.

By: Olle Johansson (oej) 2008-01-23 14:51:13.000-0600

Ok, will have to check that. Thanks for the quick feedback!

By: Olle Johansson (oej) 2008-01-23 14:54:04.000-0600

revolution: That's a weird history. I don't get what's cancelling the desctruction at the end. Will have to check that. Can you get more information on that call, repeat it?

By: Digium Subversion (svnbot) 2008-01-23 14:55:45.000-0600

Repository: asterisk
Revision: 99977

U   branches/1.4/channels/chan_sip.c

------------------------------------------------------------------------
r99977 | oej | 2008-01-23 14:55:42 -0600 (Wed, 23 Jan 2008) | 9 lines

Make sure we don't cancel destruction on calls in CANCEL state, even if we
get 183 while waiting for answer on our CANCEL.

(issue ASTERISK-11205)
Reported by: MVF
Patches:
     bug11736.txt uploaded by oej (license 306)
Tested by: MVF

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=99977

By: Olle Johansson (oej) 2008-01-23 14:56:57.000-0600

Ok, committed the patch since it did solve some issues, even though not all. Let's keep hunting.

By: Olle Johansson (oej) 2008-01-23 15:00:58.000-0600

MVF: I see now. Will apply a fix for your issue to 1.4 svn.

By: Digium Subversion (svnbot) 2008-01-23 15:04:37.000-0600

Repository: asterisk
Revision: 99978

U   branches/1.4/channels/chan_sip.c

------------------------------------------------------------------------
r99978 | oej | 2008-01-23 15:04:37 -0600 (Wed, 23 Jan 2008) | 7 lines

Second attempt. Don't change invitestate when receiving 18x messages in CANCEL state.

(issue ASTERISK-11205)
Reported by: MVF

Patch by oej.

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=99978

By: Digium Subversion (svnbot) 2008-01-23 15:07:17.000-0600

Repository: asterisk
Revision: 99979

_U  trunk/
U   trunk/channels/chan_sip.c

------------------------------------------------------------------------
r99979 | oej | 2008-01-23 15:07:17 -0600 (Wed, 23 Jan 2008) | 17 lines

Merged revisions 99977 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r99977 | oej | 2008-01-23 21:58:20 +0100 (Ons, 23 Jan 2008) | 9 lines

Make sure we don't cancel destruction on calls in CANCEL state, even if we
get 183 while waiting for answer on our CANCEL.

(issue ASTERISK-11205)
Reported by: MVF
Patches:
     bug11736.txt uploaded by oej (license 306)
Tested by: MVF

........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=99979

By: Digium Subversion (svnbot) 2008-01-23 15:09:16.000-0600

Repository: asterisk
Revision: 99980

_U  trunk/
U   trunk/channels/chan_sip.c

------------------------------------------------------------------------
r99980 | oej | 2008-01-23 15:09:16 -0600 (Wed, 23 Jan 2008) | 15 lines

Merged revisions 99978 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r99978 | oej | 2008-01-23 22:07:16 +0100 (Ons, 23 Jan 2008) | 7 lines

Second attempt. Don't change invitestate when receiving 18x messages in CANCEL state.

(issue ASTERISK-11205)
Reported by: MVF

Patch by oej.

........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=99980

By: Olle Johansson (oej) 2008-01-23 15:12:28.000-0600

MVF: You want this patch applied to your system:
http://svn.digium.com/view/asterisk/branches/1.4/channels/chan_sip.c?r1=99978&r2=99977&pathrev=99978&diff_format=u

By: daniel g (revolution) 2008-01-23 18:47:15.000-0600

i can try oej... this issue likes to be a bit on the dodgy side...

give me a day. I'll try and reproduce the issue and capture for you...

it really looks like asterisk is essentially just receiving re-invites (attempts to change the codec? some boxes can only power one translation at a time)... until it gives up and replies with a pending... which then hangs.

By: mvf (mvf) 2008-01-24 09:23:34.000-0600

Thanks for the commit oej, I just compiled using chan_sip.c revision 99978. Is running now. I´ll be reporting tomorrow.

By: daniel g (revolution) 2008-01-24 16:21:23.000-0600

you know... i've been trying... and trying... and trying... to get a debug of this channel -- but * acts as if though it doesnt exist... I can pull the channel info... and the history... but when I search through it for the debug log -- i find one of two things...

1 - the same channel id belongs to a TxReqRel OPTIONS
2 - the channel is reported as  AutoDestroy by *

...will continue to try and track this one down

By: Olle Johansson (oej) 2008-01-26 13:54:12.000-0600

This patch applies to latest 1.4 from svn.

It actually stops new NOTIFY while we have an open NOTIFY and delays transmission until we've received a 200 OK, so we actually always only have one open transaction.

It will also always send out the latest state, so if there's two state changes while we wait for a 200 OK, we will only send out the latest one, in order not to confuse the phone.

Please test!

Greetings from Torremolinos outside of Malaga, Spain :-)

By: daniel g (revolution) 2008-01-26 17:53:22.000-0600

oej -- applying patch (to svn... just to make sure everything is on point) now... give me a few days and ill report back and let you know whats up.

By: mvf (mvf) 2008-01-28 12:26:36.000-0600

Hi oej, My asterisk has been running for 4 days with chan_sip Revision 99978 and the late 183 problem is gone (Yeah!), thank for all the great work.
I only have three channels hang for other reason. They hang after a re-invite is received, it looks close to what revolution reported. Please check the history of one of these channels:

1. NewChan         Channel SIP/SIP_LCR-095acd78 - from 3ea708da55f0c3f43d5dde76019
2. TxReqRel        INVITE / 102 INVITE - -UNKNOWN-
3. Rx              SIP/2.0 / 102 INVITE / 100 trying
4. Rx              SIP/2.0 / 102 INVITE / 200 OK
5. TxReq           ACK / 102 ACK - -UNKNOWN-
6. Rx              INVITE / 244 INVITE / sip:222222222222@10.10.10.106
7. ReInv           Re-invite received
8. TxResp          SIP/2.0 / 244 INVITE - 100 Trying
9. Hangup          Cause Normal Clearing
10. SchedDestroy    32000 ms
11. CancelDestroy  
12. Rx              INVITE / 245 INVITE / sip:222222222222@10.10.10.106
13. TxResp          SIP/2.0 / 245 INVITE - 491 Request Pending
14. Rx              ACK / 245 ACK / sip:222222222222@10.10.10.106
15. Rx              CANCEL / 244 CANCEL / sip:222222222222@10.10.10.106
16. TxResp          SIP/2.0 / 244 CANCEL - 503 Server error

 Making some test calls I was able to reproduce this sequence from step 1 to 11, I forced the UA that receives the forwarded re-invite to ignore it and then CANCEL the call. After that I see the SchedDestroy/CancelDestroy sequence appear:

1. NewChan         Channel SIP/SIP_LCR-097647b8 - from 50b6d3da06c9106372fc1efa798
2. TxReqRel        INVITE / 102 INVITE - -UNKNOWN-
3. Rx              SIP/2.0 / 102 INVITE / 100 trying
4. Rx              SIP/2.0 / 102 INVITE / 183 Session Progress
5. Rx              SIP/2.0 / 102 INVITE / 200 OK
6. TxReq           ACK / 102 ACK - -UNKNOWN-
7. Rx              INVITE / 563 INVITE / sip:5555848119@64.76.154.106
8. ReInv           Re-invite received
9. TxResp          SIP/2.0 / 563 INVITE - 100 Trying
10. Hangup          Cause Normal Clearing
11. SchedDestroy    32000 ms
12. CancelDestroy  

Anyway for an unknown reason this test channel is released after 32 seconds, maybe the following messages appearing in the history of the hang channel make the channel actually hang (I can't reproduce those messages yet).

Please let me know if this make any sense to you and... Saludos a España y OLe!

By: Digium Subversion (svnbot) 2008-01-29 02:23:59.000-0600

Repository: asterisk
Revision: 100740

U   branches/1.4/channels/chan_sip.c

------------------------------------------------------------------------
r100740 | oej | 2008-01-29 02:23:59 -0600 (Tue, 29 Jan 2008) | 8 lines

(closes issue ASTERISK-11205)
Reported by: MVF
Patches:
     bug11736-2.diff uploaded by oej (license 306)
Tested by: oej, MVF, revolution

(russellb: This was the showstopper for the release.)

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=100740

By: Digium Subversion (svnbot) 2008-01-29 03:16:46.000-0600

Repository: asterisk
Revision: 100767

_U  trunk/
U   trunk/channels/chan_sip.c

------------------------------------------------------------------------
r100767 | oej | 2008-01-29 03:16:46 -0600 (Tue, 29 Jan 2008) | 16 lines

Merged revisions 100740 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r100740 | oej | 2008-01-29 09:26:48 +0100 (Tis, 29 Jan 2008) | 8 lines

(closes issue ASTERISK-11205)
Reported by: MVF
Patches:
     bug11736-2.diff uploaded by oej (license 306)
Tested by: oej, MVF, revolution

(russellb: This was the showstopper for the release.)

........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=100767

By: Digium Subversion (svnbot) 2008-01-29 17:37:50.000-0600

Repository: asterisk
Revision: 101077

_U  team/murf/bug11210/
U   team/murf/bug11210/Makefile
U   team/murf/bug11210/apps/app_queue.c
U   team/murf/bug11210/channels/chan_misdn.c
U   team/murf/bug11210/channels/chan_sip.c
U   team/murf/bug11210/channels/chan_zap.c
U   team/murf/bug11210/channels/misdn_config.c
U   team/murf/bug11210/main/Makefile
U   team/murf/bug11210/res/res_config_curl.c

------------------------------------------------------------------------
r101077 | murf | 2008-01-29 17:37:49 -0600 (Tue, 29 Jan 2008) | 190 lines

Merged revisions 100741,100767,100794-100795,100833,100843,100900,100928,100931,100933,100935,100974,101018,101036 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
r100741 | oej | 2008-01-29 01:28:43 -0700 (Tue, 29 Jan 2008) | 3 lines

Removing code that wasn't supposed to be there at all, only at an experimental stage
before I found another solution. Thanks Kevin, for reminding me.

................
r100767 | oej | 2008-01-29 02:18:42 -0700 (Tue, 29 Jan 2008) | 16 lines

Merged revisions 100740 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r100740 | oej | 2008-01-29 09:26:48 +0100 (Tis, 29 Jan 2008) | 8 lines

(closes issue ASTERISK-11205)
Reported by: MVF
Patches:
     bug11736-2.diff uploaded by oej (license 306)
Tested by: oej, MVF, revolution

(russellb: This was the showstopper for the release.)

........

................
r100794 | crichter | 2008-01-29 03:47:06 -0700 (Tue, 29 Jan 2008) | 9 lines

Merged revisions 96199 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r96199 | crichter | 2008-01-03 13:12:27 +0100 (Do, 03 Jan 2008) | 1 line

make sure frame is completely clean, before we send it to asterisk as DTMF. If we don't make it clean, it happens that one way audio occurs..
........

................
r100795 | crichter | 2008-01-29 03:50:29 -0700 (Tue, 29 Jan 2008) | 9 lines

Merged revisions 100793 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r100793 | crichter | 2008-01-29 11:36:19 +0100 (Di, 29 Jan 2008) | 1 line

fixed potential segfault in misdn show channels CLI command
........

................
r100833 | file | 2008-01-29 08:30:19 -0700 (Tue, 29 Jan 2008) | 4 lines

Make externip work as documented. If no port is specified it will use the value of bindport instead of always being 5060.
(closes issue ASTERISK-11317)
Reported by: hmodes

................
r100843 | qwell | 2008-01-29 09:14:08 -0700 (Tue, 29 Jan 2008) | 13 lines

Merged revisions 100835 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r100835 | qwell | 2008-01-29 10:10:00 -0600 (Tue, 29 Jan 2008) | 5 lines

Allow zap groups above 30 to work properly.

(closes issue ASTERISK-11074)
Reported by: tbsky

........

................
r100900 | russell | 2008-01-29 10:16:34 -0700 (Tue, 29 Jan 2008) | 14 lines

Merged revisions 100882 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r100882 | russell | 2008-01-29 11:06:43 -0600 (Tue, 29 Jan 2008) | 6 lines

Fix building Asterisk when the working path has spaces in it.

(closes issue ASTERISK-11295)
Reported by: spendergrass
Patched by: me

........

................
r100928 | russell | 2008-01-29 10:22:47 -0700 (Tue, 29 Jan 2008) | 11 lines

Merged revisions 100922 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r100922 | russell | 2008-01-29 11:21:33 -0600 (Tue, 29 Jan 2008) | 3 lines

Use GNU make magic instead of shell magic to escape spaces in the working directory.
(related to issue ASTERISK-11295)

........

................
r100931 | qwell | 2008-01-29 10:42:48 -0700 (Tue, 29 Jan 2008) | 14 lines

Merged revisions 100930 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r100930 | qwell | 2008-01-29 11:41:43 -0600 (Tue, 29 Jan 2008) | 6 lines

Initialize an array to 0s if config option not specified.

(closes issue ASTERISK-11319)
Patches:
     misdn_get_config.v1.diff uploaded by IgorG (license 20)

........

................
r100933 | russell | 2008-01-29 10:44:05 -0700 (Tue, 29 Jan 2008) | 12 lines

Merged revisions 100932 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r100932 | russell | 2008-01-29 11:43:41 -0600 (Tue, 29 Jan 2008) | 4 lines

Fix the last couple of issues related to building from a path that contains spaces.

(closes issue ASTERISK-11295)

........

................
r100935 | file | 2008-01-29 10:58:37 -0700 (Tue, 29 Jan 2008) | 11 lines

Blocked revisions 100934 via svnmerge

........
r100934 | file | 2008-01-29 13:57:05 -0400 (Tue, 29 Jan 2008) | 4 lines

Don't forget to record the channel so we know whether it is bridged or not later.
(closes issue ASTERISK-11275)
Reported by: slavon

........

................
r100974 | mmichelson | 2008-01-29 12:41:42 -0700 (Tue, 29 Jan 2008) | 14 lines

Merged revisions 100973 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r100973 | mmichelson | 2008-01-29 13:39:00 -0600 (Tue, 29 Jan 2008) | 6 lines

Fixing an erroneous return value returned when attempting to pause or unpause a queue member
fails.

Fixes BE-366, thanks to John Bigelow for writing the patch.


........

................
r101018 | tilghman | 2008-01-29 15:04:15 -0700 (Tue, 29 Jan 2008) | 2 lines

Oops, a sizeof error

................
r101036 | mmichelson | 2008-01-29 16:02:48 -0700 (Tue, 29 Jan 2008) | 11 lines

Merged revisions 101035 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r101035 | mmichelson | 2008-01-29 17:02:03 -0600 (Tue, 29 Jan 2008) | 3 lines

Remove a memory leak from updating realtime queues


........

................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=101077