[Home]

Summary:ASTERISK-15581: [patch] [regression] 1.6.1.13 and 1.6.1.14 UDP ports not freed
Reporter:Marcin Kowalczyk (kowalma)Labels:
Date Opened:2010-02-04 12:46:18.000-0600Date Closed:2010-05-13 09:17:46
Priority:BlockerRegression?No
Status:Closed/CompleteComponents:Core/RTP
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 20100315__issue16774.diff.txt
( 1) debug.gz
( 2) debug1.gz
( 3) full.gz
( 4) full1.gz
( 5) sipp_scenario.tar.gz
Description:Since 1.6.1.13 looks there is problem with freeing UDP ports. 2-3 days ago I upgraded one box from 1.6.0.5 and second from 1.6.1.1 to 1.6.1.13 and every few thusants calls I'm running out of UDP ports.

mg0:~# rasterisk -x " sip show channels"
Peer             User/ANR         Call ID          Format           Hold     Last Message    Expiry
192.168.9.2      mg0              28f9a3d00edd874  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              77d13e113c8409e  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              7297a36515d38d1  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              07b22e6a461f5b3  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              3076b8603492c24  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              30180433452ebf7  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              28c1ad56233d208  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              0182d70a3ec9d14  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              76a5f0c46328c99  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              56be88b8268d261  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              626ef5624a83e0c  0x0 (nothing)    No       Rx: ACK
192.168.9.2      mg0              1af2adc275f0b10  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              645b2a390a66a11  0x0 (nothing)    No       Rx: ACK
192.168.9.2      mg0              502a709e10ff8dc  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              58ab1eac5e50a92  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              6967aeee16f8e71  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              618b46651909df3  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              4ddae9dc5f91236  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              250e1d32111b2c5  0x0 (nothing)    No       Rx: ACK
192.168.9.2      mg0              72cfef2c5fc67eb  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              6d70624c7eb9942  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              5e2c546e27a0af2  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              53fa535541908ce  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              0f8cb2514fd7e4f  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              5cdeead033ebb17  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              4083b3aa6c3a4c5  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              4bd036370f2ac5d  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              50467ac42828e9d  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              17faaccd5554f00  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              5bfa754b209122e  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              78d3f4fa2920ac2  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              2fcbbb020e2cf0a  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              2cf255430ced865  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              385ac7317e336fc  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              633b90226eaac79  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              36fe9f8b78ce7b2  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              61b9d37358c31de  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              30a1736605b2c28  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              6fe7902c4ae25f0  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              7a3168e81aabb71  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              4bf8f6e05a1efc9  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              6665165a33c355e  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              3446ee9b07593d4  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              145209e01d8b3d4  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              45a79c7c44628cf  0x8 (alaw)       No       Rx: INVITE
45 active SIP dialogs
mg0:~#

On this box I have 4xE1 so max I can handle 120 calls.
Before upgrade this MG was rock-stable and had asterisk uptime counted in weeks.

I have one box running 1.6.1.12 and there seems there is no RTP-UDP ports problem.

mg0:~# while true
> do
>  lsof | grep UDP -c ; sleep 10
> done
682
688
691
682
703
730
745
757
781
^C
mg0:~#

Every few hours I need to restart asterisk as it runs out of free ports

/etc/asterisk/rtp.conf
[general]
rtpstart=5000
rtpend=51000
Comments:By: Marcin Kowalczyk (kowalma) 2010-02-04 13:24:07.000-0600

Now no ports for udptl, soon I will run out UDP and no calls will be possible:

Connected to Asterisk 1.6.1.14 currently running on mg0 (pid = 24886)
[Feb  4 20:23:31] WARNING[24908]: udptl.c:975 ast_udptl_new_with_bindaddr: No UDPTL ports remaining
[Feb  4 20:23:31] WARNING[24908]: udptl.c:975 ast_udptl_new_with_bindaddr: No UDPTL ports remaining
[Feb  4 20:23:32] WARNING[24908]: udptl.c:975 ast_udptl_new_with_bindaddr: No UDPTL ports remaining
mg0*CLI>
Disconnected from Asterisk server
mg0:~# while true; do  lsof | grep UDP -c ; sleep 10; done
mg0:~# lsof | grep UDP -c
3048
mg0:~# rasterisk -x "sip show channels"
Peer             User/ANR         Call ID          Format           Hold     Last Message    Expiry
192.168.9.2      mg0              779175b126b70d7  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              50771ed21c74adb  0x0 (nothing)    No
192.168.9.2      (None)           5ffd55a90ab6840  0x0 (nothing)    No       Rx: OPTIONS
192.168.9.2      mg0              531957ae47cc644  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              2bbee6ea143560c  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              04a635961a26c09  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              7feead1b439d3cf  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              25fea24a103abe3  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              2049d934570c8dd  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              65f64c8b3dbcedf  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              3ef441a1045464c  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              24d322d96d9a12e  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              5923ff3a0cd78c5  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              1e87f73c1ffe4bd  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              5acb2392618c9e2  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              091b66132d795dc  0x8 (alaw)       No       Rx: INVITE
192.168.128.179  (None)           b57369500b28102  0x0 (nothing)    No       Rx: REGISTER
192.168.9.2      mg0              4d47a06a6e94e6e  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              0e7e1fd901f1086  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              554b797b019a64c  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              2a4060d22adbdcc  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              47e1bd5d24869aa  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              09f6b1c924860ce  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              77622a737761a0a  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              10e68ee92d54453  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              27d11a4447425a8  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              7ddeb9a25ca52b8  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              238cc6197c664b2  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              6673c667292046f  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              0bbd6dfe5f07ee4  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              03e669db5505714  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              1cf2d2ca7e415e3  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              4f82336a54e761a  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              7f4cbc9f54db153  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              71fb63066775e3f  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              4148bd633aa61be  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              3cf5822643486b0  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              66eeeba57916b19  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              27aa9cae4180c12  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              7c418eb52e8f2cb  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              55d76f1574b17c7  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              183fed4c70e91a9  0x8 (alaw)       No       Rx: INVITE
192.168.9.2      mg0              0a07f95f04ad740  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              21557a351584e41  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              5a54cb1a3683111  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              62259767475a74b  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              083a73a15bed7f2  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              71fc595a0f0b60f  0x8 (alaw)       No       Tx: ACK
192.168.9.2      mg0              37e7f2490680bed  0x8 (alaw)       No       Rx: ACK
192.168.9.2      mg0              58da33f160e8c53  0x8 (alaw)       No       Tx: ACK
50 active SIP dialogs
mg0:~#



By: Elazar Broad (ebroad) 2010-02-04 13:57:09.000-0600

Do you have sip timers enabled?

By: Leif Madsen (lmadsen) 2010-02-04 14:15:59.000-0600

ebroad: also please remember not to move things out of the New state until I've had a chance to triage the issue. Thanks!

By: Marcin Kowalczyk (kowalma) 2010-02-04 14:44:31.000-0600

Yes. I do have session timers enabled as orginate

By: Leif Madsen (lmadsen) 2010-02-04 14:54:57.000-0600

What platform is this running on? (virtual?) -- is this on parallels?

By: Private Name (falves11) 2010-02-04 15:07:46.000-0600

it seems, indeed, like my issue. If lmadsen wants to control a production box of mine and install a probe or some trace version, please contact me. I imagine it is related to sip timers, but it needs to be verified.

By: Marcin Kowalczyk (kowalma) 2010-02-04 15:13:55.000-0600

I'm running Asterisk on regular server, not virtualized.

Debian 4.0 stable (from frozen own APT repo)
Kernel 2.6.27.8

By: Marcin Kowalczyk (kowalma) 2010-02-04 15:21:29.000-0600

I downgraded mg0 box to 1.6.1.12 as I have one box running that version and seems it's OK (but I don't have much traffic there - 10-15 calls at time).
In next 14h rush hours on mg0 start then I will be able to see if it solves problem or not.

By: Marcin Kowalczyk (kowalma) 2010-02-05 02:21:09.000-0600

I've disabled session-timers (refuse) on box doing routing logic as it crashed (running 1.6.1.13) again. Let's see if this solves problem

By: Marcin Kowalczyk (kowalma) 2010-02-05 15:19:16.000-0600

One more thing that popped-up and may be related to this problem. I've saw huge amount of Down channels

Asterisk-node0:~# rasterisk -x "core show channels" | grep -c Down
1207
Asterisk-node0:~#

and they do not react on soft hangup


Asterisk-node0:~# rasterisk -x "core show channel SIP/20033-000079c8"
-- General --
          Name: SIP/20033-000079c8
          Type: SIP
      UniqueID: 1265359356.32092
     Caller ID: 20033
   DNID Digits: 0207001800532311111111
      Language: pl
         State: Down (0)
         Rings: 0
 NativeFormats: 0x8 (alaw)
   WriteFormat: 0x8 (alaw)
    ReadFormat: 0x8 (alaw)
WriteTranscode: No
 ReadTranscode: No
1st File Descriptor: 875
     Frames in: 0
    Frames out: 0
Time to Hangup: 0
  Elapsed Time: 12h35m22s
 Direct Bridge: <none>
Indirect Bridge: <none>
--   PBX   --
       Context: out
     Extension: 0207001800532311111111
      Priority: 1
    Call Group: 0
  Pickup Group: 0
   Application: (N/A)
          Data: (None)
   Blocking in: (Not Blocking)
     Variables:
RTPAUDIOQOSRTT=Not available
RTPAUDIOQOSLOSS=lost=1;expected=1;
RTPAUDIOQOSJITTER=rxjitter=0.000000;
RTPAUDIOQOS=ssrc=1287011820;themssrc=0;rxjitter=0.000000;rxcount=0;txcount=0;
SIPCALLID=13e4f3d237526ced7bda6b612eb01126@192.168.20.33
SIPDOMAIN=sip.local
SIPURI=sip:20033@192.168.20.33

 CDR Variables:
level 1: channel=SIP/20033-000079c8
level 1: start=2010-02-05 09:42:36
level 1: duration=45322
level 1: billsec=0
level 1: disposition=NO ANSWER
level 1: amaflags=DOCUMENTATION
level 1: uniqueid=1265359356.32092

Asterisk-node0:~# rasterisk -x "core show channels" | grep -v Down
Channel              Location             State   Application(Data)
1207 active channels
0 active calls
95459 calls processed
Asterisk-node0:~#



By: Marcin Kowalczyk (kowalma) 2010-02-06 07:49:34.000-0600

1.6.1.13 + session timers = UDP port leak
1.6.1.13 without session timers looks OK
1.6.1.12 + session timers = OK

By: Alekzander (alekz) 2010-02-08 05:39:40.000-0600

I've issued the same problem on 1.6.1.14 with session-timers = refuse in sip.conf.
2.6.29-gentoo-r5

By: Marcin Kowalczyk (kowalma) 2010-02-08 12:25:17.000-0600

I can confirm that 1.6.1.13 with session-timers set to refuse is also affected.

By: Marcus Hunger (fnordian) 2010-02-10 07:55:56.000-0600

I can confirm this issue. In our tests, this happens when an unanswered call was hung up. Debugging showed an unmatched ao_ref causing the leak.

Here are two ao_ref-debug-traces, the first one shows the current behavior, the second shows the old one:

0x85df2e8 =1   chan_sip.c:6805:sip_alloc (allocate a dialog(pvt) struct)                                                                      
0x85df2e8 +1   chan_sip.c:6938:sip_alloc (link pvt into dialogs table) [@1]                                                                    
0x820f758 +1   chan_sip.c:4633:find_peer (ao2_find in peers_by_ip table) [@2]                                                                  
0x820f758 -1   chan_sip.c:2772:unref_peer (check_peer_ok: unref_peer: tossing temp ptr to peer from find_peer) [@3]                            
0x85df2e8 +1   chan_sip.c:6382:sip_new (sip_new: set chan->tech_pvt to i) [@2]                                                                
0x820f758 +1   chan_sip.c:4626:find_peer (ao2_callback in peers table) [@2]                                                                    
0x820f758 -1   chan_sip.c:2772:unref_peer (unref peer pointer from find_peer call in st_get_se (2)) [@3]                                      
0x820f758 +1   chan_sip.c:4626:find_peer (ao2_callback in peers table) [@2]                                                                    
0x820f758 -1   chan_sip.c:2772:unref_peer (unref peer pointer from find_peer call in st_get_mode) [@3]                                        
0x85df2e8 +1   chan_sip.c:3838:update_provisional_keepalive (Increment refcount to pass dialog pointer to sched callback) [@3]                
0x85df2e8 -1   chan_sip.c:21115:handle_request_do (throw away dialog ptr from find_call at end of routine) [@4]                                
0x820f758 +1   chan_sip.c:4626:find_peer (ao2_callback in peers table) [@2]                                                                    
0x820f758 -1   chan_sip.c:2772:unref_peer (unref_peer, from sip_devicestate, release ref from find_peer) [@3]                                  
0x85df2e8 -1   chan_sip.c:5690:sip_hangup (unref ast->tech_pvt) [@3]                                                                          
0x85df2e8 +1   chan_sip.c:3652:sip_scheddestroy (setting ref as passing into ast_sched_add for __sip_autodestruct) [@2]                        

0x85df2e8 +1   chan_sip.c:3531:__sip_reliable_xmit (__sip_reliable_xmit: setting pkt->owner) [@3]                                              
0x85df2e8 +1   chan_sip.c:7058:find_call (ao2_find in dialogs) [@4]                                                                            
0x85df2e8 -1   chan_sip.c:3730:__sip_ack (unref pkt cur->owner dialog from sip ack before freeing pkt) [@5]                                    
0x85df2e8 -1   chan_sip.c:21115:handle_request_do (throw away dialog ptr from find_call at end of routine) [@4]                                
0x820f758 +1   chan_sip.c:4626:find_peer (ao2_callback in peers table) [@2]                                                                    
0x820f758 -1   chan_sip.c:2772:unref_peer (unref_peer, from sip_devicestate, release ref from find_peer) [@3]                                  
0x85df2e8 +1   chan_sip.c:2820:dialog_unlink_all (Let's bump the count in the unlink so it doesn't accidentally become dead before we are done) [@3]
0x85df2e8 -1   chan_sip.c:2822:dialog_unlink_all (unlinking dialog via ao2_unlink) [@4]                                                        
0x85df2e8 -1   chan_sip.c:2877:dialog_unlink_all (Let's unbump the count in the unlink so the poor pvt can disappear if it is time) [@3]      
0x85df2e8 -1   chan_sip.c:3631:__sip_autodestruct (The ref to a dialog passed to this sched callback is going out of scope; unref it.) [@2]


0x82132c8 =1   chan_sip.c:6758:sip_alloc (allocate a dialog(pvt) struct)                                                                      
0x82132c8 +1   chan_sip.c:6891:sip_alloc (link pvt into dialogs table) [@1]                                                                    
0x8242ed0 +1   chan_sip.c:4624:find_peer (ao2_find in peers_by_ip table) [@2]                                                                  
0x8242ed0 -1   chan_sip.c:2771:unref_peer (check_peer_ok: unref_peer: tossing temp ptr to peer from find_peer) [@3]                            
0x82132c8 +1   chan_sip.c:6335:sip_new (sip_new: set chan->tech_pvt to i) [@2]                                                                
0x8242ed0 +1   chan_sip.c:4617:find_peer (ao2_callback in peers table) [@2]                                                                    
0x8242ed0 -1   chan_sip.c:2771:unref_peer (unref peer pointer from find_peer call in st_get_se (2)) [@3]                                      
0x8242ed0 +1   chan_sip.c:4617:find_peer (ao2_callback in peers table) [@2]
0x8242ed0 -1   chan_sip.c:2771:unref_peer (unref peer pointer from find_peer call in st_get_mode) [@3]
0x82132c8 +1   chan_sip.c:3829:update_provisional_keepalive (Increment refcount to pass dialog pointer to sched callback) [@3]
0x82132c8 -1   chan_sip.c:21039:handle_request_do (throw away dialog ptr from find_call at end of routine) [@4]
0x8242ed0 +1   chan_sip.c:4617:find_peer (ao2_callback in peers table) [@2]
0x8242ed0 -1   chan_sip.c:2771:unref_peer (unref_peer, from sip_devicestate, release ref from find_peer) [@3]
0x82132c8 -1   chan_sip.c:5656:sip_hangup (unref ast->tech_pvt) [@3]
0x82132c8 +1   chan_sip.c:3643:sip_scheddestroy (setting ref as passing into ast_sched_add for __sip_autodestruct) [@2]
0x82132c8 -1   chan_sip.c:3856:send_response (when you delete the provisional_keepalive_sched_id, you should dec the refcount for the stored dialog ptr) [@3]
0x82132c8 +1   chan_sip.c:3528:__sip_reliable_xmit (__sip_reliable_xmit: setting pkt->owner) [@2]
0x82132c8 +1   chan_sip.c:7011:find_call (ao2_find in dialogs) [@3]
0x82132c8 -1   chan_sip.c:3721:__sip_ack (unref pkt cur->owner dialog from sip ack before freeing pkt) [@4]
0x82132c8 -1   chan_sip.c:21039:handle_request_do (throw away dialog ptr from find_call at end of routine) [@3]
0x8242ed0 +1   chan_sip.c:4617:find_peer (ao2_callback in peers table) [@2]
0x8242ed0 -1   chan_sip.c:2771:unref_peer (unref_peer, from sip_devicestate, release ref from find_peer) [@3]
0x82132c8 +1   chan_sip.c:2817:dialog_unlink_all (Let's bump the count in the unlink so it doesn't accidentally become dead before we are done) [@2]
0x82132c8 -1   chan_sip.c:2819:dialog_unlink_all (unlinking dialog via ao2_unlink) [@3]
0x82132c8 -1   chan_sip.c:2874:dialog_unlink_all (Let's unbump the count in the unlink so the poor pvt can disappear if it is time) [@2]
0x82132c8 -1   chan_sip.c:3622:__sip_autodestruct (The ref to a dialog passed to this sched callback is going out of scope; unref it.) [@1]    
0x82132c8 **call destructor** chan_sip.c:3622:__sip_autodestruct (The ref to a dialog passed to this sched callback is going out of scope; unref it.)


You can see that trace two lacks the unref of send_response(). The function does not call dialog_unref, because p->provisional_keepalive_sched_id is already -1.

We walked through svn and found out, that revision 234533 introduced a change in sip_hangup that set the sched_id to -1. Reverting seems to fix the problem, but we haven't tested it fully. We also did not investigate in the issue of rev 234533, so I guess that the problem of 183 after hangup will occur after revert.

Hope that helps.



By: S Chernyshevych (wslch) 2010-02-16 03:33:31.000-0600

I can confirm that with Asterisk 1.6.2.2 the same problem:
WARNING[25170]: udptl.c:975 ast_udptl_new_with_bindaddr: No UDPTL ports remaining

OS: CentOS 5.4
no virtual server
with session timers

By: Armin Schindler (armin) 2010-02-16 06:29:06.000-0600

Maybe it is related, I have a similar problem with asterisk 1.4.28. As far as I can tell, with previous versions like 1.4.24 the problem didn't occur:
RTP UDP ports are still in use, 'sip show channels' still shows channels with 'INVITE' that aren't used any more and when that happens, chan_sip refuses to unload because of active channels (but there are no active channels for hours any more).

By: Armin Schindler (armin) 2010-02-18 03:46:17.000-0600

With version 1.4.28 it is easy to reproduce:
Start SIP client to register at asterisk, then disconnect that SIP phone from network. In the time the registration is still active in asterisk, call this phone. Asterisk will send INVITEs (of course with no answer), then hangup after about 30 seconds. The asterisk channels are released, but the sip channel for that "Init: INVITE" is not released.
If this is not related to this PR, we need to open a new one.

By: Armin Schindler (armin) 2010-02-18 15:10:19.000-0600

The issue with asterisk 1.4.28 seems to be fixed in 1.4.29. After updating I was not able to reproduce the error again. Sorry for the noise.

By: Leif Madsen (lmadsen) 2010-02-23 09:36:29.000-0600

OK, I'm leaving this open as I think 1.6.2.2 and 1.4.29 were released at about the same time (if not the same time) so should both work the same.

If someone hasn't tried the latest release candidates (1.6.1.15-rc2 and 1.6.2.3-rc2) that might be worth it as well to try out to see if the issue is already resolved.

By: S Chernyshevych (wslch) 2010-02-23 14:27:33.000-0600

I tried Asterisk 1.6.2.4 - the problem remained. Ports are not freed!!!

By: Alekzander (alekz) 2010-02-24 01:49:25.000-0600

I've tried Asterisk 1.6.1.16 - it steel remain.

By: Leif Madsen (lmadsen) 2010-02-24 09:56:06.000-0600

Please note that 1.6.2.4 and 1.6.1.16 are NOT new releases! They are security releases based on 1.6.2.2 and 1.6.1.14 per the release announcement. If you want to test the latest release candidates with changes beyond 1.6.2.2 or 1.6.1.14, you need to test 1.6.2.3-rc2 or 1.6.1.15-rc2.

Alternatively, you can test the latest code in the branches themselves by checking out from SVN:

svn co http://svn.asterisk.org/svn/asterisk/branches/1.6.2

-or-

svn co http://svn.asterisk.org/svn/asterisk/branches/1.6.1

By: Olle Johansson (oej) 2010-02-25 07:34:58.000-0600

Test the latest code and please enable sip history and try to get history of all the open channels. Also enable dumphistory so we can get the history of channels that close. THanks.

By: Private Name (falves11) 2010-02-25 07:55:09.000-0600

I can reproduce the issue in Trunk from last night, in less than a minute. I start a campaign, all the calls fail, get rejected on the far end. The UDP count goes to 1500. I stop the campaign, nothing else moves, and after one hour the same UDP count remains.
I can let the developers log into my box and do whatever. The campaign can be started and stopped remotely via the web. The same issue is in Trunk as it is in 1.6. 1.4 does not have any issue. I am desperate to start using 1.6, for I need a feature. Please contact me  via phone at nine five four, 444 seven four ohh eight

By: Private Name (falves11) 2010-02-26 12:59:59.000-0600

I tested Trunk from this morning ans still has the UDP leak. Any ideas? I need the features as a matter of business survival.

By: Olle Johansson (oej) 2010-02-26 13:01:34.000-0600

Falves11: I asked for "sip history" above. If it's a matter of business survival, there's paid support from Digium and many other sources.

By: Marcin Kowalczyk (kowalma) 2010-02-26 14:10:26.000-0600

Add following:
recordhistory=yes
dumphistory=yes

into sip.conf, make sure you have
debug => debug
uncommented in logger.conf.


Reload configuration and then rasterisk -x "sip set history on"


@oej - I've just enabled requested debugs on my MG box (trunk rev249231), now I have very low traffic but I beleive in next 10-12h I should have enough hanged ports.

By: Olle Johansson (oej) 2010-02-27 01:12:54.000-0600

Turn on debug channel in logger conf, turn on siphistory and dumphistory in sip.conf.

By: Private Name (falves11) 2010-02-28 15:44:01.000-0600

SVN-trunk-r249449 still has the issue. I am testing trunk, but 1.6.X is the same story. Once we find the bug, we can fix it in both.

By: Private Name (falves11) 2010-03-01 04:08:33.000-0600

This is the entire dialplan, which corresponds to the file full1.gz and deug1.gz

[general]
static=>yes
writeprotect=no
extenpatternmatchnew =yes
clearglobalvars=yes

[inbound]
exten =>_X.,1,Hangup(34)

By: Private Name (falves11) 2010-03-01 17:29:10.000-0600

SVN-branch-1.6.0-r249673 does bot have the UDP bug.
I want to use it in production, but it has another bug precisely in what I need to use. I need that the timer timerb be equal to 6000, and I set it at that value in sip.conf
timerb=6000

but when I type "sip show settings"  it shows 32000, the default value. Please advise.

By: Private Name (falves11) 2010-03-01 18:27:22.000-0600

SVN-branch-1.6.1-r249674 Also shows Timerb=32000 when I have 6000 in sip.conf.

By: Private Name (falves11) 2010-03-01 18:36:18.000-0600

SVN-branch-1.6.2-r249675  also has the same issue with timerb. Is there a workaround?

By: Private Name (falves11) 2010-03-01 20:23:42.000-0600

The mistery of timerb is solved. If it is lower than T1*64, the code sets it back  to the default. I changed T1 to 100, and timerb went to 6400. There is no way to force it, it seems. I think it is not correct.



By: Private Name (falves11) 2010-03-01 21:10:54.000-0600

I am stuck again. In SVN-branch-1.6.1-r249674, the function ${CHANNEL(recvip)} was eliminated and returns a null. The command "core show function CHANNEL confirms that this option, recvip, is gone, so how do I read the signalling source IP? Also, "peerip" is gone.

Mar  1 21:57:54] DEBUG[34273] pbx.c: return because scoreboard has a match with '/'--- _X.
[Mar  1 21:57:54] DEBUG[34273] pbx.c: returning an exact match-- _X.
[Mar  1 21:57:54] WARNING[34273] func_channel.c: Unknown or unavailable item requested: 'recvip'
[Mar  1 21:57:54] DEBUG[34273] pbx.c: Function result is '(null)'
[Mar  1 21:57:54] DEBUG[34273] pbx.c: Expression result is '1'
[Mar  1 21:57:54] DEBUG[34273] pbx.c: Launching 'GotoIf'

By: Leif Madsen (lmadsen) 2010-03-03 12:37:19.000-0600

falves11: per the mailing list, the ${CHANNEL(recvip)} never existed, and was in ${SIPCHANINFO(...)} or something like that.

This issue will be resolved as time and resources are available. As per oej, if this is something that is critical to your business, there are many ways to move this issue forward more quickly.

By: Maciej Krajewski (jamicque) 2010-03-12 17:22:39.000-0600

Does anyone know if 1.6.1.18 is free of UDP port leak bug?

By: Tilghman Lesher (tilghman) 2010-03-15 11:57:25

Patch uploaded against 1.6.1.

By: Private Name (falves11) 2010-03-15 15:43:36

The patch does work. I tested with and without the patch and in fact the patch keeps the ratio around 10 UDP sockets per open call, using SIP timers. Without the patch it just goes up and up.

Please feel free to merge it.

By: Maciej Krajewski (jamicque) 2010-03-16 04:59:47

I can confirm it. Patch works like a charm

By: Private Name (falves11) 2010-03-16 10:45:53

Question, when this patch is applied, will it be also applied to 1.6.2 and Trunk? All those versions need it.

By: mvesco (mvesco) 2010-03-19 05:06:26

I just tested 1.6.1.18 + the patch mentioned above and I still have the problem...

sip show settings shows me:
 Timer B:                32000

By: Private Name (falves11) 2010-03-22 09:16:11

I tested the patch against  SVN-branch-1.6.1-r252625M and it works.

By: Henning Holtschneider (hehol) 2010-03-23 03:16:55

I tested the patach against 1.6.1.16 and it solves the problem.

By: Danilo Lotina F. (dlotina) 2010-03-23 06:42:30

Tested on 1.6.1.14 and 1.6.1.18, it solves the issue.

By: Kenneth Van Velthoven (kvveltho) 2010-03-24 13:41:32

Concerning UDP's not freed, I've tested the patch on 1.6.1.18 (low load system).
At night when no more calls are active I still notice open UDP ports:
(extract from netstat -a)

udp 0 0 *:42755 *:*
udp 0 0 *:14596 *:*
udp 0 0 *:16132 *:*
udp 0 0 *:14597 *:*
udp 0 0 *:16133 *:*
udp 0 0 *:14474 *:*
udp 0 0 *:16522 *:*
udp 0 0 *:14475 *:*
udp 0 0 *:16523 *:*
udp 0 0 *:18444 *:*
udp 0 0 *:16524 *:*
udp 0 0 *:18445 *:*
udp 0 0 *:16525 *:*
udp 0 0 *:15886 *:*
udp 0 0 *:15887 *:*
udp 0 0 *:14994 *:*
udp 0 0 *:14995 *:*
udp 0 0 *:19862 *:*
udp 0 0 *:19863 *:*

Is this normal?   I just want to be sure before re-installing 1.6.1.18 on our heavy-load server (>100000 calls/day)

By: Tilghman Lesher (tilghman) 2010-03-24 14:01:05

What would be most helpful right now is for someone to get a packet capture of the exact SIP dialog that causes this leak to occur.  I'm having trouble reproducing it, which makes writing a regression test for it difficult.

By: mvesco (mvesco) 2010-03-26 06:49:06

I'm having trouble reproducing it as well. With sip debugging enabled, I was hoping to find some of the leaking RTP sockets in the SDP headers but that was not the case.

In addition, I wanted to point out that IMHO we clearly see two sources of leaking sockets. One that was recently fixed and one that shows up less often and does not affect that many users.

How to proceed?

By: mvesco (mvesco) 2010-03-30 11:37:46

asterisk: rev 253624 from 1.6.1 stable branch + 20100315__issue16774.diff applied

I did some experiments today and found a way to reproduce the problem with the help of sipp. Unfortunately, running tcpdump AND sipp at the same time makes the problem disappear. So the only thing I got is:

maier*CLI> core show channels verbose
Channel              Context              Extension        Prio State   Application  Data                      CallerID        Duration Accountcode BridgedTo          
0 active channels
0 active calls
2274 calls processed
maier*CLI> sip show channels
Peer             User/ANR         Call ID          Format           Hold     Last Message    Expiry
0 active SIP dialogs
maier*CLI> ! lsof -np $(pidof asterisk) | grep UDP | wc -l
19
maier*CLI> ! lsof -np $(pidof asterisk) | grep UDP
asterisk 19207 root   10u  IPv4           18523876                UDP *:sip
asterisk 19207 root 1928u  IPv4           18538844                UDP *:18820
asterisk 19207 root 2132u  IPv4           18538845                UDP *:18821
asterisk 19207 root 2133u  IPv4           18538846                UDP *:4559
asterisk 19207 root 3410u  IPv4           18538436                UDP *:15898
asterisk 19207 root 3411u  IPv4           18538437                UDP *:15899
asterisk 19207 root 3412u  IPv4           18538438                UDP *:4313
asterisk 19207 root 3708u  IPv4           18538849                UDP *:16814
asterisk 19207 root 3709u  IPv4           18538850                UDP *:16815
asterisk 19207 root 3710u  IPv4           18538851                UDP *:4157
asterisk 19207 root 3713u  IPv4           18538855                UDP *:17768
asterisk 19207 root 3714u  IPv4           18538856                UDP *:17769
asterisk 19207 root 3715u  IPv4           18538857                UDP *:4992
asterisk 19207 root 3718u  IPv4           18538863                UDP *:18890
asterisk 19207 root 3719u  IPv4           18538864                UDP *:18891
asterisk 19207 root 3720u  IPv4           18538865                UDP *:4809
asterisk 19207 root 3723u  IPv4           18538869                UDP *:15996
asterisk 19207 root 3724u  IPv4           18538870                UDP *:15997
asterisk 19207 root 3725u  IPv4           18538871                UDP *:4507

=> 18 leaking UDP (RTP) sockets

By: snuffy (snuffy) 2010-03-30 14:10:36

If you have a sipp test that can re-create it, please upload it to the issue.

By: mvesco (mvesco) 2010-03-31 11:30:38

Please find scenarios for uas and uac in sipp_scenario.tar.gz. I am running the test with:

uas: sipp -sf scenarios/uas.xml -p 5061 -mi $(hostname -f)
uac: sipp -sf scenarios/uacSimple.xml -inf simpleCalls.csv -l 1000 -r 100 -rp 10 asterisk.domain.ltd

I am sorry I cannot provide simpleCalls.csv file but I am convinced you'll find out what you need for further testing.

After about 20 seconds I'll stop the uac since I assume that the leaking UDP sockets show up for canceled SIP sessions that were just about to being established... but that is just guess.



By: Tilghman Lesher (tilghman) 2010-03-31 13:49:30

Actually, if you'll check the reviewboard link, you'll find that there's now a repeatable scenario attached there.

By: Henning Holtschneider (hehol) 2010-04-06 14:21:03

After 10 days of running Asterisk with the patch, I'm not so sure if it really fixes all UDP port leaks:

asterisk*CLI> core show uptime
System uptime: 1 week, 3 days, 21 hours, 32 minutes, 47 seconds
Last reload: 1 week, 3 days, 21 hours, 32 minutes, 47 seconds
asterisk*CLI> core show calls
43 active calls
128797 calls processed

But:

root@asterisk:~# lsof -i | grep asterisk | grep "UDP " | wc -l
5320

Can anyone else who applied the patch confirm or deny the port leak after a longer uptime?

By: Private Name (falves11) 2010-04-06 14:24:36

I am running with the patch and I just had to reboot the box after a few days. Even "core restart now" would not work.
So it seems to work, or there is some other leak unplugged.

By: Digium Subversion (svnbot) 2010-04-13 14:17:49

Repository: asterisk
Revision: 257191

U   trunk/channels/chan_sip.c

------------------------------------------------------------------------
r257191 | tilghman | 2010-04-13 14:17:49 -0500 (Tue, 13 Apr 2010) | 10 lines

Also unref the pvt when we delete the provisional keepalive job.

(closes issue ASTERISK-15581)
Reported by: kowalma
Patches:
      20100315__issue16774.diff.txt uploaded by tilghman (license 14)
Tested by: falves11, jamicque

Review: https://reviewboard.asterisk.org/r/591/

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=257191

By: Digium Subversion (svnbot) 2010-04-13 14:20:43

Repository: asterisk
Revision: 257206

_U  branches/1.6.0/
U   branches/1.6.0/channels/chan_sip.c

------------------------------------------------------------------------
r257206 | tilghman | 2010-04-13 14:20:43 -0500 (Tue, 13 Apr 2010) | 17 lines

Merged revisions 257191 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

........
 r257191 | tilghman | 2010-04-13 14:17:48 -0500 (Tue, 13 Apr 2010) | 10 lines
 
 Also unref the pvt when we delete the provisional keepalive job.
 
 (closes issue ASTERISK-15581)
  Reported by: kowalma
  Patches:
        20100315__issue16774.diff.txt uploaded by tilghman (license 14)
  Tested by: falves11, jamicque
 
 Review: https://reviewboard.asterisk.org/r/591/
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=257206

By: Digium Subversion (svnbot) 2010-04-13 14:20:54

Repository: asterisk
Revision: 257208

_U  branches/1.6.1/
U   branches/1.6.1/channels/chan_sip.c

------------------------------------------------------------------------
r257208 | tilghman | 2010-04-13 14:20:54 -0500 (Tue, 13 Apr 2010) | 17 lines

Merged revisions 257191 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

........
 r257191 | tilghman | 2010-04-13 14:17:48 -0500 (Tue, 13 Apr 2010) | 10 lines
 
 Also unref the pvt when we delete the provisional keepalive job.
 
 (closes issue ASTERISK-15581)
  Reported by: kowalma
  Patches:
        20100315__issue16774.diff.txt uploaded by tilghman (license 14)
  Tested by: falves11, jamicque
 
 Review: https://reviewboard.asterisk.org/r/591/
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=257208

By: Digium Subversion (svnbot) 2010-04-13 14:21:06

Repository: asterisk
Revision: 257210

_U  branches/1.6.2/
U   branches/1.6.2/channels/chan_sip.c

------------------------------------------------------------------------
r257210 | tilghman | 2010-04-13 14:21:05 -0500 (Tue, 13 Apr 2010) | 17 lines

Merged revisions 257191 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

........
 r257191 | tilghman | 2010-04-13 14:17:48 -0500 (Tue, 13 Apr 2010) | 10 lines
 
 Also unref the pvt when we delete the provisional keepalive job.
 
 (closes issue ASTERISK-15581)
  Reported by: kowalma
  Patches:
        20100315__issue16774.diff.txt uploaded by tilghman (license 14)
  Tested by: falves11, jamicque
 
 Review: https://reviewboard.asterisk.org/r/591/
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=257210

By: Digium Subversion (svnbot) 2010-05-11 14:35:37

Repository: asterisk
Revision: 262418

U   branches/1.6.0/channels/chan_sip.c

------------------------------------------------------------------------
r262418 | tilghman | 2010-05-11 14:35:37 -0500 (Tue, 11 May 2010) | 8 lines

Backport fix for issue ASTERISK-15581 to 1.6.0 (refcount leak with provisional responses)

(closes issue ASTERISK-15097)
Reported by: jsutton
Patches:
      20100504__issue16202.diff.txt uploaded by tilghman (license 14)
Tested by: tilghman

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=262418