[Home]

Summary:ASTERISK-14576: [patch] Asterisk runs out of sockets
Reporter:Private Name (falves11)Labels:
Date Opened:2009-07-31 17:12:09Date Closed:2010-01-27 18:50:32.000-0600
Priority:MajorRegression?No
Status:Closed/CompleteComponents:Core/Netsock
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 20091209__issue15627__1.4.diff.txt
( 1) 20091209__issue15627__1.6.0.diff.txt
( 2) bug15714_1.4.diff
( 3) bug15714_ver1.6.2.diff
( 4) bugtrace.txt
( 5) chan_sip.c.rej
( 6) chan_sip.c-1.6.2.0-beta4.diff
( 7) crash_asterisk-1x.txt
( 8) issue_15627_1.6.1.patch
Description:The Parallels engineers have found a bug that takes down asterisk because the server runs out of sockets, and also it degrades the performance because over time it takes more and more time for the processor to find an empty socket. The load on the processor grows over time,


****** ADDITIONAL INFORMATION ******

This is what they saw:

"As i can see, the number of UDP sockets are more or less permanently grows:

cat /proc/net/sockstat
sockets: used 90645
TCP: inuse 1242 orphan 0 tw 117 alloc 1322 mem 150
UDP: inuse 88881 mem 6744
RAW: inuse 0
FRAG: inuse 0 memory 0



Processors again spend more and more time to get need socket out from 86366 "inuse".However the traffic not growing and much less now:

[root@host-208 ~]# vzstat -l
 5:23pm, up  3:17,  5 users, load average: 24.79, 25.45, 18.39
CTNum 23, procs 1762: R  22, S 1731, D   0, Z   0, T   0, X   9
CPU [ OK ]: CTs 100%, CT0   0%, user  33%, sys   8%, idle  60%, lat(ms)  85/0
Mem [ OK ]: total 128737MB, free 124755MB/0MB (low/high), lat(ms) 7/2
 ZONE0 (DMA): size 9MB, act 0MB, inact 0MB, free 10MB (0/0/0)
 ZONE1 (DMA32): size 2986MB, act 0MB, inact 0MB, free 2789MB (1/1/1)
 ZONE2 (Normal): size 126000MB, act 2756MB, inact 528MB, free 121955MB (43/54/65)
 Mem lat (ms): A0 0, K0 5, U0 0, K1 7, U1 0
 Slab pages: 437MB/437MB (ino 117MB, de 43MB, bh 5MB, pb 35MB)
Swap [ OK ]: tot 2055MB, free 2055MB, in 0.000MB/s, out 0.000MB/s
Net [ OK ]: tot: in  3.275MB/s 15778pkt/s, out  1.876MB/s 8706pkt/s
            lo: in  0.000MB/s    0pkt/s, out  0.000MB/s    0pkt/s
          eth0: in  0.000MB/s    0pkt/s, out  0.000MB/s    0pkt/s
          eth1: in  0.000MB/s    0pkt/s, out  0.000MB/s    0pkt/s
          eth2: in  3.275MB/s 15778pkt/s, out  1.876MB/s 8706pkt/s
          eth3: in  0.000MB/s    0pkt/s, out  0.000MB/s    0pkt/s
          sit0: in  0.000MB/s    0pkt/s, out  0.000MB/s    0pkt/s
    veth9102.0: in  0.000MB/s    0pkt/s, out  0.000MB/s    0pkt/s
    veth9106.0: in  0.000MB/s    0pkt/s, out  0.000MB/s    0pkt/s
    veth9100.0: in  0.000MB/s    0pkt/s, out  0.000MB/s    0pkt/s
    veth9111.0: in  0.000MB/s    0pkt/s, out  0.000MB/s    0pkt/s
Disks [ OK ]: in 0.000MB/s, out 0.480MB/s

vztop -n2 -d3
vztop - 17:26:11 up  3:19,  5 users,  load average: 20.27, 23.55, 18.61
Tasks: 902 total,   1 running, 901 sleeping,   0 stopped,   0 zombie
Cpu(s): 13.4% us,  3.3% sy,  0.0% ni, 27.6% id,  0.0% wa,  0.0% hi, 55.7% si
Mem:  131826772k total,  4050364k used, 127776408k free,   111156k buffers
Swap:  2104472k total,        0k used,  2104472k free,  1559484k cached

CTID   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
9112 39281 root      18   0  584m 232m  11m S  184  0.2 104:47.71 asterisk
9100 39326 root      18   0  581m 209m  11m S  181  0.2  94:24.69 asterisk
9102 39709 root      18   0  577m 207m  11m S  171  0.2  89:55.23 asterisk
9113 39355 root      18   0  578m 212m  11m S  166  0.2  95:13.56 asterisk
9101 16934 root      18   0  595m 212m  11m S  156  0.2  86:46.97 asterisk
9105 39340 root      18   0  407m  70m  11m S   73  0.1  30:09.99 asterisk
9103 39486 root      18   0  402m  66m  11m S   56  0.1  27:52.46 asterisk
9110 39535 root      18   0  401m  65m  11m S   44  0.1  28:23.30 asterisk
9104 39686 root      18   0  396m  65m  11m S   40  0.1  26:59.71 asterisk
9106 39485 root      18   0  396m  63m  11m S   38  0.0  27:29.80 asterisk
9108 39366 root      18   0  405m  70m  11m S   31  0.1  27:43.83 asterisk
   0 39220 root      15   0 11188 1700  824 R    1  0.0   0:00.08 vztop
   0 51361 root      16   0 11188 1712  816 S    1  0.0   1:15.12 vztop
9114 39062 root      15   0 70824  28m  28m S    1  0.0   0:27.20 opensips
9107 39259 root      18   0  277m  17m  10m S    1  0.0   0:08.53 asterisk
9114 39060 root      15   0 70824  28m  28m S    0  0.0   0:27.18 opensips
   0     1 root      15   0 10344  676  568 S    0  0.0   0:02.85 init
   0     2 root      RT  -5     0    0    0 S    0  0.0   0:00.02 migration/0/0


i guess probably asterisk might leave some sockets open instead of closing them after end of calls/meetings/etc.

It looks very much like this bug:

https://issues.asterisk.org/view.php?id=14253

except that we don't have voice mail. All we do is SIP to SIP, 100% of the time.

I need to use 1.6.2 because I rely on Sip Timers. Can somebody look at this issue. My business may go out of business over this bug.
Comments:By: Private Name (falves11) 2009-07-31 17:38:55

From Parallels
when the load average begin to grow, the situation was absolutely different:
2) all CPUs spend many time in "si"(soft interrupts) - 50%-60% of their power they spent trying to find needed socket among 80+ thousands of them.
  At the save time the traffic was no so high - 2-3MB/s "in" and something similar "out".

So the problem in the second situation is not the inability to handle received packets - which can be solved by adding another network card, but the HUGE number UDP sockets _in use_ - so many of them that even extended hash table does not solve it for 100%.
So the question is - why there are so many sockets that are IN USE.
i guess (i cannot be sure cause i don't know internal Asterisk structure) that many of those sockets should have been closed, but were not closed by asterisk for some reason.

So i think it might help - please, ask Asterisk people to check the server and tell they thoughts if so many sockets are normal situation or not.

By: Private Name (falves11) 2009-07-31 18:03:04

I realize that I use Sip timers in my sip.conf:
sip.conf
session-timers=refuse
session-expires=3600
session-minse=90
session-refresher=uac

in each one of my peers I use:
session-timers=originate
session-expires=1800
session-minse=90
session-refresher=uas

So maybe the Sip Timers technology is leaking.
Is it possible to set a trace for a minute or so and see what section is allocation UDP sockets and after the call is finished which ones are open?

By: Russell Bryant (russell) 2009-08-01 05:51:26

I understand that when you have bugs, it is important to you to resolve them.  However, you have really got to stop putting stuff like this in your bug reports:

"Can somebody look at this issue. My business may go out of business over this bug."

It is completely inappropriate for this forum.

By: Private Name (falves11) 2009-08-01 05:57:55

Ok, thanks. I take note.

By: Kristijan Vrban (vrban) 2009-08-01 07:36:58

Are you using SIP over TCP?

You can use the test tool sipp (http://sipp.sourceforge.net)
"sipp uac -i <your-ip> -t t1 <asterisk-ip> -s 78 -trace_err -d 1s"
to test it with TCP and:
"sipp uac -i <your-ip> -t u1 <asterisk-ip> -s 78 -trace_err -d 1s"
with UDP, to see if there is a difference. Also with activ SIP-timer and disabled SIP-timer to narrow down this.

You said that 1.6.2-r197190 does not have that issue. then try the revision in the middel between 197190 and 209626 -> 203408 test this, if you have also the same error, then middel between 203408 and 197190 -> 200299... need some time this work, but then you find the commit that introduced this issue.

By: Private Name (falves11) 2009-08-01 10:03:22

Actually now that I think about it, SIP timers has nothing to do with this. If I disable full-media-proxy by including this in my sip.conf:
directrtpsetup=yes
canreinvite=nonat

the issue is non-existent, but SIP Timers are active. The issue of the UDP sockets happens only when I change my sip.conf to:
directrtpsetup=no
canreinvite=no

However, if I try one single call the amount of UDP sockets goes up and down correctly. I imagine there is special kind of call that never finishes "naturally" and thus the SIP Timers or my RTP timeout close it, but then the sockets never get released.
Any ideas where to go from here?

By: Private Name (falves11) 2009-08-01 15:28:04

This problem seems identical to bug:
https://issues.asterisk.org/view.php?id=9235

I wonder if version 1.6.X did get the same patches.

By: Private Name (falves11) 2009-08-04 14:33:42

All versions of Asterisk leak UDP socket handles, from the Business Edition, 1.4 and 1.6.2. I tested them all. Additionally, the 1.4 svn crashes and I just uploaded a trace.

By: Dmitry Andrianov (dimas) 2009-08-05 17:52:21

Isn't stating that ALL versions of Asterisk leak handles a bit too much?
I have 1.4.2X systems with
System uptime: 13 weeks, 11 hours, 54 minutes, 43 seconds
and
# cat /proc/net/sockstat
sockets: used 137
TCP: inuse 15 orphan 0 tw 0 alloc 17 mem 8
UDP: inuse 14
RAW: inuse 0
FRAG: inuse 0 memory 0

The load to this system is very light (like 20 calls a day) but still it is clear it is not leaking udp sockets. So this can be specific to your environment that even 1.4 leaks. Can you reproduce your problem on clean asterisk with one-by-one calls? Like you make call, hangup, and inuse sockets increase. Then it stays more or less the same and then you make next call and number increases again?

By: Private Name (falves11) 2009-08-08 13:35:48

After processing 4500 calls and having only 4 calls open, I have 120 UDP sockets taken. Please look at the relationship, 30 sockets per open call. It means that the sockets are not been freed. This is version 1.6.2 from today.
SVN-branch-1.6.2-r211122. But I installed 1.4 SVN and the result was similar. I need to use 1.6.2 because of the Sip Timers. I am not using H323, only SIP to SIP, nothing else.

By: Dmitry Andrianov (dimas) 2009-08-08 18:47:25

Well, 120 sockets after 4500 calls means not each call leaks a socket.
If you can find what calls are causing this (like a call with video on SDP offer, or with T38 or a cancelled call - anything) then someone will easily fix the bug. Otherwise it is almost impossible unless someone can reproduce exactly the same problem....

By: Private Name (falves11) 2009-08-09 00:01:44

My socket count gets to 80.000 on a single day and the server collapses. I process over a million of calls per day. If the culprit cannot be found, then Asterisk should provide a "core clean sockets" function that will free any sockets not in use by any object. I would run htat very hour and the system would be stable. I noticed that Asterisk constantly drops calls because RTP timeout, which I set to 30 seconds. This could be the issue, the leaking mechanism. In any case we could create a version that logs what function allocates what socket number, and at the end of the day we could print "lsof -i | grep UDP" and see what sockets are still alive. I don't have any calls after 8 PM. Or I could change the inbound bindport for SIP and wait until no more calls are alive (my calls are very short), and then compare what sockets are still open. But how do I get a log of what function name allocates what socket number? Can somebody provide a patch that stores to a file the name of the function and the UDP port allocted?

By: Private Name (falves11) 2009-08-09 00:18:17

lsof -i | grep  UDP | wc -l
97
asterisk -rx "core show hannels"                                                                
Channel              Location             State   Application(Data)            
0 active channels
0 active calls
6813 calls processed

By: Private Name (falves11) 2009-08-10 23:17:08

Asterisk SVN-branch-1.4-r211112 has the same issue. At the end of the day, with no calls open or in process, the count of sockets in use as reported by linux is high. I suggest a new CLI command be created to cleanup any sockets that are not attached to any open objects. For example: "core release sockets", which will somehow check with Linux, get the list of open sockets open by the same process and tell Linux to release the ones that are not attached to any object. Is this doable? The other option is to log the function that opens and closes each socket and at the end of the day we could determine what happened.

By: Dmitry Andrianov (dimas) 2009-08-11 06:12:24

There is no way to release "unised" sockets. If Asterisk knew the socket is unused, it would call close() on it at the first place.

If I were you I would try:
1. running asterisk with core set debug 1
2. collecting its output
3. getting list of UDP ports inuse by asterisk
4. comparing this list with the output.

the idea is that when debug is turned on, Asterisk generats log lines like:

Audio is at %s port %d\n
T.38 UDPTL is at %s port %d\n
Video is at %s port %d\n

so you will at least know what ports are stuck - video/audio/udptl.



By: Private Name (falves11) 2009-08-11 06:14:59

I run literally hundreds of thousands of calls per day. The idea is unfeasible.

By: Dmitry Andrianov (dimas) 2009-08-11 06:20:59

There is no need to collect the output for the whole day.
If you state Asterisk leaks 80000 sockets a day, it is about 500 seockets in 10 minutes, 50 sockets a minute...

By: Private Name (falves11) 2009-08-11 07:11:46

how is that you collect the output to a file?

By: Dmitry Andrianov (dimas) 2009-08-11 07:26:15

edit /etc/asterisk/logger.conf

By: Private Name (falves11) 2009-08-11 13:47:50

lsof -i | grep asterisk | wc -l
6958

I figured out what happens. I have only 125 open calls. So my "core show channels" has 250 lines. But my "sip show channels" has more than 3000 lines like this, all identical.

208.37.201.149   0015311773  1bc3fea13c6  00102/00000  0x0 (nothing)    No       Init: INVITE              
208.37.201.149   0015311937  7091da8147d  00102/00000  0x0 (nothing)    No       Init: INVITE              
208.37.201.149   0015311804  5ab9031713c  00102/00000  0x0 (nothing)    No       Init: INVITE              
208.37.201.149   0015311209  5c23b2205e2  00102/00000  0x0 (nothing)    No       Init: INVITE              
208.37.201.149   0015311720  5aab76181da  00102/00000  0x0 (nothing)    No       Init: INVITE              
208.37.201.149   0015311941  09605f3b1fa  00102/00000  0x0 (nothing)    No       Init: INVITE

I wander what does it mean and how can it be fixed.

By: Private Name (falves11) 2009-08-11 14:15:27

I realize that that carrier was down. The question is if they disappear from the network, how long does it take for Asterisk to realize it and return a timeout, and how can I control that. I have now over 5000 SIP dialogs and only 100+ calls.

By: Private Name (falves11) 2009-08-12 01:01:16

This bug that is actually taking me down daily.
Suppose you have a carrier that is out of the network, simply disappeared. His connection is down. Well, Asterisk sends an INVITE 5 times, and then  it keeps for ever the dialog open in "sip show channels", although there is nothing in "show channels". Furthermore, it keeps for ever the UDP ports taken from the OS, and thus it can take down the whole server because if you have thousands of calls going through, soon you will exhaust the Linux UDP port space. It happened to me yesterday.
It is very easy to reproduce. Just sent an INVITE to my home address which is down, 72.187.243.97, and do a "sip set debug ip 72.187.243.97". Then after you see the INVITE being retransmitted 5 times, simply hang-up. You may then issue a "sip show channels" and a "show channels". Then exit Asterisk and type "lsof -i | grep UDP". Those ports stay open for ever.
Asterisk must kill the dialog and release the resources taken by the OS or else the finite state machine is bound to go down and take down the OS at some point.

My is 1.4 SVN, so I think every version is affected.

By: Private Name (falves11) 2009-08-14 04:24:50

Please find attached the log that was requested. I sent a SIP call to an IP address that is down. As you may see, Asterisk sends the same INVITE 5 times and then the call is rejected. That is fine, except that if you look at the end, when I type "sip show channels", there is a dialog open

sip show channels                                                                                                                                                        
Peer             User/ANR    Call ID          Format           Hold     Last Message    Expiry
86.100.217.35    1954444740  OGUyMDc4OWQxMDQ  0x0 (nothing)    No       Rx: ACK                  
72.187.243.97    1256428616  3077f28a76e6e01  0x0 (nothing)    No       Init: INVITE              
2 active SIP dialogs        

The second line shows a dialog that lingers on for ever, and also if you look in Linux with "lsof -i | grep UDP", it shows that two ports get taken and are never returned to the Linux pool, or reused,

This is what happens to me daily. I have a carrier that goes down and since I have many, many calls, I exhaust the UDP pool and the server needs to be recycled.

By: dant (dant) 2009-08-14 05:29:04

I raised bug 0015716 and submitted a patch for lingering INVITEs in trunk, may be related, however, in trunk on looking at lsof nothing is left behind there...

By: Private Name (falves11) 2009-08-14 05:34:28

If can you generate a patch for 1.4 and 1.6, I can test it. Maybe it closes the UPD ports as well. In my test machine I have 1.6.2 loaded. If it works we can test the other versions. I think this is the bug. Congratulations.

By: dant (dant) 2009-08-14 05:58:53

Actually, reverted my patch from my trunk install and double checked... the RTP port and RTP port+1 are being left behind as you described for each hung dialog. This would suggest that these are related if not the exact same issue.

My patch triggers the destruction of the dialog and these sockets disappear too... The code that caused this wasn't present when checked in 1.4, so either it's been added recently, or this shouldn't be a problem in 1.4...

By: Private Name (falves11) 2009-08-14 06:03:18

It happens in 1.4 and 1.6.2. What you are saying is that the patch cannot be ported back to 1.4. Can a Bug Marshall please step in?

By: dant (dant) 2009-08-14 06:13:07

I've updated the patch for 1.6.2.0-beta4 and uploaded here... I'll look at 1.4 later...

By: Private Name (falves11) 2009-08-14 06:24:47

Tha patch failed for my version. When I type "core show version" I get
SVN-branch-1.6.2-r212162. I uploaded the rejects.

By: snuffy (snuffy) 2009-08-14 07:02:15

falves: try the ones i just put up.. they should patch fine
(yes i know the names are wrong but fix later)

By: Private Name (falves11) 2009-08-14 08:18:01

I put the 1.4 in production and tested with one call the 1.6.2 version, which successfully kills the dialogs and the ports.

Let's wait for today's traffic. I hope the Digium Bug Marshalls add this patch to the business edition.
Many thanks.

By: Private Name (falves11) 2009-08-14 11:29:44

I got a couple of crashes but had not compiled it with debug info. I just did that. If it happens again I will upload  the trace. It may be related to the patch because it was very stable so far. A Bug Marshall should check it.

By: Private Name (falves11) 2009-08-15 01:00:02

The patch works perfectly. Please put into the SVN code. It did not miss a single UDP port in over a million calls.

By: Leif Madsen (lmadsen) 2009-08-17 08:09:52

Changing status from "block" to something more appropriate.

By: Private Name (falves11) 2009-08-20 01:39:41

There is still an issue relate to this patch. Some ACK's are never killed like those INVTE's. For example, right now it is 2:30 AM and there has not been any calls since a few hours ago, but I see dozens of lines  like this  this when I type "sip show channels"
208.78.161.210   3864922293  790a13246b2  00103/00102  0x0 (nothing)    No       Tx: ACK

By: Kristijan Vrban (vrban) 2009-10-14 08:44:43

hello, i can confirm this issue with 1.4.47-RC2:

sip show channels show hundreds of this
...
xx4.54.82.156     (None)      630871ae72c  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      087cc7af54d  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      58646ede376  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      00eff63964a  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      6e9e525a515  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      622500aa5e3  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      6745ebb944d  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      6455da20338  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      53ffeb6625f  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      34fcf3d00bc  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      706d4ff20c2  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      27dcda68652  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      375ddeb64ec  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      58ebd5ad0fb  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      17340f296d2  00102/00000  0x0 (nothing)    No       Init: INVITE
xx4.54.82.156     (None)      659277b8082  00102/00000  0x0 (nothing)    No       Init: INVITE
...

And i have hundreds of never closed/open udp/rtp ports.

The patch  bug15714_1.4.diff fixed the issue with 1.4.47-RC2

By: David Brillert (aragon) 2009-10-14 09:12:07

vrban: I assume you meant 1.4.27rc2
Also bug 15714 seems completely unrelated...
Are you sure about the patch name?
Can you post a link to the actual patch?

By: Kristijan Vrban (vrban) 2009-10-14 12:55:14

@aragon
yes, 1.4.27-RC2... not 1.4.47-RC2 :)

The patch you can see in the header of this page at "Attached Files" Not hard to find.

By: David Brillert (aragon) 2009-10-14 12:57:30

hmmm

strange that this bug report ASTERISK-14576 would have an attached patch that does not match bug ID
bug15714_1.4.diff I'd expect to see something like bug15627_1.4.diff



By: Kristijan Vrban (vrban) 2009-10-28 06:45:55

can this patch admitted into svn? after everybody confirmed that the patch fixed the issue.

By: Leif Madsen (lmadsen) 2009-10-28 08:57:31

It is marked as Ready for Review, and will be reviewed (and potentially committed) as soon as time allows.

By: Private Name (falves11) 2009-12-08 19:34:21.000-0600

The patch is too old and it fails to work in the current SVN, today December 8th. Since the patch was posted on 2009-11-06, it has been more than a month. The SVN code is already too far away from this patch. I cannot explain why the code has not been added to the main 1.4 tree. Please, kindly revisit the patch and repost something that can be applied today.

By: Tilghman Lesher (tilghman) 2009-12-09 00:00:32.000-0600

I've modified the patch slightly.  It's the same general idea, except that it will prevent timeout looping for ANY dialog, not just an INVITE.  Given that this is not the first time that we've looked at the potential for looping forever, I'd like to get any remaining edge cases out of the way at the same time.

By: dant (dant) 2009-12-09 00:32:18.000-0600

Tested new patch from tilghman on 1.6.1.11. Appears to work correctly using my previous test case of establish call to an unused IP address and cancel.

History:
1. NewChan         Channel SIP/10.2.9.252-00000002 - from 09093f8413b2db0c6de780f7
2. TxReqRel        INVITE / 102 INVITE - INVITE
3. ReTx            1000 INVITE sip:10.2.9.252 SIP/2.0
4. ReTx            2000 INVITE sip:10.2.9.252 SIP/2.0
5. Cancel          Cause Unknown
6. SchedDestroy    32000 ms
7. DELAY           Not sending cancel, waiting for timeout
8. ReliableXmit    timeout
9. ReliableXmit    timeout

After 32 seconds, at 8. in history 'Last Message' becomes Tx: CANCEL (autodestruct)
After a further 32 seconds at 9. in history Need Destroy flag is set
After a further 32 seconds the dialogue is really destroyed...

Edit to correct some nonsense English above, sorry...



By: Private Name (falves11) 2009-12-09 10:22:23.000-0600

I think that the patch may need additional work. I have 95 open calls, and 381 lines in "Sip Show channels", mostly like below. Please notice that all IPs 38.X.X.X are the same IP. The "lsof - i" count is 1114. It seems we are still leaving the dialogs open beyond their useful life.

38.X.X.X      1561900136  039e5ae1200  00102/00000  0x0 (nothing)    No       Tx: CANCEL (aut          
38.X.X.X      1864233049  64d8e34a0a2  00102/00000  0x0 (nothing)    No       Init: INVITE              
38.X.X.X      1561900136  57f1d06d7f7  00102/00000  0x0 (nothing)    No       Tx: CANCEL (aut          
38.X.X.X      1561900139  4c305c6e613  00102/00000  0x0 (nothing)    No       Tx: CANCEL (aut          
38.X.X.X      1607936602  4fca0f33551  00102/00000  0x0 (nothing)    No       Tx: CANCEL (aut          
38.X.X.X      1561900139  46c7e6740dd  00102/00000  0x0 (nothing)    No       Tx: CANCEL (aut          
38.X.X.X      1561900138  75d18ccc09d  00102/00000  0x0 (nothing)    No       Tx: CANCEL (aut          
38.X.X.X      1303937086  66a498463ad  00102/00000  0x0 (nothing)    No       Tx: CANCEL (aut          
38.X.X.X      1561900136  6f3634090a2  00102/00000  0x0 (nothing)    No       Tx: CANCEL (aut          
38.X.X.X      1561900136  38f1a442444  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1561900136  6e35a8ff279  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
64.X.X.X.X    1410229600  15cff42849c  00102/00000  0x0 (nothing)    No       Tx: CANCEL                
38.X.X.X      1561900136  47eb528900c  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1561900139  73087185454  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
208.X.X.X.X   8008188424  2d4a5ed89bb  00101/00501  0x0 (nothing)    No       Rx: ACK                  
38.X.X.X      1402660484  1c7284be7ce  00102/00000  0x0 (nothing)    No       Tx: CANCEL (aut          
38.X.X.X      1561900139  02a1c5e2360  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1561900137  3b9a32f725b  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1561900136  3e2504031be  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1561900136  4481dca7190  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1814943682  089b3a6f3c2  00102/00000  0x0 (nothing)    No       Tx: CANCEL (aut          
38.X.X.X      1561900139  31d4c0946a4  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1561900139  7233223d0be  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1561900138  0df79641545  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1561900137  23210828386  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1561900137  03352c58303  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1814943689  1ee65d5103c  00102/00000  0x0 (nothing)    No       Tx: CANCEL (aut          
38.X.X.X      1864242343  3ec9c118304  00102/00000  0x0 (nothing)    No       Tx: CANCEL (aut          
64.X.X.X.X    1646736937  4ca10ef53ff  00102/00000  0x100 (g729)     No       Tx: ACK                  
208.X.X.X.X   2137108116  GW64b0f0cc5  00101/00001  0x100 (g729)     No       Rx: ACK                  
38.X.X.X      1864288129  1183ab5b5a0  00102/00000  0x0 (nothing)    No       Tx: CANCEL (aut          
38.X.X.X      1561900138  53e1d45f2d8  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1810787283  1b1d40c22a2  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
190.X.X.X    7789ASTERISK-15904  638cddef6c5  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL                
38.X.X.X      1480208892  73f5eeab2f9  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1561900136  4da670681cb  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1561900137  0998514a08b  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut          
38.X.X.X      1561900137  2fb3675d0a6  00102/00000  0x0 (nothing)    No  (d)  Tx: CANCEL (aut

By: Tilghman Lesher (tilghman) 2009-12-09 10:27:22.000-0600

I understand, but it's better to leave the dialog open a bit longer, just to ensure that we aren't prematurely closing it than to be too aggressive and close it down too early.  If the ports are getting closed within several minutes, I think we've found a good balance between consumption of resources and being too aggressive.

By: Tilghman Lesher (tilghman) 2009-12-09 14:35:38.000-0600

Following code review, the patch has once again been changed.  Retesting is required to verify that it still solves the problem.

By: Private Name (falves11) 2009-12-09 15:00:22.000-0600

It has been running for while and I have close a 80% of the file handles than before. It is much better.



By: Private Name (falves11) 2009-12-10 02:14:07.000-0600

I think it is perfect. It is 3 AM EST. There are about 50 open calls, 236 SIP Channels, but 873 file handles counted by lsof -i. I have proved it this way. I changed the bindport in sip.conf and reloaded. There were no calls coming in. I waited about 2 mins and all the sip channels disappeared. It means that "sip show channels" showed nothing at all. So Asterisk should be, theoretically, back to zero. But there were still 147 open file handles, but 100% of them were for the database connection. So this patch does work. Please commit it.

By: Digium Subversion (svnbot) 2009-12-10 10:08:23.000-0600

Repository: asterisk
Revision: 234095

U   branches/1.4/channels/chan_sip.c

------------------------------------------------------------------------
r234095 | tilghman | 2009-12-10 10:08:21 -0600 (Thu, 10 Dec 2009) | 9 lines

When we receive no response at all to our INVITE, allow the channel to be destroyed.
(closes issue ASTERISK-14576, closes issue ASTERISK-14653, closes issue ASTERISK-15161, closes issue ASTERISK-14337, issue ASTERISK-14500)
Reported by: falves11
Patches:
      20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
      20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
Tested by: falves11
Review: https://reviewboard.asterisk.org/r/446/

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234095

By: Digium Subversion (svnbot) 2009-12-10 10:22:28.000-0600

Repository: asterisk
Revision: 234095

U   branches/1.4/channels/chan_sip.c

------------------------------------------------------------------------
r234095 | tilghman | 2009-12-10 10:08:20 -0600 (Thu, 10 Dec 2009) | 9 lines

When we receive no response at all to our INVITE, allow the channel to be destroyed.
(closes issue ASTERISK-14576, closes issue ASTERISK-14653, closes issue ASTERISK-15161, closes issue ASTERISK-14337, issue ASTERISK-14500)
Reported by: falves11, corruptor, dant
Patches:
      20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
      20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
Tested by: falves11
Review: https://reviewboard.asterisk.org/r/446/

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234095

By: Digium Subversion (svnbot) 2009-12-10 10:24:30.000-0600

Repository: asterisk
Revision: 234129

_U  trunk/
U   trunk/channels/chan_sip.c

------------------------------------------------------------------------
r234129 | tilghman | 2009-12-10 10:24:28 -0600 (Thu, 10 Dec 2009) | 16 lines

Merged revisions 234095 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
 r234095 | tilghman | 2009-12-10 10:08:20 -0600 (Thu, 10 Dec 2009) | 9 lines
 
 When we receive no response at all to our INVITE, allow the channel to be destroyed.
 (closes issue ASTERISK-14576, closes issue ASTERISK-14653, closes issue ASTERISK-15161, closes issue ASTERISK-14337, issue ASTERISK-14500)
  Reported by: falves11, corruptor, dant
  Patches:
        20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
        20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
  Tested by: falves11
 Review: https://reviewboard.asterisk.org/r/446/
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234129

By: Digium Subversion (svnbot) 2009-12-10 10:30:26.000-0600

Repository: asterisk
Revision: 234131

_U  branches/1.6.0/
U   branches/1.6.0/channels/chan_sip.c

------------------------------------------------------------------------
r234131 | tilghman | 2009-12-10 10:30:25 -0600 (Thu, 10 Dec 2009) | 23 lines

Merged revisions 234129 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r234129 | tilghman | 2009-12-10 10:24:26 -0600 (Thu, 10 Dec 2009) | 16 lines
 
 Merged revisions 234095 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r234095 | tilghman | 2009-12-10 10:08:20 -0600 (Thu, 10 Dec 2009) | 9 lines
   
   When we receive no response at all to our INVITE, allow the channel to be destroyed.
   (closes issue ASTERISK-14576, closes issue ASTERISK-14653, closes issue ASTERISK-15161, closes issue ASTERISK-14337, issue ASTERISK-14500)
    Reported by: falves11, corruptor, dant
    Patches:
          20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
          20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
    Tested by: falves11
   Review: https://reviewboard.asterisk.org/r/446/
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234131

By: Digium Subversion (svnbot) 2009-12-10 10:30:41.000-0600

Repository: asterisk
Revision: 234132

_U  branches/1.6.1/
U   branches/1.6.1/channels/chan_sip.c

------------------------------------------------------------------------
r234132 | tilghman | 2009-12-10 10:30:39 -0600 (Thu, 10 Dec 2009) | 23 lines

Merged revisions 234129 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r234129 | tilghman | 2009-12-10 10:24:26 -0600 (Thu, 10 Dec 2009) | 16 lines
 
 Merged revisions 234095 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r234095 | tilghman | 2009-12-10 10:08:20 -0600 (Thu, 10 Dec 2009) | 9 lines
   
   When we receive no response at all to our INVITE, allow the channel to be destroyed.
   (closes issue ASTERISK-14576, closes issue ASTERISK-14653, closes issue ASTERISK-15161, closes issue ASTERISK-14337, issue ASTERISK-14500)
    Reported by: falves11, corruptor, dant
    Patches:
          20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
          20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
    Tested by: falves11
   Review: https://reviewboard.asterisk.org/r/446/
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234132

By: Digium Subversion (svnbot) 2009-12-10 10:38:46.000-0600

Repository: asterisk
Revision: 234133

_U  branches/1.6.2/
U   branches/1.6.2/channels/chan_sip.c

------------------------------------------------------------------------
r234133 | tilghman | 2009-12-10 10:38:45 -0600 (Thu, 10 Dec 2009) | 23 lines

Merged revisions 234129 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r234129 | tilghman | 2009-12-10 10:24:26 -0600 (Thu, 10 Dec 2009) | 16 lines
 
 Merged revisions 234095 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r234095 | tilghman | 2009-12-10 10:08:20 -0600 (Thu, 10 Dec 2009) | 9 lines
   
   When we receive no response at all to our INVITE, allow the channel to be destroyed.
   (closes issue ASTERISK-14576, closes issue ASTERISK-14653, closes issue ASTERISK-15161, closes issue ASTERISK-14337, issue ASTERISK-14500)
    Reported by: falves11, corruptor, dant
    Patches:
          20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
          20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
    Tested by: falves11
   Review: https://reviewboard.asterisk.org/r/446/
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234133

By: Digium Subversion (svnbot) 2009-12-10 10:53:44.000-0600

Repository: asterisk
Revision: 234095

U   branches/1.4/channels/chan_sip.c

------------------------------------------------------------------------
r234095 | tilghman | 2009-12-10 10:08:20 -0600 (Thu, 10 Dec 2009) | 17 lines

When we receive no response at all to our INVITE, allow the channel to be destroyed.
(closes issue ASTERISK-14576)
Reported by: falves11
Patches:
      20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
      20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
Tested by: falves11
Review: https://reviewboard.asterisk.org/r/446/
(closes issue ASTERISK-14653)
Reported by: dant
(closes issue ASTERISK-15161)
Reported by: corruptor
(closes issue ASTERISK-14337)
Reported by: falves11
(issue ASTERISK-14500)
Reported by: lftsy

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234095

By: Digium Subversion (svnbot) 2009-12-10 10:55:12.000-0600

Repository: asterisk
Revision: 234129

_U  trunk/
U   trunk/channels/chan_sip.c

------------------------------------------------------------------------
r234129 | tilghman | 2009-12-10 10:24:26 -0600 (Thu, 10 Dec 2009) | 24 lines

Merged revisions 234095 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
 r234095 | tilghman | 2009-12-10 10:08:20 -0600 (Thu, 10 Dec 2009) | 9 lines
 
 When we receive no response at all to our INVITE, allow the channel to be destroyed.
 (closes issue ASTERISK-14576)
  Reported by: falves11
  Patches:
        20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
        20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
  Tested by: falves11
 Review: https://reviewboard.asterisk.org/r/446/
 (closes issue ASTERISK-14653)
 Reported by: dant
 (closes issue ASTERISK-15161)
 Reported by: corruptor
 (closes issue ASTERISK-14337)
 Reported by: falves11
 (issue ASTERISK-14500)
 Reported by: lftsy
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234129

By: Digium Subversion (svnbot) 2009-12-10 10:56:17.000-0600

Repository: asterisk
Revision: 234131

_U  branches/1.6.0/
U   branches/1.6.0/channels/chan_sip.c

------------------------------------------------------------------------
r234131 | tilghman | 2009-12-10 10:30:22 -0600 (Thu, 10 Dec 2009) | 29 lines

Merged revisions 234129 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r234129 | tilghman | 2009-12-10 10:24:26 -0600 (Thu, 10 Dec 2009) | 16 lines
 
 Merged revisions 234095 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   When we receive no response at all to our INVITE, allow the channel to be destroyed.
   (closes issue ASTERISK-14576)
    Reported by: falves11
    Patches:
          20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
          20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
    Tested by: falves11
   Review: https://reviewboard.asterisk.org/r/446/
   (closes issue ASTERISK-14653)
   Reported by: dant
   (closes issue ASTERISK-15161)
   Reported by: corruptor
   (closes issue ASTERISK-14337)
   Reported by: falves11
   (issue ASTERISK-14500)
   Reported by: lftsy
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234131

By: Digium Subversion (svnbot) 2009-12-10 10:57:10.000-0600

Repository: asterisk
Revision: 234132

_U  branches/1.6.1/
U   branches/1.6.1/channels/chan_sip.c

------------------------------------------------------------------------
r234132 | tilghman | 2009-12-10 10:30:32 -0600 (Thu, 10 Dec 2009) | 31 lines

Merged revisions 234129 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r234129 | tilghman | 2009-12-10 10:24:26 -0600 (Thu, 10 Dec 2009) | 16 lines
 
 Merged revisions 234095 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r234095 | tilghman | 2009-12-10 10:08:20 -0600 (Thu, 10 Dec 2009) | 9 lines
   
   When we receive no response at all to our INVITE, allow the channel to be destroyed.
   (closes issue ASTERISK-14576)
    Reported by: falves11
    Patches:
          20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
          20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
    Tested by: falves11
   Review: https://reviewboard.asterisk.org/r/446/
   (closes issue ASTERISK-14653)
   Reported by: dant
   (closes issue ASTERISK-15161)
   Reported by: corruptor
   (closes issue ASTERISK-14337)
   Reported by: falves11
   (issue ASTERISK-14500)
   Reported by: lftsy
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234132

By: Digium Subversion (svnbot) 2009-12-10 10:57:39.000-0600

Repository: asterisk
Revision: 234133

_U  branches/1.6.2/
U   branches/1.6.2/channels/chan_sip.c

------------------------------------------------------------------------
r234133 | tilghman | 2009-12-10 10:30:40 -0600 (Thu, 10 Dec 2009) | 31 lines

Merged revisions 234129 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r234129 | tilghman | 2009-12-10 10:24:26 -0600 (Thu, 10 Dec 2009) | 16 lines
 
 Merged revisions 234095 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r234095 | tilghman | 2009-12-10 10:08:20 -0600 (Thu, 10 Dec 2009) | 9 lines
   
   When we receive no response at all to our INVITE, allow the channel to be destroyed.
   (closes issue ASTERISK-14576)
    Reported by: falves11
    Patches:
          20091209__issue15627__1.6.0.diff.txt uploaded by tilghman (license 14)
          20091209__issue15627__1.4.diff.txt uploaded by tilghman (license 14)
    Tested by: falves11
   Review: https://reviewboard.asterisk.org/r/446/
   (closes issue ASTERISK-14653)
   Reported by: dant
   (closes issue ASTERISK-15161)
   Reported by: corruptor
   (closes issue ASTERISK-14337)
   Reported by: falves11
   (issue ASTERISK-14500)
   Reported by: lftsy
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=234133

By: Leif Madsen (lmadsen) 2009-12-10 15:14:27.000-0600

falves11: Thanks for your input, but we do have a process that changes go through. Issues and changes to resolve those issues will be handled appropriately, and will be resolved as quickly as possible, and as appropriate.