Summary: ASTERISK-10770: Blowup after one-two hours with Trunk
Reporter: Private Name (falves11)
Labels:
Date Opened: 2007-11-14 18:52:45.000-0600
Date Closed: 2008-02-24 06:17:20.000-0600
Priority: Minor
Regression?: No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) valgrind.txt
( 1) valgrind1.txt
( 2) valgrind2.txt
( 3) valgrind3.txt
( 4) valgrind4.txt
( 5) valgrind5.txt
( 6) valgrind6.txt
( 7) valgrind7.txt
Description: I am running 300-400 calls with signaling only, and after one to two hours it blows up. This is the error.

****** ADDITIONAL INFORMATION ******

#0  0x08085c93 in ast_waitfor_nandfds_complex (c=0xb65ddf60, n=3, ms=0xb65ddf50) at channel.c:1921
1921                            ast_clear_flag(winner, AST_FLAG_EXCEPTION);
(gdb) bt full
#0  0x08085c93 in ast_waitfor_nandfds_complex (c=0xb65ddf60, n=3, ms=0xb65ddf50) at channel.c:1921
       __p = 0
       __x = 0
       aed = (struct ast_epoll_data *) 0x87ceec0
       start = {tv_sec = 1195082124, tv_usec = 188659}
       res = 1
       i = 0
       ev = {{events = 1, data = {ptr = 0x87ceec0, fd = 142405312, u32 = 142405312, u64 = 142405312}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0,
     u64 = 0}} <repeats 24 times>}
       whentohangup = 3589
       diff = 3589
       rms = 500
       now = 1195082124
       winner = (struct ast_channel *) 0x0
       __PRETTY_FUNCTION__ = "ast_waitfor_nandfds_complex"
#1  0x08085e12 in ast_waitfor_nandfds (c=0xb65ddf60, n=3, fds=0x0, nfds=0, exception=0x0, outfd=0x0, ms=0xb65ddf50) at channel.c:1949
No locals.
#2  0x08085e59 in ast_waitfor_n (c=0xb65ddf60, n=3, ms=0xb65ddf50) at channel.c:1955
No locals.
#3  0x0807793c in autoservice_run (ign=0x0) at autoservice.c:83
       chan = (struct ast_channel *) 0x0
       as = (struct asent *) 0x0
       ms = 500
       mons = {0xb5a417b0, 0xb456b4e0, 0xb494a528, 0x86d1cc8, 0xb715d898, 0x400000, 0x0 <repeats 32 times>, 0x806f3e1, 0x14000000, 0xb715d898, 0x400000,
 0x0 <repeats 32 times>, 0xb725dff4, 0xb65de138, 0x17, 0xb65de1d4, 0xb715d72f, 0x17, 0xb65de138, 0xb65de0a8, 0x806f3e1, 0x400000, 0x0 <repeats 31 times>,
 0x14000000, 0xb715d898, 0x0, 0x806f3e1, 0x400000, 0x0 <repeats 31 times>, 0x10000000, 0x0, 0x0, 0x7d0f00, 0xb660f5b0, 0xb65debe8, 0xb65de1e4, 0x806f3fa,
 0x17, 0x806f3e1, 0xb660f5d8, 0xb715d898, 0x17, 0x33, 0x0, 0x7b, 0xfffe007b, 0xb65debe8, 0xb660f5b0, 0xb660f5d8, 0xb65de4c4, 0x7d0f00, 0xb65debe8,
 0xb65de4c4, 0xb71986a4, 0x0, 0x0, 0xb71fec28, 0x73, 0x292, 0xb65de4c4, 0x7b, 0xb65de248, 0x8005003, 0x0, 0xffff037f, 0xffff0020, 0xffffffff, 0x81060f2,
 0x35d0073, 0xb6610a98, 0x7b, 0x0 <repeats 16 times>, 0x80000000, 0x3ffe, 0xc0400000, 0x4017d0f5, 0xb725f848, 0xb7197d44, 0x10, 0x81060f2, 0xb71973b0,
 0x610a98, 0xc, 0xb725dff4, 0xc, 0xb725f840, 0xb65de308, 0xb719a456, 0xb725f840, 0xc, 0x0, 0x12b60, 0xc, 0x82204a0, 0xb7eb2ff4, 0x0, 0x0, 0xb65de338,
 0x810ff45, 0x1, 0xc, 0x0, 0x0, 0x0, 0x0, 0x8207aa8, 0x806c719, 0x816fd28, 0x8207aa8, 0xb65de368, 0x806c8bb, 0x816fd28, 0xc, 0x81453f3, 0x136, 0x814541a,
 0xb725f840, 0x8207aa8, 0xb7eb2ff4}
       x = 3
       __PRETTY_FUNCTION__ = "autoservice_run"
#4  0x08110fbb in dummy_start (data=0x8207aa8) at utils.c:858
       _buffer = {__routine = 0x806c8c1 <ast_unregister_thread>, __arg = 0xb65deba0, __canceltype = 0, __prev = 0x0}
       ret = (void *) 0x0
       a = {start_routine = 0x807784c <autoservice_run>, data = 0x0,
 name = 0x8207ad0 "autoservice_run      started at [  115] autoservice.c ast_autoservice_start()"}
#5  0xb7ea93cc in start_thread () from /lib/tls/libpthread.so.0
No symbol table info available.
#6  0xb71fec3e in clone () from /lib/tls/libc.so.6
No symbol table info available.
(gdb) print winner
$1 = (struct ast_channel *) 0x0
(gdb) print *winner
Cannot access memory at address 0x0
(gdb) quit
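
The backtrace above shows `winner` equal to 0x0 at channel.c:1921, where `ast_clear_flag(winner, AST_FLAG_EXCEPTION)` dereferences it. As an illustration only, the C sketch below shows the general defensive pattern of checking the pointer carried by a ready event before touching it. Every name in it (`demo_channel`, `demo_epoll_data`, `demo_pick_winner`, `DEMO_FLAG_EXCEPTION`) is hypothetical and not part of Asterisk, and it is not the change that was committed for this issue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT Asterisk's types. */
#define DEMO_FLAG_EXCEPTION (1u << 0)

struct demo_channel {
    unsigned int flags;
};

struct demo_epoll_data {
    struct demo_channel *chan; /* NULL when the event no longer maps to a channel */
};

/*
 * Return the channel attached to a ready event with its exception flag
 * cleared, or NULL when there is nothing safe to dereference.
 */
struct demo_channel *demo_pick_winner(const struct demo_epoll_data *aed)
{
    if (aed == NULL || aed->chan == NULL) {
        return NULL; /* skip the event instead of crashing */
    }
    aed->chan->flags &= ~DEMO_FLAG_EXCEPTION;
    return aed->chan;
}

/* Small usage example; call it from main() to exercise both paths. */
void demo_pick_winner_example(void)
{
    struct demo_channel chan = { DEMO_FLAG_EXCEPTION };
    struct demo_epoll_data live = { &chan };
    struct demo_epoll_data stale = { NULL };

    assert(demo_pick_winner(&live) == &chan);
    assert((chan.flags & DEMO_FLAG_EXCEPTION) == 0);
    assert(demo_pick_winner(&stale) == NULL);
}
```

A guard like this only hides the symptom; it does not explain why an event would arrive without a valid channel in the first place.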
Comments:

By: Private Name (falves11) 2007-11-14 18:53:07.000-0600

[Nov 14 19:41:55] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:55] ERROR[11672]: chan_sip.c:16966 sip_request_call: Unable to build sip pvt data for '14405992711@66.28.147.100' (Out of memory or socket error)
[Nov 14 19:41:55] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:56] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:56] ERROR[11672]: chan_sip.c:16966 sip_request_call: Unable to build sip pvt data for '14405992711@38.102.64.25' (Out of memory or socket error)
[Nov 14 19:41:56] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:56] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:56] ERROR[11672]: chan_sip.c:16966 sip_request_call: Unable to build sip pvt data for '13099375954@66.28.147.100' (Out of memory or socket error)
[Nov 14 19:41:56] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:56] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:56] ERROR[11672]: chan_sip.c:16966 sip_request_call: Unable to buil

By: Tilghman Lesher (tilghman) 2007-11-14 19:18:55.000-0600

Please read doc/valgrind.txt and follow the instructions therein.

By: Private Name (falves11) 2007-11-14 22:34:19.000-0600

The latest valgrind output was obtained with no epoll support. I ran make and make install after commenting out the have_epoll line in autoconfig.

By: Private Name (falves11) 2007-11-15 00:00:03.000-0600

I think I found the issue: Set(TIMEOUT(absolute)=3600) is flawed. I have an extension exten => T,1,Hangup, but as soon as the timeout is reached, the Dial command gets repeated over and over until the file handles are exhausted. That is why the error "No RTP ports remaining" pops up. Either this is a bug, or I don't have a clue how to use Set(TIMEOUT(absolute)=xxx).

By: Tilghman Lesher (tilghman) 2007-11-15 01:21:03.000-0600

There is absolutely no reason to have a "T,1,Hangup".  The purpose of the existence of that extension is if you want to do something OTHER than Hangup.

By: Private Name (falves11) 2007-11-15 08:26:46.000-0600

I get the same redial-forever effect even without a T extension, so I guess this is a bug. I can reproduce it; if somebody wants to contact me, he or she can log into my development box and I will show them.

By: Jason Parker (jparker) 2008-01-15 18:56:26.000-0600

What does the rest of your dialplan look like?  It almost sounds like some sort of dial/hangup loop is happening...

By: Private Name (falves11) 2008-01-15 19:07:41.000-0600

Please look at bug 11746. I have submitted a lot of traces but, so far, have received no help.

By: Tilghman Lesher (tilghman) 2008-01-15 20:30:21.000-0600

We have requested valgrind output from you on that bug, but so far, you have not seen fit to provide that information.  We need that information to debug ASTERISK-11215 and cannot proceed without it.

By: Digium Subversion (svnbot) 2008-02-21 10:41:33.000-0600

Repository: asterisk
Revision: 104019

U   trunk/configure
U   trunk/configure.ac
U   trunk/include/asterisk/autoconfig.h.in

------------------------------------------------------------------------
r104019 | file | 2008-02-21 10:41:31 -0600 (Thu, 21 Feb 2008) | 8 lines

Disable epoll as it has caused more obscure issues than any of my previous code. I will continue to work on it in a separate branch to make it stable for a release and test it against the following issues.
(closes issue ASTERISK-10770)
Reported by: falves11
(closes issue ASTERISK-11131)
Reported by: davevg
(closes issue ASTERISK-10573)
Reported by: falves11

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=104019

By: Private Name (falves11) 2008-02-21 17:09:08.000-0600

In my opinion, epoll is not the issue. I have been running version SVN-trunk-r103908M for over 24 hours with good traffic. This is what I did: raised MAX_LOCKS from 64 to 256, raised MAX_AUTOMONS from 1500 to 25000, compiled with optimizations, and removed chan_h323. Maybe when I compiled with "don't optimize" I hit some processing-speed threshold that made it crash often. In any case, if you can merge those two changes, I think this is close to a perfect SIP server. chan_h323 may also be interfering with the whole thing. Some resources should be devoted to chan_h323, since it is a very important piece of everybody's business; please see issue 11940. This issue, by the way, was caused by a misconfiguration of the rtp and t38 files. For large operations, those files must set aside the range from 5000 to 64000 for RTP and from 1024 to 4999 for T.38. Once that is done, it works.
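
To make the port-range point concrete, here is a rough C sketch of how many RTP sessions a given UDP port range can hold. It assumes each session takes an even port for RTP plus the following odd port for RTCP; that pairing, and the helper name `demo_rtp_session_capacity`, are assumptions for illustration rather than something taken from this thread or from the Asterisk source.

```c
#include <assert.h>

/*
 * Estimate how many RTP sessions fit in [first_port, last_port], assuming
 * each session uses an even RTP port and the next odd port for RTCP.
 * Hypothetical helper for illustration only.
 */
int demo_rtp_session_capacity(int first_port, int last_port)
{
    int first_even;

    if (first_port > last_port) {
        return 0;
    }
    /* Sessions start on an even port, so round the start up if needed. */
    first_even = (first_port % 2 == 0) ? first_port : first_port + 1;
    if (first_even + 1 > last_port) {
        return 0; /* no room for a full RTP/RTCP pair */
    }
    return (last_port - first_even + 1) / 2;
}

/* Small usage example; call it from main() to check the figures below. */
void demo_rtp_capacity_example(void)
{
    /* The range suggested in the comment above: 5000 to 64000. */
    assert(demo_rtp_session_capacity(5000, 64000) == 29500);
    /* A narrower range, chosen here only for comparison. */
    assert(demo_rtp_session_capacity(10000, 20000) == 5000);
}
```

Under those assumptions, the 5000 to 64000 range allows about 29500 simultaneous sessions, while a 10000 to 20000 range allows about 5000, so a dial loop that keeps allocating media streams would run into "No RTP ports remaining" much sooner with the narrower range.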



By: Digium Subversion (svnbot) 2008-02-23 21:17:36.000-0600

Repository: asterisk
Revision: 104076

_U  team/seanbright/NoLossCDR-Redux/
U   team/seanbright/NoLossCDR-Redux/CHANGES
U   team/seanbright/NoLossCDR-Redux/UPGRADE.txt
U   team/seanbright/NoLossCDR-Redux/channels/chan_sip.c
U   team/seanbright/NoLossCDR-Redux/channels/chan_zap.c
U   team/seanbright/NoLossCDR-Redux/configure
U   team/seanbright/NoLossCDR-Redux/configure.ac
U   team/seanbright/NoLossCDR-Redux/doc/manager_1_1.txt
U   team/seanbright/NoLossCDR-Redux/include/asterisk/autoconfig.h.in
U   team/seanbright/NoLossCDR-Redux/include/asterisk/manager.h
U   team/seanbright/NoLossCDR-Redux/main/manager.c
U   team/seanbright/NoLossCDR-Redux/res/res_agi.c
U   team/seanbright/NoLossCDR-Redux/res/res_config_pgsql.c
U   team/seanbright/NoLossCDR-Redux/utils/astman.c

------------------------------------------------------------------------
r104076 | seanbright | 2008-02-23 21:17:26 -0600 (Sat, 23 Feb 2008) | 142 lines

Merged revisions 104014,104016,104019-104020,104024-104025,104028-104029,104031,104036,104038-104039,104045,104073-104074 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
r104014 | tilghman | 2008-02-21 00:21:39 -0500 (Thu, 21 Feb 2008) | 6 lines

Ignore some more unused generated events.
(closes issue ASTERISK-11487)
Reported by: junky
Patches:
      astman_events.diff uploaded by junky (license 177)

................
r104016 | kpfleming | 2008-02-21 09:44:04 -0500 (Thu, 21 Feb 2008) | 10 lines

Merged revisions 104015 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r104015 | kpfleming | 2008-02-21 08:33:51 -0600 (Thu, 21 Feb 2008) | 2 lines

reduce the likelihood that HTTP Manager session ids will consist of primarily '1' bits

........

................
r104019 | file | 2008-02-21 11:44:57 -0500 (Thu, 21 Feb 2008) | 8 lines

Disable epoll as it has caused more obscure issues than any of my previous code. I will continue to work on it in a separate branch to make it stable for a release and test it against the following issues.
(closes issue ASTERISK-10770)
Reported by: falves11
(closes issue ASTERISK-11131)
Reported by: davevg
(closes issue ASTERISK-10573)
Reported by: falves11

................
r104020 | mmichelson | 2008-02-21 11:46:37 -0500 (Thu, 21 Feb 2008) | 7 lines

Don't print the fact that we are using dead mode in AGI if called from the
'h' extension since it is well-known that it will be running in dead mode.

(closes issue ASTERISK-11491)
Reported by: explidous


................
r104024 | dbailey | 2008-02-21 12:38:40 -0500 (Thu, 21 Feb 2008) | 4 lines

Added configuration distinction between neon and fsk mwi detection
Add the detection for neon MWI events
got rid of extraneous handle_init_event call in monitor loop

................
r104025 | mmichelson | 2008-02-21 12:44:34 -0500 (Thu, 21 Feb 2008) | 4 lines

Instead of a notice, make the message about a hung-up channel a debug message, and revert the original
logic on the if statement. Thanks to Juggie for bringing this to my attention.


................
r104028 | mmichelson | 2008-02-21 16:09:11 -0500 (Thu, 21 Feb 2008) | 14 lines

Blocked revisions 104026 via svnmerge

........
r104026 | mmichelson | 2008-02-21 14:12:38 -0600 (Thu, 21 Feb 2008) | 7 lines

Remove an incorrect debug message. It reported that it had received a specific event and tried to report
which event was received. What actually was happening was that it was reporting the number of bytes returned
from a call to read().

Thanks to Jared Smith for bringing the issue up on IRC


........

................
r104029 | mmichelson | 2008-02-21 16:09:54 -0500 (Thu, 21 Feb 2008) | 11 lines

Blocked revisions 104027 via svnmerge

........
r104027 | mmichelson | 2008-02-21 15:05:42 -0600 (Thu, 21 Feb 2008) | 4 lines

And as a followup to revision 104026, completely remove event-related
calls from a section of code where we know there was no event to handle or get.


........

................
r104031 | russell | 2008-02-21 16:27:24 -0500 (Thu, 21 Feb 2008) | 1 line

fix a typo
................
r104036 | tilghman | 2008-02-22 17:39:21 -0500 (Fri, 22 Feb 2008) | 7 lines

Allow database password to be NULL and several other cleanups.
(closes issue ASTERISK-11493)
Reported by: bukaj
Patches:
      20080222__bug12048.diff.txt uploaded by Corydon76 (license 14)
Tested by: bukaj

................
r104038 | tilghman | 2008-02-22 17:48:18 -0500 (Fri, 22 Feb 2008) | 14 lines

Merged revisions 104037 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r104037 | tilghman | 2008-02-22 16:45:14 -0600 (Fri, 22 Feb 2008) | 6 lines

Backwards debug message.
(closes issue ASTERISK-11496)
Reported by: flefoll
Patches:
      chan_sip.c.br14.patch_found-notfound uploaded by flefoll (license 244)

........

................
r104039 | tilghman | 2008-02-22 17:55:35 -0500 (Fri, 22 Feb 2008) | 2 lines

Move Originate to a separate privilege and require the additional System privilege to call out to a subshell.

................
r104045 | dbailey | 2008-02-22 18:56:55 -0500 (Fri, 22 Feb 2008) | 2 lines

Add protection to chan_zap build when NEONMWI events are not defined

................
r104073 | murf | 2008-02-23 19:44:14 -0500 (Sat, 23 Feb 2008) | 1 line

On a 64-bit machine, with dev-mode turned on, and pgsql installed, I get warnings that stop the compile. They are fixed now.
................
r104074 | murf | 2008-02-23 21:37:08 -0500 (Sat, 23 Feb 2008) | 1 line

Enforce a space between function args as per code review.
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=104076

By: Digium Subversion (svnbot) 2008-02-24 06:17:20.000-0600

Repository: asterisk
Revision: 104077

_U  team/group/multiparking/
U   team/group/multiparking/CHANGES
U   team/group/multiparking/UPGRADE.txt
U   team/group/multiparking/channels/chan_sip.c
U   team/group/multiparking/channels/chan_zap.c
U   team/group/multiparking/configure
U   team/group/multiparking/configure.ac
U   team/group/multiparking/doc/manager_1_1.txt
U   team/group/multiparking/include/asterisk/autoconfig.h.in
U   team/group/multiparking/include/asterisk/manager.h
U   team/group/multiparking/main/manager.c
U   team/group/multiparking/res/res_agi.c
U   team/group/multiparking/res/res_config_pgsql.c
U   team/group/multiparking/utils/astman.c

------------------------------------------------------------------------
r104077 | mvanbaak | 2008-02-24 06:17:17 -0600 (Sun, 24 Feb 2008) | 142 lines

Merged revisions 104014,104016,104019-104020,104024-104025,104028-104029,104031,104036,104038-104039,104045,104073-104074 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
r104014 | tilghman | 2008-02-21 06:21:39 +0100 (Thu, 21 Feb 2008) | 6 lines

Ignore some more unused generated events.
(closes issue ASTERISK-11487)
Reported by: junky
Patches:
      astman_events.diff uploaded by junky (license 177)

................
r104016 | kpfleming | 2008-02-21 15:44:04 +0100 (Thu, 21 Feb 2008) | 10 lines

Merged revisions 104015 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r104015 | kpfleming | 2008-02-21 08:33:51 -0600 (Thu, 21 Feb 2008) | 2 lines

reduce the likelihood that HTTP Manager session ids will consist of primarily '1' bits

........

................
r104019 | file | 2008-02-21 17:44:57 +0100 (Thu, 21 Feb 2008) | 8 lines

Disable epoll as it has caused more obscure issues than any of my previous code. I will continue to work on it in a separate branch to make it stable for a release and test it against the following issues.
(closes issue ASTERISK-10770)
Reported by: falves11
(closes issue ASTERISK-11131)
Reported by: davevg
(closes issue ASTERISK-10573)
Reported by: falves11

................
r104020 | mmichelson | 2008-02-21 17:46:37 +0100 (Thu, 21 Feb 2008) | 7 lines

Don't print the fact that we are using dead mode in AGI if called from the
'h' extension since it is well-known that it will be running in dead mode.

(closes issue ASTERISK-11491)
Reported by: explidous


................
r104024 | dbailey | 2008-02-21 18:38:40 +0100 (Thu, 21 Feb 2008) | 4 lines

Added configuration distinction between neon and fsk mwi detection
Add the detection for neon MWI events
got rid of extraneous handle_init_event call in monitor loop

................
r104025 | mmichelson | 2008-02-21 18:44:34 +0100 (Thu, 21 Feb 2008) | 4 lines

Instead of a notice, make the message about a hung-up channel a debug message, and revert the original
logic on the if statement. Thanks to Juggie for bringing this to my attention.


................
r104028 | mmichelson | 2008-02-21 22:09:11 +0100 (Thu, 21 Feb 2008) | 14 lines

Blocked revisions 104026 via svnmerge

........
r104026 | mmichelson | 2008-02-21 14:12:38 -0600 (Thu, 21 Feb 2008) | 7 lines

Remove an incorrect debug message. It reported that it had received a specific event and tried to report
which event was received. What actually was happening was that it was reporting the number of bytes returned
from a call to read().

Thanks to Jared Smith for bringing the issue up on IRC


........

................
r104029 | mmichelson | 2008-02-21 22:09:54 +0100 (Thu, 21 Feb 2008) | 11 lines

Blocked revisions 104027 via svnmerge

........
r104027 | mmichelson | 2008-02-21 15:05:42 -0600 (Thu, 21 Feb 2008) | 4 lines

And as a followup to revision 104026, completely remove event-related
calls from a section of code where we know there was no event to handle or get.


........

................
r104031 | russell | 2008-02-21 22:27:24 +0100 (Thu, 21 Feb 2008) | 1 line

fix a typo
................
r104036 | tilghman | 2008-02-22 23:39:21 +0100 (Fri, 22 Feb 2008) | 7 lines

Allow database password to be NULL and several other cleanups.
(closes issue ASTERISK-11493)
Reported by: bukaj
Patches:
      20080222__bug12048.diff.txt uploaded by Corydon76 (license 14)
Tested by: bukaj

................
r104038 | tilghman | 2008-02-22 23:48:18 +0100 (Fri, 22 Feb 2008) | 14 lines

Merged revisions 104037 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r104037 | tilghman | 2008-02-22 16:45:14 -0600 (Fri, 22 Feb 2008) | 6 lines

Backwards debug message.
(closes issue ASTERISK-11496)
Reported by: flefoll
Patches:
      chan_sip.c.br14.patch_found-notfound uploaded by flefoll (license 244)

........

................
r104039 | tilghman | 2008-02-22 23:55:35 +0100 (Fri, 22 Feb 2008) | 2 lines

Move Originate to a separate privilege and require the additional System privilege to call out to a subshell.

................
r104045 | dbailey | 2008-02-23 00:56:55 +0100 (Sat, 23 Feb 2008) | 2 lines

Add protection to chan_zap build when NEONMWI events are not defined

................
r104073 | murf | 2008-02-24 01:44:14 +0100 (Sun, 24 Feb 2008) | 1 line

On a 64-bit machine, with dev-mode turned on, and pgsql installed, I get warnings that stop the compile. They are fixed now.
................
r104074 | murf | 2008-02-24 03:37:08 +0100 (Sun, 24 Feb 2008) | 1 line

Enforce a space between function args as per code review.
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=104077