Summary: | ASTERISK-10770: Blowup after one-two hours with Trunk | ||
Reporter: | Private Name (falves11) | Labels: | |
Date Opened: | 2007-11-14 18:52:45.000-0600 | Date Closed: | 2008-02-24 06:17:20.000-0600 |
Priority: | Minor | Regression? | No |
Status: | Closed/Complete | Components: | Core/General |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) valgrind.txt ( 1) valgrind1.txt ( 2) valgrind2.txt ( 3) valgrind3.txt ( 4) valgrind4.txt ( 5) valgrind5.txt ( 6) valgrind6.txt ( 7) valgrind7.txt | |
Description: | I am running 300-400 calls only with signaling, and after one to two hours it blows up. This is the error.
****** ADDITIONAL INFORMATION ******
#0 0x08085c93 in ast_waitfor_nandfds_complex (c=0xb65ddf60, n=3, ms=0xb65ddf50) at channel.c:1921
1921 ast_clear_flag(winner, AST_FLAG_EXCEPTION);
(gdb) bt full
#0 0x08085c93 in ast_waitfor_nandfds_complex (c=0xb65ddf60, n=3, ms=0xb65ddf50) at channel.c:1921
__p = 0
__x = 0
aed = (struct ast_epoll_data *) 0x87ceec0
start = {tv_sec = 1195082124, tv_usec = 188659}
res = 1
i = 0
ev = {{events = 1, data = {ptr = 0x87ceec0, fd = 142405312, u32 = 142405312, u64 = 142405312}}, {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}} <repeats 24 times>}
whentohangup = 3589
diff = 3589
rms = 500
now = 1195082124
winner = (struct ast_channel *) 0x0
__PRETTY_FUNCTION__ = "ast_waitfor_nandfds_complex"
#1 0x08085e12 in ast_waitfor_nandfds (c=0xb65ddf60, n=3, fds=0x0, nfds=0, exception=0x0, outfd=0x0, ms=0xb65ddf50) at channel.c:1949
No locals.
#2 0x08085e59 in ast_waitfor_n (c=0xb65ddf60, n=3, ms=0xb65ddf50) at channel.c:1955
No locals.
#3 0x0807793c in autoservice_run (ign=0x0) at autoservice.c:83
chan = (struct ast_channel *) 0x0
as = (struct asent *) 0x0
ms = 500
mons = {0xb5a417b0, 0xb456b4e0, 0xb494a528, 0x86d1cc8, 0xb715d898, 0x400000, 0x0 <repeats 32 times>, 0x806f3e1, 0x14000000, 0xb715d898, 0x400000, 0x0 <repeats 32 times>, 0xb725dff4, 0xb65de138, 0x17, 0xb65de1d4, 0xb715d72f, 0x17, 0xb65de138, 0xb65de0a8, 0x806f3e1, 0x400000, 0x0 <repeats 31 times>, 0x14000000, 0xb715d898, 0x0, 0x806f3e1, 0x400000, 0x0 <repeats 31 times>, 0x10000000, 0x0, 0x0, 0x7d0f00, 0xb660f5b0, 0xb65debe8, 0xb65de1e4, 0x806f3fa, 0x17, 0x806f3e1, 0xb660f5d8, 0xb715d898, 0x17, 0x33, 0x0, 0x7b, 0xfffe007b, 0xb65debe8, 0xb660f5b0, 0xb660f5d8, 0xb65de4c4, 0x7d0f00, 0xb65debe8, 0xb65de4c4, 0xb71986a4, 0x0, 0x0, 0xb71fec28, 0x73, 0x292, 0xb65de4c4, 0x7b, 0xb65de248, 0x8005003, 0x0, 0xffff037f, 0xffff0020, 0xffffffff, 0x81060f2, 0x35d0073, 0xb6610a98, 0x7b, 0x0 <repeats 16 times>, 0x80000000, 0x3ffe, 0xc0400000, 0x4017d0f5, 0xb725f848, 0xb7197d44, 0x10, 0x81060f2, 0xb71973b0, 0x610a98, 0xc, 0xb725dff4, 0xc, 0xb725f840, 0xb65de308, 0xb719a456, 0xb725f840, 0xc, 0x0, 0x12b60, 0xc, 0x82204a0, 0xb7eb2ff4, 0x0, 0x0, 0xb65de338, 0x810ff45, 0x1, 0xc, 0x0, 0x0, 0x0, 0x0, 0x8207aa8, 0x806c719, 0x816fd28, 0x8207aa8, 0xb65de368, 0x806c8bb, 0x816fd28, 0xc, 0x81453f3, 0x136, 0x814541a, 0xb725f840, 0x8207aa8, 0xb7eb2ff4}
x = 3
__PRETTY_FUNCTION__ = "autoservice_run"
#4 0x08110fbb in dummy_start (data=0x8207aa8) at utils.c:858
_buffer = {__routine = 0x806c8c1 <ast_unregister_thread>, __arg = 0xb65deba0, __canceltype = 0, __prev = 0x0}
ret = (void *) 0x0
a = {start_routine = 0x807784c <autoservice_run>, data = 0x0, name = 0x8207ad0 "autoservice_run started at [ 115] autoservice.c ast_autoservice_start()"}
#5 0xb7ea93cc in start_thread () from /lib/tls/libpthread.so.0
No symbol table info available.
#6 0xb71fec3e in clone () from /lib/tls/libc.so.6
No symbol table info available.
(gdb) print winner
$1 = (struct ast_channel *) 0x0
(gdb) print *winner
Cannot access memory at address 0x0
(gdb) quit | ||
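In the backtrace above, `aed` is non-null while `winner` is NULL, so the `ast_clear_flag(winner, ...)` call at channel.c:1921 dereferences a null pointer. As an illustration only, the following Python sketch models that situation: an event record that no longer has an owning channel is skipped instead of being used. This is not Asterisk code and not the change that was eventually committed (r104019 disabled epoll instead); the names `EventData`, `Channel`, `wait_for_winner`, and `demo` are hypothetical.

```python
import selectors
import socket


class EventData:
    """Hypothetical stand-in for the per-fd record (aed in the backtrace)."""

    def __init__(self, chan=None):
        self.chan = chan  # back-pointer to the owning channel, may be None


class Channel:
    """Hypothetical channel carrying a single exception flag."""

    def __init__(self, name):
        self.name = name
        self.exception = True


def wait_for_winner(sel, timeout=0.5):
    """Return the first ready channel, skipping events that have no owner."""
    for key, _events in sel.select(timeout):
        winner = key.data.chan
        if winner is None:
            # Stale event: the record outlived its channel. Skip it rather
            # than touching attributes on a missing object.
            continue
        winner.exception = False
        return winner
    return None


def demo():
    """Register one orphaned record and one live channel, then wait."""
    sel = selectors.DefaultSelector()
    a_rd, a_wr = socket.socketpair()
    b_rd, b_wr = socket.socketpair()
    try:
        sel.register(a_rd, selectors.EVENT_READ, EventData(chan=None))
        sel.register(b_rd, selectors.EVENT_READ, EventData(Channel("SIP/b")))
        a_wr.send(b"x")  # makes the orphaned record ready
        b_wr.send(b"x")  # makes the live channel ready
        return wait_for_winner(sel)
    finally:
        sel.close()
        for s in (a_rd, a_wr, b_rd, b_wr):
            s.close()
```

Calling `demo()` returns the live channel whichever order the selector reports the two events in, because the orphaned record is ignored.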
Comments: | By: Private Name (falves11) 2007-11-14 18:53:07.000-0600
[Nov 14 19:41:55] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:55] ERROR[11672]: chan_sip.c:16966 sip_request_call: Unable to build sip pvt data for '14405992711@66.28.147.100' (Out of memory or socket error)
[Nov 14 19:41:55] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:56] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:56] ERROR[11672]: chan_sip.c:16966 sip_request_call: Unable to build sip pvt data for '14405992711@38.102.64.25' (Out of memory or socket error)
[Nov 14 19:41:56] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:56] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:56] ERROR[11672]: chan_sip.c:16966 sip_request_call: Unable to build sip pvt data for '13099375954@66.28.147.100' (Out of memory or socket error)
[Nov 14 19:41:56] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:56] ERROR[11672]: rtp.c:2262 ast_rtp_new_with_bindaddr: No RTP ports remaining. Can't setup media stream for this call.
[Nov 14 19:41:56] ERROR[11672]: chan_sip.c:16966 sip_request_call: Unable to buil

By: Tilghman Lesher (tilghman) 2007-11-14 19:18:55.000-0600
Please read doc/valgrind.txt and follow the instructions therein.

By: Private Name (falves11) 2007-11-14 22:34:19.000-0600
The latest valgrind was obtained with no epoll support. I did a make and make install after commenting out the line have_epoll in autoconfig

By: Private Name (falves11) 2007-11-15 00:00:03.000-0600
I think that I found the issue: Set(TIMEOUT(absolute)=3600) is flawed. I have an extension exten => T,1,Hangup, but as soon as that happens, the Dial command gets repeated over and over, until the file handles get exhausted. That's why the error "No RTP ports remaining" pops up. Either this is a bug, or I don't have a clue about how to use Set(TIMEOUT(absolute)=xxx)

By: Tilghman Lesher (tilghman) 2007-11-15 01:21:03.000-0600
There is absolutely no reason to have a "T,1,Hangup". The purpose of the existence of that extension is if you want to do something OTHER than Hangup.

By: Private Name (falves11) 2007-11-15 08:26:46.000-0600
I get the same redial-forever effect even without the existence of a T extension. I guess this is a bug. I can duplicate it; if somebody wants to contact me, he or she can log into my development box and I will show them.

By: Jason Parker (jparker) 2008-01-15 18:56:26.000-0600
What does the rest of your dialplan look like? It almost sounds like some sort of dial/hangup loop is happening...

By: Private Name (falves11) 2008-01-15 19:07:41.000-0600
Please look at the bug 11746. I have submitted a lot of traces, but so far, no help.

By: Tilghman Lesher (tilghman) 2008-01-15 20:30:21.000-0600
We have requested valgrind output from you on that bug, but so far, you have not seen fit to provide that information. We need that information to debug ASTERISK-11215 and cannot proceed without it.

By: Digium Subversion (svnbot) 2008-02-21 10:41:33.000-0600
Repository: asterisk
Revision: 104019
U trunk/configure
U trunk/configure.ac
U trunk/include/asterisk/autoconfig.h.in
------------------------------------------------------------------------
r104019 | file | 2008-02-21 10:41:31 -0600 (Thu, 21 Feb 2008) | 8 lines
Disable epoll as it has caused more obscure issues then any of my previous code.
I will continue to work on it in a separate branch to make it stable for a release and test it against the following issues.
(closes issue ASTERISK-10770) Reported by: falves11
(closes issue ASTERISK-11131) Reported by: davevg
(closes issue ASTERISK-10573) Reported by: falves11
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=104019

By: Private Name (falves11) 2008-02-21 17:09:08.000-0600
In my opinion, epoll is not the issue. I have been running version SVN-trunk-r103908M for over 24 hours, with good traffic. This is what I did: raised MAX_LOCKS from 64 to 256 and MAX_AUTOMONS from 1500 to 25000, compiled with optimizations, and removed chan_h323. Maybe when I compiled with "don't optimize" I reached some processing-speed threshold that made it crash often. In any case, if you can merge those two changes, I think this is near a perfect SIP server. Maybe chan_h323 is also messing up the whole thing. Some resources should be applied to chan_h323, since it is a very important piece of everybody's business. Please see issue 11940. This issue, by the way, was caused by a misconfiguration of the rtp and t38 files. For large operations, the files must set apart a range from 5000 to 64000 for RTP and from 1024 to 4999 for T.38. Once that is done, it works.

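The RTP port range mentioned in the comment above can be sanity-checked with simple arithmetic. The following is a minimal sketch, assuming the RFC 3550 convention of an even RTP port paired with the next odd port for RTCP, and assuming (hypothetically) two RTP streams per bridged call. The function name `rtp_stream_capacity` is illustrative and is not an Asterisk API.

```python
def rtp_stream_capacity(start, end):
    """Count RTP/RTCP port pairs available in the inclusive range [start, end].

    Assumption: RTP uses an even port and RTCP uses the next odd port,
    so each stream consumes two consecutive ports.
    """
    first_even = start if start % 2 == 0 else start + 1
    if first_even + 1 > end:
        return 0  # no room for even a single even/odd pair
    return (end - first_even + 1) // 2


# Range suggested in the comment for RTP on large installations.
pairs = rtp_stream_capacity(5000, 64000)

# Hypothetical load: 400 calls, two RTP streams (legs) per call.
needed = 400 * 2
print(pairs, needed, pairs >= needed)
```

Under those assumptions the suggested range holds far more port pairs than the reported call volume needs, provided ports are released when calls end.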
By: Digium Subversion (svnbot) 2008-02-23 21:17:36.000-0600 Repository: asterisk Revision: 104076 _U team/seanbright/NoLossCDR-Redux/ U team/seanbright/NoLossCDR-Redux/CHANGES U team/seanbright/NoLossCDR-Redux/UPGRADE.txt U team/seanbright/NoLossCDR-Redux/channels/chan_sip.c U team/seanbright/NoLossCDR-Redux/channels/chan_zap.c U team/seanbright/NoLossCDR-Redux/configure U team/seanbright/NoLossCDR-Redux/configure.ac U team/seanbright/NoLossCDR-Redux/doc/manager_1_1.txt U team/seanbright/NoLossCDR-Redux/include/asterisk/autoconfig.h.in U team/seanbright/NoLossCDR-Redux/include/asterisk/manager.h U team/seanbright/NoLossCDR-Redux/main/manager.c U team/seanbright/NoLossCDR-Redux/res/res_agi.c U team/seanbright/NoLossCDR-Redux/res/res_config_pgsql.c U team/seanbright/NoLossCDR-Redux/utils/astman.c ------------------------------------------------------------------------ r104076 | seanbright | 2008-02-23 21:17:26 -0600 (Sat, 23 Feb 2008) | 142 lines Merged revisions 104014,104016,104019-104020,104024-104025,104028-104029,104031,104036,104038-104039,104045,104073-104074 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ................ r104014 | tilghman | 2008-02-21 00:21:39 -0500 (Thu, 21 Feb 2008) | 6 lines Ignore some more unused generated events. (closes issue ASTERISK-11487) Reported by: junky Patches: astman_events.diff uploaded by junky (license 177) ................ r104016 | kpfleming | 2008-02-21 09:44:04 -0500 (Thu, 21 Feb 2008) | 10 lines Merged revisions 104015 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r104015 | kpfleming | 2008-02-21 08:33:51 -0600 (Thu, 21 Feb 2008) | 2 lines reduce the likelihood that HTTP Manager session ids will consist of primarily '1' bits ........ ................ r104019 | file | 2008-02-21 11:44:57 -0500 (Thu, 21 Feb 2008) | 8 lines Disable epoll as it has caused more obscure issues then any of my previous code. 
I will continue to work on it in a separate branch to make it stable for a release and test it against the following issues. (closes issue ASTERISK-10770) Reported by: falves11 (closes issue ASTERISK-11131) Reported by: davevg (closes issue ASTERISK-10573) Reported by: falves11 ................ r104020 | mmichelson | 2008-02-21 11:46:37 -0500 (Thu, 21 Feb 2008) | 7 lines Don't print the fact that we are using dead mode in AGI if called from the 'h' extension since it is well-known that it will be running in dead mode. (closes issue ASTERISK-11491) Reported by: explidous ................ r104024 | dbailey | 2008-02-21 12:38:40 -0500 (Thu, 21 Feb 2008) | 4 lines Added configuration distinction between neon and fsk mwi detection Add the detection for neon MWI events got rid of extraneous handle_init_event call in monitor loop ................ r104025 | mmichelson | 2008-02-21 12:44:34 -0500 (Thu, 21 Feb 2008) | 4 lines Instead of a notice, make the message about a hung-up channel a debug message, and revert the original logic on the if statement. Thanks to Juggie for bringing this to my attention. ................ r104028 | mmichelson | 2008-02-21 16:09:11 -0500 (Thu, 21 Feb 2008) | 14 lines Blocked revisions 104026 via svnmerge ........ r104026 | mmichelson | 2008-02-21 14:12:38 -0600 (Thu, 21 Feb 2008) | 7 lines Remove an incorrect debug message. It reported that it had received a specific event and tried to report which event was received. What actually was happening was that it was reporting the number of bytes returned from a call to read(). Thanks to Jared Smith for bringing the issue up on IRC ........ ................ r104029 | mmichelson | 2008-02-21 16:09:54 -0500 (Thu, 21 Feb 2008) | 11 lines Blocked revisions 104027 via svnmerge ........ 
r104027 | mmichelson | 2008-02-21 15:05:42 -0600 (Thu, 21 Feb 2008) | 4 lines And as a followup to revision 104026, completely remove event-related calls from a section of code where we know there was no event to handle or get. ........ ................ r104031 | russell | 2008-02-21 16:27:24 -0500 (Thu, 21 Feb 2008) | 1 line fix a typo ................ r104036 | tilghman | 2008-02-22 17:39:21 -0500 (Fri, 22 Feb 2008) | 7 lines Allow database password to be NULL and several other cleanups. (closes issue ASTERISK-11493) Reported by: bukaj Patches: 20080222__bug12048.diff.txt uploaded by Corydon76 (license 14) Tested by: bukaj ................ r104038 | tilghman | 2008-02-22 17:48:18 -0500 (Fri, 22 Feb 2008) | 14 lines Merged revisions 104037 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r104037 | tilghman | 2008-02-22 16:45:14 -0600 (Fri, 22 Feb 2008) | 6 lines Backwards debug message. (closes issue ASTERISK-11496) Reported by: flefoll Patches: chan_sip.c.br14.patch_found-notfound uploaded by flefoll (license 244) ........ ................ r104039 | tilghman | 2008-02-22 17:55:35 -0500 (Fri, 22 Feb 2008) | 2 lines Move Originate to a separate privilege and require the additional System privilege to call out to a subshell. ................ r104045 | dbailey | 2008-02-22 18:56:55 -0500 (Fri, 22 Feb 2008) | 2 lines Add protection to chan_zap build when NEONMWI events are not defined ................ r104073 | murf | 2008-02-23 19:44:14 -0500 (Sat, 23 Feb 2008) | 1 line On a 64-bit machine, with dev-mode turned on, and pgsql installed, I get warnings that stops the compile. They are fixed now. ................ r104074 | murf | 2008-02-23 21:37:08 -0500 (Sat, 23 Feb 2008) | 1 line Enforce a space between function args as per code review. ................ 
------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=104076 By: Digium Subversion (svnbot) 2008-02-24 06:17:20.000-0600 Repository: asterisk Revision: 104077 _U team/group/multiparking/ U team/group/multiparking/CHANGES U team/group/multiparking/UPGRADE.txt U team/group/multiparking/channels/chan_sip.c U team/group/multiparking/channels/chan_zap.c U team/group/multiparking/configure U team/group/multiparking/configure.ac U team/group/multiparking/doc/manager_1_1.txt U team/group/multiparking/include/asterisk/autoconfig.h.in U team/group/multiparking/include/asterisk/manager.h U team/group/multiparking/main/manager.c U team/group/multiparking/res/res_agi.c U team/group/multiparking/res/res_config_pgsql.c U team/group/multiparking/utils/astman.c ------------------------------------------------------------------------ r104077 | mvanbaak | 2008-02-24 06:17:17 -0600 (Sun, 24 Feb 2008) | 142 lines Merged revisions 104014,104016,104019-104020,104024-104025,104028-104029,104031,104036,104038-104039,104045,104073-104074 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ................ r104014 | tilghman | 2008-02-21 06:21:39 +0100 (Thu, 21 Feb 2008) | 6 lines Ignore some more unused generated events. (closes issue ASTERISK-11487) Reported by: junky Patches: astman_events.diff uploaded by junky (license 177) ................ r104016 | kpfleming | 2008-02-21 15:44:04 +0100 (Thu, 21 Feb 2008) | 10 lines Merged revisions 104015 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r104015 | kpfleming | 2008-02-21 08:33:51 -0600 (Thu, 21 Feb 2008) | 2 lines reduce the likelihood that HTTP Manager session ids will consist of primarily '1' bits ........ ................ r104019 | file | 2008-02-21 17:44:57 +0100 (Thu, 21 Feb 2008) | 8 lines Disable epoll as it has caused more obscure issues then any of my previous code. 
I will continue to work on it in a separate branch to make it stable for a release and test it against the following issues. (closes issue ASTERISK-10770) Reported by: falves11 (closes issue ASTERISK-11131) Reported by: davevg (closes issue ASTERISK-10573) Reported by: falves11 ................ r104020 | mmichelson | 2008-02-21 17:46:37 +0100 (Thu, 21 Feb 2008) | 7 lines Don't print the fact that we are using dead mode in AGI if called from the 'h' extension since it is well-known that it will be running in dead mode. (closes issue ASTERISK-11491) Reported by: explidous ................ r104024 | dbailey | 2008-02-21 18:38:40 +0100 (Thu, 21 Feb 2008) | 4 lines Added configuration distinction between neon and fsk mwi detection Add the detection for neon MWI events got rid of extraneous handle_init_event call in monitor loop ................ r104025 | mmichelson | 2008-02-21 18:44:34 +0100 (Thu, 21 Feb 2008) | 4 lines Instead of a notice, make the message about a hung-up channel a debug message, and revert the original logic on the if statement. Thanks to Juggie for bringing this to my attention. ................ r104028 | mmichelson | 2008-02-21 22:09:11 +0100 (Thu, 21 Feb 2008) | 14 lines Blocked revisions 104026 via svnmerge ........ r104026 | mmichelson | 2008-02-21 14:12:38 -0600 (Thu, 21 Feb 2008) | 7 lines Remove an incorrect debug message. It reported that it had received a specific event and tried to report which event was received. What actually was happening was that it was reporting the number of bytes returned from a call to read(). Thanks to Jared Smith for bringing the issue up on IRC ........ ................ r104029 | mmichelson | 2008-02-21 22:09:54 +0100 (Thu, 21 Feb 2008) | 11 lines Blocked revisions 104027 via svnmerge ........ 
r104027 | mmichelson | 2008-02-21 15:05:42 -0600 (Thu, 21 Feb 2008) | 4 lines And as a followup to revision 104026, completely remove event-related calls from a section of code where we know there was no event to handle or get. ........ ................ r104031 | russell | 2008-02-21 22:27:24 +0100 (Thu, 21 Feb 2008) | 1 line fix a typo ................ r104036 | tilghman | 2008-02-22 23:39:21 +0100 (Fri, 22 Feb 2008) | 7 lines Allow database password to be NULL and several other cleanups. (closes issue ASTERISK-11493) Reported by: bukaj Patches: 20080222__bug12048.diff.txt uploaded by Corydon76 (license 14) Tested by: bukaj ................ r104038 | tilghman | 2008-02-22 23:48:18 +0100 (Fri, 22 Feb 2008) | 14 lines Merged revisions 104037 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r104037 | tilghman | 2008-02-22 16:45:14 -0600 (Fri, 22 Feb 2008) | 6 lines Backwards debug message. (closes issue ASTERISK-11496) Reported by: flefoll Patches: chan_sip.c.br14.patch_found-notfound uploaded by flefoll (license 244) ........ ................ r104039 | tilghman | 2008-02-22 23:55:35 +0100 (Fri, 22 Feb 2008) | 2 lines Move Originate to a separate privilege and require the additional System privilege to call out to a subshell. ................ r104045 | dbailey | 2008-02-23 00:56:55 +0100 (Sat, 23 Feb 2008) | 2 lines Add protection to chan_zap build when NEONMWI events are not defined ................ r104073 | murf | 2008-02-24 01:44:14 +0100 (Sun, 24 Feb 2008) | 1 line On a 64-bit machine, with dev-mode turned on, and pgsql installed, I get warnings that stops the compile. They are fixed now. ................ r104074 | murf | 2008-02-24 03:37:08 +0100 (Sun, 24 Feb 2008) | 1 line Enforce a space between function args as per code review. ................ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=104077 |