Summary: | ASTERISK-10937: Random seg faults... | ||
Reporter: | akron (akron) | Labels: | |
Date Opened: | 2007-11-30 06:22:41.000-0600 | Date Closed: | 2007-12-26 14:46:11.000-0600 |
Priority: | Minor | Regression? | No |
Status: | Closed/Complete | Components: | Core/ManagerInterface |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) ast_1415_valgrind.txt | |
Description: | We sometimes get seg faults on one of our configurations. We changed the hardware and it doesn't solve our problems.

****** ADDITIONAL INFORMATION ******

From Valgrind:

==5000== Thread 1:
==5000== Invalid read of size 4
==5000==    at 0x810B2CA: el_gets (read.c:254)
==5000==    by 0x806EEB7: main (asterisk.c:2982)
==5000== Address 0x4226614 is 68 bytes inside a block of size 776 free'd
==5000==    at 0x401BF6C: free (in /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so)
==5000==    by 0x806BF1E: quit_handler (asterisk.c:1276)
==5000==    by 0x806CD1C: monitor_sig_flags (asterisk.c:2531)
==5000==    by 0x80FB954: dummy_start (utils.c:843)
==5000==    by 0x402EF5A: pthread_start_thread (in /lib/libpthread-0.10.so)
==5000==    by 0x41BDBE9: clone (in /lib/libc-2.3.6.so)
==5000==
==5000== Invalid read of size 4
==5000==    at 0x810B2E9: el_gets (read.c:258)
==5000==    by 0x806EEB7: main (asterisk.c:2982)
==5000== Address 0x4226870 is 672 bytes inside a block of size 776 free'd
==5000==    at 0x401BF6C: free (in /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so)
==5000==    by 0x806BF1E: quit_handler (asterisk.c:1276)
==5000==    by 0x806CD1C: monitor_sig_flags (asterisk.c:2531)
==5000==    by 0x80FB954: dummy_start (utils.c:843)
==5000==    by 0x402EF5A: pthread_start_thread (in /lib/libpthread-0.10.so)
==5000==    by 0x41BDBE9: clone (in /lib/libc-2.3.6.so)
==5000==
==5000== Invalid read of size 1
==5000==    at 0x810B2F2: el_gets (read.c:258)
==5000==    by 0x806EEB7: main (asterisk.c:2982)
==5000== Address 0x6C is not stack'd, malloc'd or (recently) free'd
==5000==
==5000== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==5000== Access not within mapped region at address 0x6C
==5000==    at 0x810B2F2: el_gets (read.c:258)
==5000==    by 0x806EEB7: main (asterisk.c:2982)

From gdb:

(gdb) bt full
#0  0x041b451a in poll () from /lib/libc.so.6
No symbol table info available.
#1  0x08085053 in ast_waitfor_nandfds (c=0xbcdff690, n=0, fds=0x0, nfds=0, exception=0x0, outfd=0x0, ms=0xbcdff68c) at channel.c:1981
    kbrms = 500
    start = {tv_sec = 1196421845, tv_usec = 249662}
    res = -1126173232
    rms = 500
    x = 0
    y = 0
    max = 0
    sz = 168
    now = 0
    whentohangup = 0
    diff = 500
    winner = (struct ast_channel *) 0x0
    __PRETTY_FUNCTION__ = "ast_waitfor_nandfds"
#2  0x08085ea7 in ast_waitfor_n (c=0xa8, n=168, ms=0xa8) at channel.c:2043
No locals.
#3  0x08070972 in autoservice_run (ign=0x0) at autoservice.c:84
    chan = (struct ast_channel *) 0xa8
    as = (struct asent *) 0xbcdff690
    ms = 500
    mons = {0x47f9338, 0x0 <repeats 222 times>, 0x4034700, 0x0 <repeats 12 times>, 0xbcdffbe0, 0x0, 0x0, 0x4038ff4, 0xbcdffbe0, 0x814d9e8, 0xbcdffa78, 0x4030355, 0x4033cd1, 0x0, 0x0, 0x4e9d920, 0x4e9d888, 0x0, 0xbcdffaa8, 0x4038ff4, 0x814d9e8, 0x0, 0xbcdffaa8, 0x4030742}
    x = 0
    __PRETTY_FUNCTION__ = "autoservice_run"
#4  0x080fb955 in dummy_start (data=0x1f4) at utils.c:843
    _buffer = {__routine = 0x8067cd0 <ast_unregister_thread>, __arg = 0x91000f, __canceltype = -1126171928, __prev = 0x0}
    ret = (void *) 0x4e9d888
    a = {start_routine = 0x80708e0 <autoservice_run>, data = 0x0, name = 0x4e9d888 "autoservice_run started at [ 114] autoservice.c ast_autoservice_start()"}
#5  0x0402ef5b in pthread_start_thread () from /lib/libpthread.so.0
No symbol table info available.
#6  0x041bdbea in clone () from /lib/libc.so.6
No symbol table info available. | ||
Comments: | By: Joshua C. Colp (jcolp) 2007-11-30 08:34:19.000-0600
Can you please give 1.4.15 a try? The autoservice stuff was improved a bit. Also, what are the call flows like? Are calls getting parked?

By: Joshua C. Colp (jcolp) 2007-12-17 11:02:35.000-0600
Suspended due to lack of response.

By: akron (akron) 2007-12-17 11:17:20.000-0600
We switched to 1.4.15 two days ago and it crashed after two days of running. We are now running Asterisk 1.4.15 under the Valgrind debugger. As for 1.4.14: about 1.5 weeks ago we changed our network configuration so that the server is no longer behind NAT, and it has not crashed since.

Our simplified configuration for the setup that crashes:

NetworkSwitch with ports to:
--[eth1-192.168.1.1]NatLinuxRouter -> internet
--[eth0-192.168.1.2]Asterisk
--[IpPhone1-192.168.1.5]
--[IpPhone2-192.168.1.6]

Maybe that kind of configuration is problematic.

Good configuration [more than a week without a crash]:

NetworkSwitch with ports to:
|--[eth1-192.168.1.2]Asterisk-[eth0-222.222.222.222]-Internet
|--[IpPhone1-192.168.1.5]
|--[IpPhone2-192.168.1.6]

By: akron (akron) 2007-12-18 09:00:21.000-0600
Next segfault:

Core was generated by `/usr/sbin/asterisk -vvvvvv -dddddd -g -c'.
Program terminated with signal 6, Aborted.
#0  0x40117c81 in kill () from /lib/libc.so.6
(gdb) full bt
Undefined command: "full". Try "help".
(gdb) bt full
#0  0x40117c81 in kill () from /lib/libc.so.6
No symbol table info available.
#1  0x4002a4a1 in pthread_kill () from /lib/libpthread.so.0
No symbol table info available.
#2  0x4002a87b in raise () from /lib/libpthread.so.0
No symbol table info available.
#3  0x401178f8 in raise () from /lib/libc.so.6
No symbol table info available.
#4  0x40118f00 in abort () from /lib/libc.so.6
No symbol table info available.
#5  0x4014b6ce in __libc_message () from /lib/libc.so.6
No symbol table info available.
#6  0x40151518 in malloc_consolidate () from /lib/libc.so.6
No symbol table info available.
#7  0x40152422 in _int_malloc () from /lib/libc.so.6
No symbol table info available.
#8  0x40153fa2 in calloc () from /lib/libc.so.6
No symbol table info available.
#9  0x40605eb0 in sip_alloc (callid=0x0, sin=0x0, useglobal_nat=0, intended_method=3) at /home/jancio/asterisk-1.4.15/include/asterisk/utils.h:359
    __PRETTY_FUNCTION__ = "sip_alloc"
#10 0x40628daa in sip_poke_peer (peer=0x81aa870) at chan_sip.c:15601
    xmitres = 0
    __PRETTY_FUNCTION__ = "sip_poke_peer"
#11 0x40629a79 in sip_poke_peer_s (data=0x0) at chan_sip.c:7813
No locals.
#12 0x080ef2c8 in ast_sched_runq (con=0x81991a0) at sched.c:359
    cur = (struct sched *) 0x81ef680
    tv = {tv_sec = 135934256, tv_usec = -1098909392}
    numevents = 0
    res = 135934256
#13 0x4064eb9e in do_monitor (data=0x0) at chan_sip.c:15492
    res = 0
    sip = (struct sip_pvt *) 0x0
    peer = (struct sip_peer *) 0x0
    t = 135672448
    fastrestart = 0
    lastpeernum = -1
---Type <return> to continue, or q <return> to quit---
    curpeernum = 11
    reloading = 0
    __PRETTY_FUNCTION__ = "do_monitor"
#14 0x080fc2c5 in dummy_start (data=0x40030ff4) at utils.c:843
    _buffer = {__routine = 0x8067d20 <ast_unregister_thread>, __arg = 0x2800a, __canceltype = -1098908952, __prev = 0x0}
    ret = (void *) 0x81a38b8
    a = {start_routine = 0x4064e630 <do_monitor>, data = 0x0, name = 0x81a38b8 "do_monitor", ' ' <repeats 11 times>, "started at [15546] chan_sip.c restart_monitor()"}
#15 0x40026f5b in pthread_start_thread () from /lib/libpthread.so.0
No symbol table info available.
#16 0x401b6bea in clone () from /lib/libc.so.6
No symbol table info available.

The Valgrind debug output is in the attachment.

By: akron (akron) 2007-12-18 09:05:49.000-0600
Last lines from the log:

== Connect attempt from '192.168.100.2' unable to authenticate
*** glibc detected *** corrupted double-linked list: 0x08215080 ***
Aborted

By: Tilghman Lesher (tilghman) 2007-12-24 10:45:49.000-0600
Please try the patch uploaded to ASTERISK-11083, as it appears to be the same issue.

By: Digium Subversion (svnbot) 2007-12-26 14:40:19.000-0600
Repository: asterisk
Revision: 94808

U branches/1.4/main/manager.c

------------------------------------------------------------------------
r94808 | tilghman | 2007-12-26 14:40:18 -0600 (Wed, 26 Dec 2007) | 6 lines

Workaround for what is probably a glibc bug (but we'll see this crop up again and again, if we don't add the workaround).

Reported by: rolek
Patch by: tilghman
(Closes issue ASTERISK-11083, closes issue ASTERISK-10937)
------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=94808

By: Digium Subversion (svnbot) 2007-12-26 14:46:11.000-0600
Repository: asterisk
Revision: 94809

_U trunk/
U trunk/main/manager.c

------------------------------------------------------------------------
r94809 | tilghman | 2007-12-26 14:46:10 -0600 (Wed, 26 Dec 2007) | 14 lines

Merged revisions 94808 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r94808 | tilghman | 2007-12-26 14:43:38 -0600 (Wed, 26 Dec 2007) | 6 lines

Workaround for what is probably a glibc bug (but we'll see this crop up again and again, if we don't add the workaround).

Reported by: rolek
Patch by: tilghman
(Closes issue ASTERISK-11083, closes issue ASTERISK-10937)
........
------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=94809 |
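Note on the Valgrind trace in the description: the "Invalid read ... inside a block of size 776 free'd" entries show the main thread still reading, from el_gets, a block that another thread had already freed in quit_handler. The sketch below illustrates that general class of problem and one common guard against it (free under a lock, clear the pointer, and have readers check it). It is only an illustration under stated assumptions: `struct console_state`, `console_init`, `console_shutdown` and `console_read_prompt` are hypothetical names, not Asterisk or libedit code, and this is not the workaround committed in r94808, whose contents are not shown in this issue.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for per-console editor state; not an Asterisk type. */
struct console_state {
    char prompt[32];
};

static pthread_mutex_t console_lock = PTHREAD_MUTEX_INITIALIZER;
static struct console_state *console = NULL;

/* Allocate the shared state. Returns 0 on success, -1 on allocation failure. */
int console_init(const char *prompt)
{
    struct console_state *cs = calloc(1, sizeof(*cs));

    if (!cs)
        return -1;
    /* calloc zeroed the buffer, so copying at most size-1 bytes keeps it terminated. */
    strncpy(cs->prompt, prompt, sizeof(cs->prompt) - 1);

    pthread_mutex_lock(&console_lock);
    free(console);          /* drop any previous state */
    console = cs;
    pthread_mutex_unlock(&console_lock);
    return 0;
}

/* Shutdown path: free under the lock and clear the pointer so that
 * no other thread can keep using the freed block. Safe to call twice. */
void console_shutdown(void)
{
    pthread_mutex_lock(&console_lock);
    free(console);
    console = NULL;
    pthread_mutex_unlock(&console_lock);
}

/* Reader path: copy the prompt out while holding the lock.
 * Returns 0 if copied, -1 if the state has already been torn down. */
int console_read_prompt(char *out, size_t outlen)
{
    int res = -1;

    assert(out != NULL && outlen > 0);

    pthread_mutex_lock(&console_lock);
    if (console) {
        strncpy(out, console->prompt, outlen - 1);
        out[outlen - 1] = '\0';
        res = 0;
    }
    pthread_mutex_unlock(&console_lock);
    return res;
}
```

With this pattern a reader that runs after `console_shutdown()` gets -1 instead of touching freed memory. A thread that blocks for a long time inside the reader (as a line editor waiting for input would) needs a different arrangement, for example having the shutdown path signal the reader and wait for it to exit before freeing.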