Summary: | ASTERISK-19295: Segfault on "sip show peers" on Solaris | ||
Reporter: | Ben Klang (bklang) | Labels: | |
Date Opened: | 2012-02-02 09:39:35.000-0600 | Date Closed: | 2012-02-13 12:32:24.000-0600 |
Priority: | Critical | Regression? | Yes |
Status: | Closed/Complete | Components: | Channels/chan_sip/General |
Versions: | SVN | Frequency of Occurrence | Constant |
Related Issues: | |||
Environment: | OpenSolaris 2009.06 i386 | Attachments: | |
Description: | This morning I updated to the latest Asterisk 10 SVN branch. After restarting Asterisk, any time I run "sip show peers" Asterisk immediately crashes with a segfault. This appears to be another issue (as I've reported in several other cases) where a printf format string is being called with a null value. But this time it is wrapped up in some formatting code so I could not find the exact line. Backtrace: (gdb) bt full #0 0xfeca47a0 in countbytes () from /usr/lib/libc.so.1 No symbol table info available. #1 0xfecf0793 in _ndoprnt () from /usr/lib/libc.so.1 No symbol table info available. #2 0xfecf2dee in snprintf () from /usr/lib/libc.so.1 No symbol table info available. #3 0xfaa7eb6c in _sip_show_peers (fd=24, total=0x0, s=0x0, m=0x0, argc=144931016, argv=0xfa28d8d8) at netsock2.h:241 status = "Unmonitored\000\000\000\000\000\000\000\000" srch = "gmom", ' ' <repeats 22 times>, "(Unspecified)", ' ' <repeats 28 times>, "D N A 67.216.45.45 OK (54 ms)", ' ' <repeats 35 times>, "\n\000\000\000\000\000\000??\f?(?\000\000\000\000????\000\000\000\000\000\000\000\000H !\b\001\000\000\0008?(???(?5?(?\000\000\000\000\034\000\000\000\n\000\000\000??(?\000\000\000\000\000\000\000\000\"\000\000\000'8\033\b0\000\000\000\001", '\0' <repeats 11 times>, " \000\000\000 \000\000\000 \000\000\000 \000\000\000"... pstatus = 0 '\0' regexbuf = {re_nsub = 145037449, re_comp = 0x89c6429, re_cflags = -19184512, re_erroff = 1, re_len = 4294967295, re_sc = 0x1} havepattern = 0 peer = (struct sip_peer *) 0x8a39d58 i = {c = 0x0, flags = 0, bucket = 563, c_version = 27, obj = 0x0, version = 0} name = "gmom\000hi1b/a7265sWDeTl6zYge\000?:??\0002<?\000\000\000\000\000\000\000@\000\000\000@??\037\b\000\000????(?s???\0002<?\000\000\000@\000\000\000\000^???\0002<?\000\000????(?!\000\000\000?\020??\000\000\000\001\000\000\000\000\")?? 
v??\000\000????(?\226???\000\000\000\000\000\000??\021?(?\020?(?\0002<?\000\000\000\000?v??\030\000\000\0000?(?\020?(?l?(?:??0?(?\020?(?\001\000\000\000\001\000\000\000?\023\037\b\000\000??"... total_peers = 26 peers_mon_online = 9 peers_mon_offline = 2 peers_unmon_offline = 1 peers_unmon_online = 1 id = 0x0 idtext = '\0' <repeats 255 times> realtimepeers = 1 objcount = 144940376 k = 12 __PRETTY_FUNCTION__ = "_sip_show_peers" #4 0xfaa7f201 in sip_show_peers (e=0xfab0fa08, cmd=0, a=0x0) at chan_sip.c:17195 No locals. #5 0x080d4b65 in ast_cli_command_full (uid=-2, gid=-2, fd=24, s=0xfa28da14 "sip show peers") at cli.c:2502 args = {0xfab0fa08 "\200\030?\b\204\030?\b\211\030?\b", 0x89c6420 "sip", 0x89c6424 "show", 0x89c6429 "peers", 0x0, 0x8d5b8d7 "", 0x0 <repeats 47 times>, 0xfefe5026 "[\201?\216g\001", 0x806d0fc "ite", 0xfec8fc1e "ite", 0x0, 0x0, 0x0, 0x0, 0x0, 0x84e0 <Address 0x84e0 out of bounds>, 0x84e <Address 0x84e out of bounds>, 0xfeffb7b4 "??\003", 0xfa28da24 "nt"} e = (struct ast_cli_entry *) 0xfab0fa08 x = 3 duplicate = 0x89c6420 "sip" tmp = "sip show peers\000ent\000ast 0\0000", '\0' <repeats 38 times> retval = 0x0 a = {fd = 24, argc = 3, argv = 0xfa28d8d8, line = 0x0, word = 0x0, pos = 0, n = 0} __PRETTY_FUNCTION__ = "ast_cli_command_full" #6 0x080d4cf6 in ast_cli_command_multiple_full (uid=-2, gid=-2, fd=24, size=15, s=0xfa28dc74 "sip show peers") at cli.c:2525 cmd = "sip show peers\000ent\000ast 0\0000\000??(??(?\000\000\000\000?D??\000\000\000\001\000\000\000\000\")??", '\0' <repeats 12 times>, "????\000\000\000\000\000\000??\224?(?\037D??\0002<?\000\000\000\000??(?~C??\020", '\0' <repeats 11 times>, "\004\000\000\000?D??\000\000????(??W???D??\000\000\000\000\000\000\000\000?V????\006\b\030\000??H\005??\0002<?\220~\000\003\003\000\000??\005\b?????D??\000\000??\024?(?1???\0002<"... 
x = 15 y = 0 count = 0 #7 0x080930be in netconsole (vconsole=0x8212260) at asterisk.c:1294 hostname = "grant", '\0' <repeats 250 times> tmp = "sip show peers\000\000tleast 0\000logger mute silent\000\000\032\000\000?\001\000\000??????(??b??I\211?????391\033B????\001??37mh\000\000\000\000\f\020??<\b??\230\002??????r??[0m: Peer 'v\204?(?\030\000v?\0002<?\000\000\000\000\200\000\000\000H\005??\030\000v?\000\000\000\000\001\"\000\0001ms l?(???(?P?(?t?(????\004E??\000\000\000\000\000\000\000\000????"... res = 0 fds = {{fd = 24, events = 1, revents = 1}, {fd = 28, events = 1, revents = 0}} __PRETTY_FUNCTION__ = "netconsole" #8 0x081885b1 in dummy_start (data=0x0) at utils.c:1010 _cleanup_info = {pthread_cleanup_pad = {0, 4196982732, 134813588, 41}} ret = (void *) 0x8092e84 a = {start_routine = 0x8092e84 <netconsole>, data = 0x8212260, name = 0x8ccd118 "netconsole", ' ' <repeats 11 times>, "started at [ 1372] asterisk.c listener()"} #9 0xfed2cd66 in _thrp_setup () from /usr/lib/libc.so.1 No symbol table info available. #10 0xfed2cff0 in __csigsetjmp () from /usr/lib/libc.so.1 No symbol table info available. #11 0x00000000 in ?? () No symbol table info available. | ||
Comments: | By: Walter Doekes (wdoekes) 2012-02-07 02:20:01.459-0600
Interesting. I'm not sure why it would be in the netsock2.h code, where there is no snprintf. Also, there are no tmp_host and tmp_port in scope, while they should be in the same scope as `srch` (which incidentally is unused, see https://reviewboard.asterisk.org/r/1696/ ). I was thinking that a null tmp_port could cause this (this used to be an int in asterisk 1.8), but the code doesn't allow it to be null. You could try to remove the unused code by applying the patch from https://reviewboard.asterisk.org/r/1696/ . Next, I'd be curious what the values of these missing locals are. You could attach gdb and step through the function. (break _sip_show_peers; cont; next; next; next; print *tmp_host; cont)
By: Ben Klang (bklang) 2012-02-10 12:55:10.652-0600
I've tried running the steps you suggested via gdb, but was not able to print anything related to *tmp_host:
0xfed31e75 in __read () from /usr/lib/libc.so.1
(gdb) break _sip_show_peers
Breakpoint 1 at 0xfaace660: file chan_sip.c, line 17227.
(gdb) cont
Continuing.
[New LWP 31 ]
[New LWP 1331 ]
[LWP 1331 exited]
[New LWP 3 ]
[New LWP 1332 ]
[New Thread 1332 (LWP 1332)]
[Switching to Thread 1332 (LWP 1332)]
Breakpoint 1, _sip_show_peers (fd=0, total=0xfa17957c, s=0x0, m=0x0, argc=-99116924, argv=0xfa1798d8) at chan_sip.c:17227
17227 char idtext[256] = "";
(gdb) next
17210 {
(gdb) next
17227 char idtext[256] = "";
(gdb) next
17210 {
(gdb) p *tmp_host;
No symbol "tmp_host" in current context.
(gdb) next
17227 char idtext[256] = "";
(gdb) p *tmp_host;
No symbol "tmp_host" in current context.
(gdb) next
17210 {
(gdb) p *tmp_host;
No symbol "tmp_host" in current context.
(gdb) cont
Continuing.
[New LWP 28 ]
[New LWP 1333 ]
[New LWP 1334 ]
Program received signal SIGSEGV, Segmentation fault.
0xfeca47a0 in countbytes () from /usr/lib/libc.so.1
(gdb) p *tmp_host;
No symbol "tmp_host" in current context.
By: Walter Doekes (wdoekes) 2012-02-10 13:47:56.453-0600
Are you running latest 10.x now? I see tmp_host/tmp_port were added only recently.
By: Ben Klang (bklang) 2012-02-10 13:53:51.265-0600
No, sorry, still running the same build from the original report. I'll update tonight when the system isn't in use.
By: Ben Klang (bklang) 2012-02-13 12:32:10.510-0600
Since updating to the latest 10 SVN branch this morning I cannot reproduce the issue. At this point I'm inclined to think that I got a bad checkout that was subsequently fixed. Should it reoccur, I will reopen the ticket. |
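The failure mode suspected in the description, a NULL pointer reaching a `%s` conversion, can be illustrated with a minimal C sketch. This is not the code from chan_sip.c: `str_or` and `format_peer_row` are hypothetical names invented for illustration, and the fallback strings are assumptions. The underlying fact is that passing NULL for `%s` is undefined behavior in C; some C libraries happen to print "(null)", while others, as the reporter suspects of Solaris libc, may crash inside the formatting routine. The sketch substitutes a fallback before formatting, similar in spirit to Asterisk's `S_OR` macro.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not an Asterisk API): return `fallback` when `s`
 * is NULL or empty, otherwise return `s` unchanged.
 */
static const char *str_or(const char *s, const char *fallback)
{
    return (s != NULL && *s != '\0') ? s : fallback;
}

/*
 * Hypothetical stand-in for formatting one row of peer output.
 * Every pointer is routed through str_or() first, so snprintf() never
 * receives NULL for a %s conversion (which is undefined behavior).
 * Returns the value of snprintf(): the length the full row needs.
 */
static int format_peer_row(char *buf, size_t len,
                           const char *name, const char *host)
{
    return snprintf(buf, len, "%-10s %-15s",
                    str_or(name, "(Unnamed)"),
                    str_or(host, "(Unspecified)"));
}
```

Calling `format_peer_row(row, sizeof(row), "gmom", NULL)` with a 64-byte `row` buffer would fill the host column with the fallback text instead of handing NULL to the C library.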