
Summary: ASTERISK-14176: [patch] 'core stop when convenient' causes segfault
Reporter: Kevin Otte (kjotte)
Labels:
Date Opened: 2009-12-18 21:17:43.000-0600
Date Closed: 2010-02-19 13:11:47.000-0600
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) asterisk-1.6.1.14-graceful-restart-segfault.patch
( 1) asterisk-1.6.2.2-graceful-restart-segfault.patch
( 2) gdb.txt
Description: Freshly compiled 1.6.2.0 segfaults every time 'core stop when convenient' is executed.  Also intermittent crashes at other times, but cannot stably reproduce.

****** ADDITIONAL INFORMATION ******

root@avalon:~# gdb /usr/sbin/asterisk core.29559
GNU gdb 6.8-debian
...
Core was generated by `asterisk -vvvgc'.
Program terminated with signal 11, Segmentation fault.
...
#0  0x0814dd01 in ast_timer_ack (handle=0xf, quantity=1) at timing.c:169
169 handle->holder->iface->timer_ack(handle->fd, quantity);
(gdb) set pagination off
(gdb) bt full
#0  0x0814dd01 in ast_timer_ack (handle=0xf, quantity=1) at timing.c:169
No locals.
#1  0xb73e1e3b in timing_read (id=0x9062a88, fd=15, events=32, cbdata=0x0) at chan_iax2.c:8793
res = <value optimized out>
processed = <value optimized out>
totalcalls = <value optimized out>
tpeer = <value optimized out>
drop = <value optimized out>
now = {tv_sec = 1261191966, tv_usec = 357094}
__PRETTY_FUNCTION__ = "timing_read"
#2  0x080e5621 in ast_io_wait (ioc=0x905f070, howlong=-1) at io.c:288
res = 1
x = 1
origcnt = 2
#3  0xb73e1c2d in network_thread (ignore=0x0) at chan_iax2.c:11692
res = <value optimized out>
count = -1211639770
wakeup = -1
f = (struct iax_frame *) 0x0
__PRETTY_FUNCTION__ = "network_thread"
#4  0x08155e6b in dummy_start (data=0x906ce58) at utils.c:968
__cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {151445840, 0, 0, -1232821304, 491184275, 1052551662}, __mask_was_saved = 0}}, __pad = {0xb684a490, 0x0, 0x0, 0x0}}
not_first_call = <value optimized out>
ret = <value optimized out>
#5  0xb7abd4c0 in start_thread () from /lib/i686/cmov/libpthread.so.0
No symbol table info available.
#6  0xb7ced6de in clone () from /lib/i686/cmov/libc.so.6
No symbol table info available.
Comments:

By: Matthias Nick (mnick) 2009-12-19 09:50:25.000-0600

Be sure you have DONT_OPTIMIZE enabled in menuselect within the Compiler Flags section, then 'make install' after enabling, reproduce the crash, and then execute the instructions in doc/backtrace.txt.

When complete, attach that file to this issue report. Thanks!

By: Clod Patry (junky) 2009-12-20 16:22:10.000-0600

I get segfault too on "core restart when convenient", but it looks like:
(gdb) bt
#0  0x000000000053abef in term_beep (el=0xc50eb0) at term.c:865
#1  0x000000000053c336 in ?? ()
#2  0x0000000000441cd4 in ?? ()
#3  0x00007fe2ef72b5a6 in __libc_start_main () from /lib/libc.so.6
#4  0x000000000041e139 in ?? () at ../sysdeps/x86_64/elf/start.S:113
#5  0x00007fff4f706928 in ?? ()
#6  0x000000000000001c in ?? ()
[...]

By: Nic Bellamy (nic_bellamy) 2009-12-20 17:39:17.000-0600

I have the same thing on 1.6.1.11/1.6.1.12, but hadn't filed a bug report yet as I was still doing some digging on my own when I saw this.

My digging through the source appears to show that chan_iax2 still has an open timer at the point res_timing_dahdi unregisters it; a run through valgrind confirms this:

==5049== Thread 28:
==5049== Invalid read of size 4
==5049==    at 0x8144F0C: ast_timer_ack (timing.c:169)
==5049==    by 0x1C220A00: ??? (chan_iax2.c:8690)
==5049==    by 0x80DEF16: ast_io_wait (io.c:288)
==5049==    by 0x1C22E683: ??? (chan_iax2.c:11615)
==5049==    by 0x814E47E: dummy_start (utils.c:968)
==5049==    by 0x1BB87E50: pthread_start_thread (manager.c:309)
==5049==    by 0x1BB1C8A9: clone (clone.S:102)
==5049==  Address 0x1BC866E8 is 8 bytes inside a block of size 12 free'd
==5049==    at 0x1B904B04: free (vg_replace_malloc.c:152)
==5049==    by 0x8144D24: ast_unregister_timing_interface (timing.c:111)
==5049==    by 0x1F91303E: ??? (res_timing_dahdi.c:196)
==5049==    by 0x80E1169: ast_module_shutdown (loader.c:459)
==5049==    by 0x80726F4: quit_handler (asterisk.c:1390)
==5049==    by 0x8072E48: handle_stop_when_convenient (asterisk.c:1665)
==5049==    by 0x8072E84: handle_stop_when_convenient_deprecated (asterisk.c:1671)
==5049==    by 0x809EF9C: ast_cli_command (cli.c:1888)
==5049==    by 0x8072A6E: consolehandler (asterisk.c:1536)
==5049==    by 0x807770A: main (asterisk.c:3543)

I can't see any form of refcounting in main/timing.c - so ast_unregister_timing_interface() happily runs off and frees a timer that still has users.
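
To make the missing-refcount point concrete, here is a minimal sketch of what reference counting on a timing interface holder could look like. It is not the code in main/timing.c: every name below (timing_holder, register_iface, timer_open, demo_ack and so on) is hypothetical and chosen only for illustration. The idea is that each open timer holds a reference, so unregistering the interface drops only the registration's own reference and the holder is freed when the last timer closes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the structures in main/timing.c. */
struct timing_iface {
    int (*timer_ack)(int fd, unsigned int quantity);
};

struct timing_holder {
    struct timing_iface *iface;
    int refcount;   /* one for the registration, plus one per open timer */
    int registered; /* cleared on unregister so no new timers can open */
};

struct timer {
    int fd;
    struct timing_holder *holder;
};

/* Illustrative callback only: reports how many ticks were acknowledged. */
int demo_ack(int fd, unsigned int quantity)
{
    (void)fd;
    return (int)quantity;
}

static void holder_unref(struct timing_holder *h)
{
    if (--h->refcount == 0) {
        free(h);
    }
}

struct timing_holder *register_iface(struct timing_iface *iface)
{
    struct timing_holder *h = calloc(1, sizeof(*h));
    if (!h) {
        return NULL;
    }
    h->iface = iface;
    h->refcount = 1;
    h->registered = 1;
    return h;
}

struct timer *timer_open(struct timing_holder *h, int fd)
{
    struct timer *t;
    if (!h->registered) {
        return NULL; /* interface is going away: refuse new users */
    }
    t = calloc(1, sizeof(*t));
    if (!t) {
        return NULL;
    }
    t->fd = fd;
    t->holder = h;
    h->refcount++;
    return t;
}

int timer_ack(struct timer *t, unsigned int quantity)
{
    assert(t && t->holder);
    return t->holder->iface->timer_ack(t->fd, quantity);
}

void timer_close(struct timer *t)
{
    holder_unref(t->holder);
    free(t);
}

/* Drops the registration reference; returns how many timers are still open. */
int unregister_iface(struct timing_holder *h)
{
    int still_open = h->refcount - 1;
    h->registered = 0;
    holder_unref(h);
    return still_open;
}
```

With this shape, a timer opened before unregister_iface() can still call timer_ack() without touching freed memory. Note the sketch only protects the holder's lifetime; it does nothing about the module's code being unloaded underneath the function pointer, which is a separate ordering problem.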

By: Nic Bellamy (nic_bellamy) 2009-12-20 18:13:27.000-0600

Looks like this is fixed in ASTERISK-14979 - but this needs backporting to 1.6.1 and 1.6.2 (and possibly other branches).

By: Nic Bellamy (nic_bellamy) 2009-12-21 14:17:23.000-0600

I can confirm that pulling the main/loader.c patch from r228798 fixes this crash for me on 1.6.1.12.

See changes in http://svnview.digium.com/svn/asterisk/trunk/main/loader.c?r1=228798&r2=228797&pathrev=228798

By: frawd (frawd) 2010-01-14 05:47:57.000-0600

I can confirm that too, same bug, same backtrace, and same solution for 1.6.2.1-rc1.

By: Ernesto Ongaro (ernestoongaro) 2010-02-09 17:26:48.000-0600

Confirmed in Asterisk 1.4.26 - unloading the iax2 module causes the crash as well as restart when convenient. Need the patch from 0016062 back-ported to 1.4

By: Tony Vroon (chainsaw) 2010-02-11 10:13:33.000-0600

Backported to 1.6.2 & 1.6.1

By: frawd (frawd) 2010-02-11 10:39:05.000-0600

Thanks, just in time to make it in 1.6.2.3!

By: Tony Vroon (chainsaw) 2010-02-19 08:51:48.000-0600

Could this block 1.6.2.3? Applying the attached patch solves an intermittent memory corruption in the SIP register address for me (the corruption manifests as an OR with 0x40 on the 11th, 12th or 13th character).

By: Digium Subversion (svnbot) 2010-02-19 13:04:58.000-0600

Repository: asterisk
Revision: 248008

_U  branches/1.6.0/
U   branches/1.6.0/channels/chan_console.c
U   branches/1.6.0/main/loader.c

------------------------------------------------------------------------
r248008 | tilghman | 2010-02-19 13:04:57 -0600 (Fri, 19 Feb 2010) | 20 lines

Merged revisions 228798 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

(closes issue ASTERISK-14176)
Reported by: kjotte

........
 r228798 | tilghman | 2009-11-09 01:37:52 -0600 (Mon, 09 Nov 2009) | 14 lines
 
 Fix various problems detected with Valgrind.
  * chan_console accessed pvts after deallocation.
  * The module loader did not check usecount on shutdown, which led to chan_iax2
  reading a timer that was already unloaded.
 (closes issue ASTERISK-14979)
  Reported by: alexanderheinz
  Patches:
        20091109__issue16062.diff.txt uploaded by tilghman (license 14)
  Tested by: tilghman
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=248008
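
The log message above says the module loader did not check usecount on shutdown. As a rough sketch of that idea, and not the text of r228798, a shutdown routine can make repeated passes that unload only modules nobody is using, and fall back to a forced unload only once a pass makes no progress. The struct layout and the names shutdown_modules, release_dependency and depends_on are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical module record; not the real struct ast_module. */
struct module {
    const char *name;
    int usecount;              /* how many users still hold this module */
    int unloaded;
    struct module *depends_on; /* module this one holds a reference to */
    void (*unload)(struct module *self);
};

/* Illustrative unload hook: give back the reference held on a dependency. */
void release_dependency(struct module *self)
{
    if (self->depends_on) {
        self->depends_on->usecount--;
    }
}

/*
 * Unload all modules, preferring those with usecount == 0.
 * Returns how many modules had to be unloaded while still in use.
 */
int shutdown_modules(struct module *mods, size_t n)
{
    size_t remaining = 0;
    int final = 0;
    int forced = 0;

    for (size_t i = 0; i < n; i++) {
        if (!mods[i].unloaded) {
            remaining++;
        }
    }

    while (remaining > 0) {
        int changed = 0;

        for (size_t i = 0; i < n; i++) {
            struct module *m = &mods[i];

            if (m->unloaded) {
                continue;
            }
            if (!final && m->usecount > 0) {
                continue; /* still in use: try again on a later pass */
            }
            if (m->usecount > 0) {
                forced++;
            }
            if (m->unload) {
                m->unload(m);
            }
            m->unloaded = 1;
            remaining--;
            changed = 1;
        }

        if (!changed) {
            final = 1; /* no progress: next pass ignores usecounts */
        }
    }

    return forced;
}
```

In the scenario from this issue, a timing module with usecount 1 would be skipped on the first pass, the channel driver holding the timer would unload and release it, and the timing module would then go on the second pass with nothing left reading its timer.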

By: Digium Subversion (svnbot) 2010-02-19 13:05:35.000-0600

Repository: asterisk
Revision: 248009

_U  branches/1.6.1/
U   branches/1.6.1/channels/chan_console.c
U   branches/1.6.1/main/loader.c

------------------------------------------------------------------------
r248009 | tilghman | 2010-02-19 13:05:35 -0600 (Fri, 19 Feb 2010) | 20 lines

Merged revisions 228798 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

(closes issue ASTERISK-14176)
Reported by: kjotte

........
 r228798 | tilghman | 2009-11-09 01:37:52 -0600 (Mon, 09 Nov 2009) | 14 lines
 
 Fix various problems detected with Valgrind.
  * chan_console accessed pvts after deallocation.
  * The module loader did not check usecount on shutdown, which led to chan_iax2
  reading a timer that was already unloaded.
 (closes issue ASTERISK-14979)
  Reported by: alexanderheinz
  Patches:
        20091109__issue16062.diff.txt uploaded by tilghman (license 14)
  Tested by: tilghman
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=248009

By: Digium Subversion (svnbot) 2010-02-19 13:11:46.000-0600

Repository: asterisk
Revision: 248012

_U  branches/1.4/
U   branches/1.4/main/loader.c

------------------------------------------------------------------------
r248012 | tilghman | 2010-02-19 13:11:45 -0600 (Fri, 19 Feb 2010) | 5 lines

Backport crash fix from trunk to 1.4, whereby 'core show gracefully' could crash Asterisk.

(closes issue ASTERISK-14176)
Reported by: kjotte

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=248012