Summary: ASTERISK-09426: Crash on Channel Hangup
Reporter: coredump (coredump)
Labels:
Date Opened: 2007-05-11 13:05:52
Date Closed: 2011-06-07 14:00:59
Versions:
Frequency of
Description: Asterisk 1.2.17 crashes while hanging up a channel, freeing a bad pointer without checking.


(gdb) bt
#0  0xb7e34027 in raise () from /lib/tls/libc.so.6
#1  0xb7e35747 in abort () from /lib/tls/libc.so.6
#2  0xb7e67579 in __libc_message () from /lib/tls/libc.so.6
#3  0xb7e6ffd6 in malloc_printerr () from /lib/tls/libc.so.6
#4  0xb7e6ecbd in _int_free () from /lib/tls/libc.so.6
#5  0xb7e6da2b in free () from /lib/tls/libc.so.6
#6  0x080604bd in ast_channel_free (chan=0xb71883d8) at channel.c:955
#7  0x08060e90 in ast_hangup (chan=0xb5b49848) at channel.c:1390
#8  0x0809005f in __ast_pbx_run (c=0xb5b49848) at pbx.c:2487
#9  0x08090a8f in pbx_thread (data=0x0) at pbx.c:2537
#10 0xb7fd40fb in start_thread () from /lib/tls/libpthread.so.0
#11 0xb7ec397e in clone () from /lib/tls/libc.so.6

Line 955 of channel.c:

       /* loop over the variables list, freeing all data and deleting list items */
       /* no need to lock the list, as the channel is already locked */

       while ((vardata = AST_LIST_REMOVE_HEAD(headp, entries)))

It appears that the vardata list is corrupted, so freeing the pointer without checking it causes the crash.
Comments:
By: Joshua C. Colp (jcolp) 2007-05-14 11:36:41
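
For readers unfamiliar with the pattern, the loop quoted above repeatedly removes the head of a list and frees it until the list is empty. The following is a minimal standalone sketch in plain C, not the Asterisk code: `struct var_node`, `dup_string`, `list_push`, `list_remove_head` and `free_all_vars` are hypothetical names made up for illustration, and the real `AST_LIST_REMOVE_HEAD` macro works on Asterisk's own list types.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type standing in for a channel variable. */
struct var_node {
    char *name;
    char *value;
    struct var_node *next;
};

/* Hypothetical helper: heap-allocate a copy of a string. */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Hypothetical helper: push a variable on the front of the list.
 * Returns 0 on success, -1 on allocation failure. */
static int list_push(struct var_node **head, const char *name, const char *value)
{
    struct var_node *n = calloc(1, sizeof(*n));
    if (!n)
        return -1;
    n->name = dup_string(name);
    n->value = dup_string(value);
    if (!n->name || !n->value) {
        free(n->name);   /* free(NULL) is a no-op */
        free(n->value);
        free(n);
        return -1;
    }
    n->next = *head;
    *head = n;
    return 0;
}

/* Detach and return the first node, or NULL if the list is empty. */
static struct var_node *list_remove_head(struct var_node **head)
{
    struct var_node *n = *head;
    if (n) {
        *head = n->next;
        n->next = NULL;
    }
    return n;
}

/* Free every node and its strings; returns the number of nodes freed.
 * This is only safe if every pointer in the list is still valid. */
static int free_all_vars(struct var_node **head)
{
    struct var_node *n;
    int freed = 0;
    while ((n = list_remove_head(head))) {
        free(n->name);
        free(n->value);
        free(n);
        freed++;
    }
    return freed;
}
```

If any node or string in such a list had already been freed or overwritten, the `free()` calls in the loop would hit the same kind of abort shown in the backtrace; the loop itself has no way to notice.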

Do you have the console output from when this happened? Any details about what was executing at the time? Can you try 1.2.18? (I checked, and nothing changed that would have solved this if it was a critical issue, but you never know.)

By: coredump (coredump) 2007-05-14 11:43:44

Nothing that I'm able to find in the log (I don't run it from the console, as it's a production system) from the time when this crashed (or from other crashes with the same error).

It seems that the code in channel.c assumes all the pointers are valid when it tries to access them.

Moving to 1.2.18 would be problematic as it's production, and I would need to stage it in the lab before upgrading. Also, nothing changed in 1.2.18 in the area of the code where it's crashing.

There may be a fix elsewhere that keeps the channel vars from being corrupted or from causing whatever triggers this crash, but it seems to me that a better long-term fix would be adding paranoia checks in channel.c before accessing/freeing the pointers.

By: Russell Bryant (russell) 2007-06-06 15:53:00

There is no way to determine if a pointer is valid other than it not being NULL.  So, there is no way to improve the block of code that you referenced.
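
A short illustration of that limitation, as a sketch only: `struct holder` and `holder_release` are hypothetical names and are not part of Asterisk. Clearing a pointer right after freeing it makes a second release through the same variable harmless, but it cannot protect other copies of the pointer, or a pointer whose value was overwritten by memory corruption; those are non-NULL and look just like valid pointers.

```c
#include <stdlib.h>

/* Hypothetical owner of a single heap allocation. */
struct holder {
    char *data;
};

/* Free the allocation and clear the pointer.
 * Returns 1 if something was freed, 0 if there was nothing to free.
 * The NULL check only helps because this function itself set the
 * pointer to NULL earlier; it says nothing about whether a non-NULL
 * value still points at live memory. */
static int holder_release(struct holder *h)
{
    if (!h || !h->data)
        return 0;
    free(h->data);
    h->data = NULL;
    return 1;
}
```

Calling `holder_release` twice on the same `struct holder` frees once and then does nothing. A second `struct holder` that had been given the same `data` pointer would still pass the NULL check and trigger a double free.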

I committed a change earlier today that may prevent this crash from happening.  However, if you try the latest 1.2 code and you still have a problem, we'll need a backtrace from a build that is not optimized.  Install asterisk with "make dont-optimize".  Then, provide the gdb output of both "bt" and "bt full".