Summary: ASTERISK-14825: crash because of invalid cdr->dst string
Reporter: fhackenberger (fhackenberger)
Labels:
Date Opened: 2009-09-14 06:34:39
Date Closed: 2009-11-30 15:48:33.000-0600
Versions:
Frequency of
Environment:
Attachments: ( 0) bt.txt
Description: The actual crash is due to a race condition with SQLAllocHandle. Asterisk reconnects to the DB if executing a statement fails. If another thread tries to execute a statement at the same time, Asterisk crashes (see threads 1 and 3 in the attached backtrace). However, a DB statement should not fail during normal operation in the first place. The failing statement is an INSERT into the asterisk cdr. The statement seems to be aborted because of the field 'dst', which is set to:
(gdb) print /x cdr.dst
$3 = {0xff, 0x0, 0x32, 0x37, 0x0 <repeats 76 times>}
cdr.dst is set to chan.exten when the cdr struct is initialised. chan.exten in turn is set to "" when the channel struct is initialised. I cannot see a way for it to be uninitialised.
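
To make the gdb output above easier to read, here is a minimal C sketch that rebuilds the reported bytes and inspects them. The buffer length of 80 is taken from the dump itself (4 explicit bytes plus 76 repeated zero bytes). The names fill_reported_dst and is_printable_ascii are hypothetical helpers written for this illustration; they are not Asterisk functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 4 explicit bytes + 76 repeated zero bytes in the gdb output. */
#define DST_LEN 80

/* Hypothetical helper: fill buf with the bytes gdb printed for cdr.dst. */
void fill_reported_dst(unsigned char buf[DST_LEN])
{
    assert(buf != NULL);
    memset(buf, 0, DST_LEN);
    buf[0] = 0xff;  /* not a printable ASCII character */
    buf[1] = 0x00;  /* string terminator */
    buf[2] = 0x32;  /* ASCII '2', sits after the terminator */
    buf[3] = 0x37;  /* ASCII '7', sits after the terminator */
}

/* Hypothetical helper: returns 1 if the NUL-terminated string in buf
 * contains only printable ASCII characters, 0 otherwise. */
int is_printable_ascii(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len && buf[i] != '\0'; i++) {
        if (buf[i] < 0x20 || buf[i] > 0x7e) {
            return 0;
        }
    }
    return 1;
}
```

Read as a C string, the buffer is a single 0xff byte, and the bytes 0x32 0x37 ("27") sit after the terminator. Whether the database rejects the INSERT because of that byte is the assumption made in this report, not something the sketch proves.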
Comments:

By: Leif Madsen (lmadsen) 2009-09-15 12:06:39

I believe the backtrace would be more useful to the developers if you're able to reproduce this issue and obtain a backtrace when DONT_OPTIMIZE is enabled in the Compiler Flags section of menuselect.

More information is available in doc/backtraces.txt


By: Leif Madsen (lmadsen) 2009-09-30 09:52:15

This could be another one of those issues I've been seeing lately where the backtrace is fine, but shows values optimized out for some reason. I'm setting this to Acknowledged since it seems YOU were able to get something useful out of it and mentioned it in your report.

By: Matthew Nicholson (mnicholson) 2009-11-10 15:02:07.000-0600

I am not sure how a race in SQLAllocHandle() would cause this. SQLAllocHandle() is called in several places in asterisk without any locking protecting it, and a quick google search seems to indicate that it is thread safe. It does appear that one thread in asterisk is attempting to allocate a statement handle while another thread is attempting to allocate a connection handle. The connection handle used in thread 1 has probably already been destroyed by thread 3. If this is the case, then calling SQLAllocHandle() on a connection handle that has been freed will cause problems.
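
The scenario described above (one thread using a connection handle that another thread has already freed during a reconnect) can be sketched as follows. This is an illustration only, under stated assumptions: struct fake_conn, struct fake_obj and the fake_obj_* functions are hypothetical stand-ins, no real ODBC calls are made, and this is not how asterisk actually manages its connections. The point is only that the reconnect path and the execute path take the same lock, so a handle cannot be freed while another thread is still using it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a database connection handle. */
struct fake_conn {
    int id;
};

/* Hypothetical wrapper object; con plays the role of obj->con. */
struct fake_obj {
    pthread_mutex_t lock;
    struct fake_conn *con;
    int next_id;
};

void fake_obj_init(struct fake_obj *obj)
{
    assert(obj != NULL);
    pthread_mutex_init(&obj->lock, NULL);
    obj->con = NULL;
    obj->next_id = 1;
}

/* Free the old connection and create a new one while holding the lock.
 * Returns 0 on success, -1 if the allocation fails. */
int fake_obj_reconnect(struct fake_obj *obj)
{
    int res = -1;

    pthread_mutex_lock(&obj->lock);
    free(obj->con);
    obj->con = malloc(sizeof(*obj->con));
    if (obj->con) {
        obj->con->id = obj->next_id++;
        res = 0;
    }
    pthread_mutex_unlock(&obj->lock);
    return res;
}

/* Stand-in for executing a statement: touches the connection only while
 * holding the lock. Returns the id of the connection used, or -1 if there
 * is no connection. */
int fake_obj_execute(struct fake_obj *obj)
{
    int id = -1;

    pthread_mutex_lock(&obj->lock);
    if (obj->con) {
        id = obj->con->id;
    }
    pthread_mutex_unlock(&obj->lock);
    return id;
}

void fake_obj_destroy(struct fake_obj *obj)
{
    pthread_mutex_lock(&obj->lock);
    free(obj->con);
    obj->con = NULL;
    pthread_mutex_unlock(&obj->lock);
    pthread_mutex_destroy(&obj->lock);
}
```

With this layout, a thread calling fake_obj_execute() sees either the old connection or the new one, never a freed pointer, because fake_obj_reconnect() cannot run in the middle of it.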

Please provide the value of obj->con for those two threads. You can get this information by typing the following commands at the gdb command line:

thread 1
frame 9
print obj->con

thread 3
frame 7
print obj->con

Those commands are specific to the BT you loaded; the thread and frame numbers may be different for other core dumps.

I don't think this is CDR related. It is more likely a race condition in the way asterisk handles reconnecting to the database.

By: Matthew Nicholson (mnicholson) 2009-11-10 15:07:26.000-0600

On second thought, the information I requested won't be very useful as the value of obj->con will be the same in both of those threads.  I am going to try and reproduce this here.  Please stand by.

By: Matthew Nicholson (mnicholson) 2009-11-11 15:19:26.000-0600

I was unable to reproduce this with asterisk trunk.  Please try to reproduce this with the trunk branch.

Also please provide instructions on how you reproduce this crash.