Summary: ASTERISK-16064: [patch] [regression] asterisk 1.4.31 crashes (segmentation fault) on sip to sip call via iax2 trunk
Reporter: vieri (vieri)
Labels:
Date Opened: 2010-05-06 13:05:36
Date Closed: 2011-06-07 14:00:53
Versions:
Frequency of
Environment:
Attachments:
( 0) 1.4.31_core_gdb_1_b.txt
( 1) 1.4.31_core_gdb_1.txt
( 2) 1.4.31_core_gdb_2_b.txt
( 3) 1.4.31_core_gdb_2.txt
( 4) bug17301.diff.txt
Description: Attaching 2 core dump traces when Asterisk 1.4.31 crashed while calling from sip extension to sip provider via a LAN IAX2 trunk between 2 asterisk servers.
Comments:
By: Paul Belanger (pabelanger) 2010-05-06 13:17:19

Is this a result of upgrading to 1.4.31, or have you been able to reproduce it with a previous version?

By: vieri (vieri) 2010-05-06 13:22:26

I upgraded from 1.4.29 to 1.4.31 this morning. I never had this kind of crash with previous versions.

By: Paul Belanger (pabelanger) 2010-05-06 13:26:40

If you are able to reproduce this easily, could you try 1.4.30?  I'd like to see when this was introduced.

By: vieri (vieri) 2010-05-06 13:29:57

I'll need some time to downgrade and test because it's a production server and I can switch the system at certain times of day only.
Will let you know.

By: Russell Bryant (russell) 2010-05-06 14:02:42

Can you run these commands for me?

gdb asterisk core.12345
(gdb) frame 1
(gdb) p x
(gdb) p name
(gdb) p cdr_readonly_vars[x]

By: vieri (vieri) 2010-05-06 17:21:34

I've attached the requested output.

By: Alec Davis (alecdavis) 2010-05-08 04:15:28

From the debug output that russell asked you to print, it appears that the crash is related to 'dnid'.

Please try patch bug17301.diff.txt when possible.

patch testing: dnid is still recorded to mysql on a 1.6.1 production box.

In hindsight: my patch doesn't change the circumstances that cause this crash.

What doesn't make sense is why a simple loop segfaults while scanning a const char array of 'field' names for a match with "dnid". We're not yet dealing with the value of 'dnid', which is a pointer to c->cid.cid_dnid. Could that pointer disappear and cause a segfault?
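
For reference, the kind of loop being discussed can be sketched as follows. This is a minimal illustration, not the actual Asterisk source: the array contents and the function name is_readonly_var are assumptions made up for the example. It only compares the variable name against a static, NULL-terminated table and never touches the variable's value.

```c
#include <stddef.h>
#include <strings.h>  /* strcasecmp (POSIX) */

/* Illustrative table only; the entries in the real Asterisk table may differ. */
static const char *const readonly_vars[] = { "clid", "src", "dst", "uniqueid", NULL };

/* Hypothetical helper: returns 1 if name matches a table entry
 * (case-insensitive), 0 otherwise. */
int is_readonly_var(const char *name)
{
    size_t x;

    if (name == NULL) {
        return 0;
    }
    /* Walk the table until the NULL sentinel is reached. */
    for (x = 0; readonly_vars[x] != NULL; x++) {
        if (strcasecmp(name, readonly_vars[x]) == 0) {
            return 1;
        }
    }
    return 0;
}
```

With a valid name pointer and an intact table, is_readonly_var("dnid") simply walks the entries and returns 0, so a fault inside a loop like this suggests that either the name pointer or the table itself was damaged before the loop ran.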

By: David Vossel (dvossel) 2010-05-19 12:23:35

This is an interesting crash. I've looked into it and it is not at all obvious to me how this could happen. Some sort of memory corruption is happening. Would you be able to run Asterisk under valgrind?

By: vieri (vieri) 2010-05-20 09:05:45

You're not going to believe this, but I haven't had another crash since I reported the first 2 on 2010-05-06. I was waiting for a third crash before applying the patch alecdavis submitted, but that never happened. Just for the record, the first two crashes occurred several hours after I upgraded Asterisk, shut down the machine and booted it (I did not simply restart the process).

Since 2010-05-06 I rebooted the server once (a week ago, just to see what would happen). It hasn't crashed since.

I tried to run exactly the same actions that supposedly caused the crash (over and over) but I haven't been able to reproduce it.

I think you can change the "Severity" level of this report.

Thanks all

By: David Vossel (dvossel) 2010-05-20 09:47:27

Based on the backtraces, this issue does not look possible without some sort of memory corruption, and it is impossible to know how that occurred with the current information.

Given the information provided, this issue cannot be moved any further. If it occurs again and you are able to provide some new information (such as a valgrind report), please re-open this.