Summary: ASTERISK-07867: Asterisk crash and no core debug provided
Reporter: Eric Romang / DCLUX (eromang)
Labels:
Date Opened: 2006-10-04 02:55:20
Date Closed: 2006-11-20 11:07:11.000-0600
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: ( 0) asterisk-debug.txt

Description:

Hello,

We have installed asterisk 1.2.12.1 on a RHEL ES 4 server (2 * 2 dual core).

In the Makefile we have set these parameters:

OPTIONS= -ffast-math
DEBUG=-g # -g3 -ggdb -pg

For DEBUG, we have tried -g3, -g, and -ggdb.

DEBUG_THREADS = -DDEBUG_THREADS #-DDUMP_SCHEDULER -DDEBUG_SCHEDULER -DDEBUG_THREADS -DDO_CRASH -DDETECT_DEADLOCKS

We have also tried with:

DEBUG_THREADS = -DDEBUG_THREADS -DDUMP_SCHEDULER -DDEBUG_SCHEDULER -DDO_CRASH -DDETECT_DEADLOCKS

Randomly, 2 or 3 times per day, asterisk crashes and no asterisk process is left running.

In /var/log/messages:

Oct  3 16:56:40 5_lu_supernode kernel: asterisk[21421]: segfault at ffffffff00a3a960 rip 0000003b65169316 rsp 0000000040305b50 error 6

Oct  4 01:45:31 5_lu_supernode kernel: asterisk[23090]: segfault at 0000323674656588 rip 0000003b651684cb rsp 0000000040180ea0 error 4

Has anybody had the same problem?

Or could someone help me debug this crash?

Regards.
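
Editorial sketch: the trailing "error N" field in the kernel "segfault at" lines quoted in the description is the x86 page-fault error code, and its low three bits can be decoded with a few lines of shell. The function name decode_fault_error is made up for this illustration. The "general protection" lines that appear later in this issue are a different kind of fault, and this decoding does not apply to them.

```shell
#!/bin/sh
# Decode the "error N" field of a Linux x86 "segfault at ..." kernel line.
# Bit meanings follow the x86 page-fault error code:
#   bit 0: 0 = page not present, 1 = protection violation
#   bit 1: 0 = read access,      1 = write access
#   bit 2: 0 = kernel mode,      1 = user mode
# decode_fault_error is a hypothetical helper name, not an existing tool.
decode_fault_error() {
    code=$1
    if [ $((code & 1)) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    if [ $((code & 2)) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $((code & 4)) -ne 0 ]; then mode="user mode"; else mode="kernel mode"; fi
    echo "error $code: $access in $mode, $cause"
}

decode_fault_error 6   # value from the Oct 3 16:56:40 line
decode_fault_error 4   # value from the Oct 4 01:45:31 line
```

Read this way, both reported crashes are user-mode accesses to an unmapped address (one write, one read), which is consistent with a bad pointer but does not by itself identify the faulty code.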
Comments:

By: Marcus Hunger (fnordian) 2006-10-04 05:55:24

Try ulimit -c unlimited before starting asterisk.
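
Editorial sketch: besides the core size limit, whether a core file appears also depends on where the kernel tries to write it. The snippet below only reports a few of the relevant settings, and core_dump_report is a name invented for this example. It assumes a Linux system and a shell whose ulimit builtin accepts -c (bash does); the core_pattern file may not exist on every kernel.

```shell
#!/bin/bash
# Report settings that commonly decide whether a core file gets written.
# Run it from the same shell or init script that starts asterisk, because
# ulimit values are inherited by child processes.
# core_dump_report is a hypothetical helper name.
core_dump_report() {
    echo "core size limit: $(ulimit -c)"
    echo "working directory: $(pwd)"
    if [ -w . ]; then
        echo "working directory writable: yes"
    else
        echo "working directory writable: no"
    fi
    # core_pattern is Linux-specific and may be absent on some kernels.
    if [ -r /proc/sys/kernel/core_pattern ]; then
        echo "core pattern: $(cat /proc/sys/kernel/core_pattern)"
    else
        echo "core pattern: (not available)"
    fi
}

core_dump_report
```

The report reflects the shell it runs in; a daemon that changes its directory or user after start-up may end up with different conditions than the ones printed here.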

By: Eric Romang / DCLUX (eromang) 2006-10-04 10:08:16

Hello,

My init.d file contains:

ulimit -c unlimited
and
ulimit -n 2048

Also, asterisk runs as root.

I also got more debugging info, but still no core.

Thanks for your time and help.

Oct  4 11:15:57 DEBUG[7560] chan_sip.c: Stopping retransmission on '58fd36ae2a408b70573322d908ce1ef8@register.voipgate.com' of Request 102: Match Found
Oct  4 11:15:57 DEBUG[7560] app_dial.c: Exiting with DIALSTATUS=CANCEL.
Oct  4 11:15:57 DEBUG[7560] cdr_addon_mysql.c: cdr_mysql: inserting a CDR record.
Oct  4 11:15:57 DEBUG[7560] cdr_addon_mysql.c: cdr_mysql: SQL command as follows: INSERT INTO cdr (calldate,clid,src,dst,dcontext,channel,dstchannel,lastapp,lastdata,duration,billsec,disposition,amaflags,accountcode,userfield) VALUES ('2006-10-04 11:13:57','\"anonymous\" <35220202373>','35220202373','*324761230499','CUSTOMER', 'SIP/cabine4-9c395e30','SIP/C_plan4-008a47f0','Hangup','',120,0,'FAILED',3,'1948',',reseller_id=\'4502\',route=\'e\',group_id=\'754\',profile_id=\'1964\',definition=\'CALL_OUT\',prefix=\'32476\',price=\'18\',zones=\'\',currency_id=\'1\',currency_rate=\'1\',carrier=\'PLAN\'')
Oct  4 11:15:57 DEBUG[31705] chan_sip.c: Stopping retransmission on '58fd36ae2a408b70573322d908ce1ef8@register.voipgate.com' of Request 102: Match Not Found
Oct  4 11:15:57 DEBUG[31705] chan_sip.c: Stopping retransmission on '58fd36ae2a408b70573322d908ce1ef8@register.voipgate.com' of Request 102: Match Found
Oct  4 11:15:57 DEBUG[7560] chan_sip.c: update_call_counter(cabine4) - decrement call limit counter
Oct  4 11:15:57 DEBUG[31705] chan_sip.c: Stopping retransmission on '279991DE3301663A917A1F0D095C5@85.93.199.132' of Response 4159: Match Found
Oct  4 11:15:57 DEBUG[31705] res_config_mysql.c: MySQL RealTime: Everything is fine.
Oct  4 11:15:57 DEBUG[31705] res_config_mysql.c: MySQL RealTime: Retrieve SQL: SELECT * FROM real_friends WHERE name = '4959333987'
Oct  4 11:15:57 DEBUG[31706] res_config_mysql.c: MySQL RealTime: Everything is fine.
Oct  4 11:15:57 DEBUG[31706] res_config_mysql.c: MySQL RealTime: Update SQL: UPDATE real_friends SET ipaddr = '62.166.51.58', port = '1676', regseconds = '1159954357', servername = '5_lu_supernode', protocol = 'IAX2' WHERE name = 'emiledoppert'
Oct  4 11:15:57 DEBUG[31706] res_config_mysql.c: MySQL RealTime: Updated 1 rows on table: real_friends
Oct  4 11:42:42 DEBUG[7689] res_config_mysql.c: MySQL RealTime: Everything is fine.
Oct  4 11:42:42 DEBUG[7689] res_config_mysql.c: MySQL RealTime: Retrieve SQL: SELECT * FROM real_friends WHERE name = 'register.voipgate.com'
Oct  4 11:42:42 DEBUG[7689] chan_sip.c: Stopping retransmission on '23903c78596d68b35dbcebf14ec03e7a@register.voipgate.com' of Request 102: Match Found
Oct  4 11:42:42 DEBUG[7689] res_config_mysql.c: MySQL RealTime: Everything is fine.

Oct  4 11:15:57 5_lu_supernode kernel: asterisk[31705]: segfault at 0000003d0000000f rip 00000000004a18bd rsp 000000004013d188 error 4
Oct  4 11:16:07 5_lu_supernode mon[30658]: failure for localhost asterisk-pid-start 1159953367 localhost
Oct  4 11:16:27 5_lu_supernode mon[30658]: failure for localhost asterisk-pid-start 1159953387 localhost
Oct  4 11:16:27 5_lu_supernode mon[30658]: calling alert register.down for localhost/asterisk-pid-start (/usr/lib64/mon/alert.d/register.down,) localhost

Oct  4 11:16:27 5_lu_supernode asterisk: asterisk shutdown failed
Oct  4 11:16:27 5_lu_supernode asterisk: asterisk startup succeeded

Oct  4 14:13:52 5_lu_supernode kernel: asterisk[10429] general protection rip:3b6516ff72 rsp:40181a48 error:0

By: Anthony LaMantia (alamantia) 2006-10-04 10:29:05

Can you provide us any information on what the source of these "5_lu_supernode kernel:" messages may be on your platform? Are you positive your kernel has the proper SMP configuration, etc.?

A core dump would be very, very helpful.

Another easy method to provide us with some information would be to run asterisk via gdb:



gdb asterisk

Then, once in gdb, type

asterisk -cvv

Then, once it crashes, type bt

and then upload the backtrace to this issue in Mantis so we can get an idea of where the trouble is happening.
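
Editorial sketch: at the (gdb) prompt a program is started with gdb's run command, and the program's own arguments (here -cvv) follow run. The same session can be driven from a command file so that the backtrace is captured even when nobody is watching the console. The file name, the helper name write_gdb_commands, and the /usr/sbin/asterisk path in the comment are assumptions made for illustration; the SIGPIPE line is a guess at a signal that might otherwise stop the session early.

```shell
#!/bin/sh
# Write a gdb command file that starts the program, lets it run until it
# stops on a signal, and then prints backtraces for all threads.
# write_gdb_commands and the file location are hypothetical choices.
cmd_file=${TMPDIR:-/tmp}/asterisk-gdb.cmd

write_gdb_commands() {
    cat > "$1" <<'EOF'
handle SIGPIPE nostop noprint pass
run -cvv
bt
thread apply all bt
EOF
}

write_gdb_commands "$cmd_file"

# Hypothetical invocation (the binary path is an assumption):
#   gdb -batch -x "$cmd_file" /usr/sbin/asterisk 2>&1 | tee /tmp/asterisk-bt.txt
```

In batch mode gdb exits once the commands have finished, so the saved output should contain the signal that stopped the process followed by the backtraces.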

By: Eric Romang / DCLUX (eromang) 2006-10-05 03:33:47

Hello,

Attached to this comment are all the backtraces and debug output we could gather.

I think this is a different problem, because we don't see the segfault in /var/log/messages.

Thanks for your time and your help.

By: Anthony LaMantia (alamantia) 2006-10-10 15:31:58

When you have GDB attached, is asterisk crashing before you build the backtrace?

By: Eric Romang / DCLUX (eromang) 2006-10-11 03:10:51

Hello,

I started asterisk as you said:

gdb asterisk
gdb> asterisk -cvv

After an hour or half an hour (randomly), asterisk crashes inside gdb.

After this crash I build the backtrace.

Regards, and thanks for your help and time.

By: Anthony LaMantia (alamantia) 2006-10-17 15:46:46

I see no mention of a segfault/signal 11 inside of the backtrace you uploaded.

By: Eric Romang / DCLUX (eromang) 2006-10-18 03:00:29

Hello,

Let me explain again ;) Sorry for the confusion.

When I run asterisk as root:root or asterisk:asterisk with these options:

- ulimit -c unlimited
- nice -5 on the launch of the daemon
- DEBUG=-g3

asterisk randomly crashes with these error messages:

Oct 12 09:27:05 5_lu_supernode kernel: asterisk[5782] general protection rip:3a9f1684cb rsp:401b8ba0 error:0
Oct 16 11:31:48 5_lu_supernode kernel: asterisk[2459]: segfault at 0000003d00000007 rip 0000002a962ef100 rsp 00000000402f67e0 error 4
Oct 16 12:22:57 5_lu_supernode kernel: asterisk[2709]: segfault at 0000003d0000000f rip 00000000004884a9 rsp 0000000040185020 error 4

This crash occurs 5 to 6 times per day, and no core dump is produced.
Nothing special appears in the WARNING and ERROR debug output.

I then submitted a bug to you. You told me to launch asterisk with gdb to provide a core dump.

So I launched asterisk with the same options:

shell> gdb asterisk
gdb> asterisk -cvv

When I launch asterisk under gdb, it crashes more often, with the gdb output I gave you.
In this case I don't get any segfault; asterisk just stops responding. No more registrations are possible, no more calls are possible, and the CLI doesn't respond any more.
With asterisk running under gdb I don't get a core dump either, only the backtrace from gdb, and nothing special appears in the WARNING and ERROR debug output.

This is our current situation: no core dump is produced in either of the two crash cases.

I could provide you with the src.rpm and patches we have created for asterisk.

I hope this will help you to understand our situation.

Thank you for your time and help.

Regards.
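
Editorial sketch: even without a core file, the rip value in a "segfault at" kernel line can sometimes be mapped to a function. Low addresses such as 00000000004884a9 look like they fall inside the main executable, while the much higher ones are more likely inside shared libraries. The helper name extract_rip and the /usr/sbin/asterisk path are assumptions, and the lookup is only meaningful against the exact binary that crashed, built with debug symbols. The "general protection rip:..." lines use a different layout and are not matched by this pattern.

```shell
#!/bin/sh
# Extract the instruction pointer (rip) from a kernel "segfault at" line.
# extract_rip is a hypothetical helper name.
extract_rip() {
    echo "$1" | sed -n 's/.* rip \([0-9a-fA-F][0-9a-fA-F]*\) .*/\1/p'
}

line='Oct 16 12:22:57 5_lu_supernode kernel: asterisk[2709]: segfault at 0000003d0000000f rip 00000000004884a9 rsp 0000000040185020 error 4'
rip=$(extract_rip "$line")
echo "rip=0x$rip"

# Hypothetical follow-up (the binary path is an assumption, and the binary
# must be the same build that crashed):
#   addr2line -f -e /usr/sbin/asterisk "0x$rip"
```

addr2line prints the function name and source line for the address when the symbols are present, which can narrow down where to look while a proper backtrace is still missing.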

By: Bas (basv) 2006-11-09 05:07:43.000-0600

Hello,

I am also experiencing (seemingly) random crashes with 2 PBXs running 1.2.12.1. I suspect that it has to do with call limit counters not being set appropriately, but I'm not sure.

I can provide at least 3 core dumps if necessary. Please let me know if this is desired.

Basv

edit: additional info;
Running CentOS 4.4
Linux <HOSTNAME> 2.6.9-42.0.2.ELsmp #1 SMP Wed Aug 23 00:17:26 CDT 2006 i686 i686 i386 GNU/Linux
asterisk-1.2.12.1
zaptel-1.2.8
libpri-1.2.3
asterisk-addons-1.2.4
asterisk-sounds-1.2.1
A total of 3 PBXs are connected via IAX2; the local media gateways are Cisco.



By: Serge Vecher (serge-v) 2006-11-14 15:28:18.000-0600

You need to carefully read this page in order to provide the proper backtrace:
http://www.voip-info.org/tiki-index.php?page=Asterisk%20debugging

1) Asterisk must be built with 'make dont-optimize'
2) You must start asterisk with -g option, so that the core is dumped.
3) Analyze the core with gdb as per the page above.
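
Editorial sketch of step 3: once a core file exists, gdb can print the backtraces from it without an interactive session. The function name core_backtrace is invented for this example, and both paths are supplied by the caller because the thread does not say where the binary or the core file live. It assumes gdb and mktemp are installed.

```shell
#!/bin/sh
# Print backtraces from a core file. Both arguments come from the caller;
# nothing here guesses where asterisk or its core file are located.
# core_backtrace is a hypothetical helper name.
core_backtrace() {
    binary=$1
    core=$2
    if [ ! -r "$binary" ] || [ ! -r "$core" ]; then
        echo "usage: core_backtrace <binary> <core-file>" >&2
        return 1
    fi
    cmds=$(mktemp) || return 1
    # "bt full" also prints local variables for the crashing thread.
    printf '%s\n' 'bt full' 'thread apply all bt' > "$cmds"
    gdb -batch -x "$cmds" "$binary" "$core"
    status=$?
    rm -f "$cmds"
    return $status
}

# Hypothetical usage (both paths are assumptions):
#   core_backtrace /usr/sbin/asterisk /tmp/core.12345 > backtrace.txt
```

The backtrace is most useful when the binary passed in is the unoptimized build from step 1, since optimized builds tend to hide variables and frames.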

By: Serge Vecher (serge-v) 2006-11-14 15:28:58.000-0600

Make sure that you are testing with the Asterisk 1.2.13 tarball downloaded from asterisk.org

By: Bas (basv) 2006-11-16 06:18:49.000-0600

Hello,

I did a gdb + bt on my core dumps and it turned out I got bitten by this bug: http://bugs.digium.com/view.php?id=4871

I haven't had any crashes since I switched to madplay for the music on hold (touch wood).

Suppose I was barking up the wrong tree.

basv

By: Serge Vecher (serge-v) 2006-11-20 11:07:10.000-0600

eromang: we haven't heard from you for a month. If you can still reproduce this with 1.2.13 and provide debug info as per note 0054631, please reopen the issue with information attached.