Summary: ASTERISK-04076: [support script] this script allows a person to grab a core of the running or deadlocked process
Reporter: derrick (derrick)
Labels:
Date Opened: 2005-05-04 15:38:14
Date Closed: 2008-01-15 15:34:38.000-0600
Versions:Frequency of
Environment:
Attachments: ( 0) ast_grab_core.sh
( 1) snarf_asterisk_core.sh
Description: dumps core and emails the admin a bt. i made this so when i'm unavailable and a box messes up we can still get some useful info from it. there was a thread a month or so ago on -dev mentioning a desire for such a feature.

i guess this could go in addons?


disclaimer on file.
Comments:

By: James Golovich (jamesgolovich) 2005-05-04 15:41:33

I put together something similar over a year ago.  I just started working on some thread stuff recently so I'll pull it out and upload it here.  It walked the channel list and a few other lists, displaying the info as well as dumping all the verbose logs.

By: Kevin P. Fleming (kpfleming) 2005-05-04 21:02:10

I think this script is a reasonable addition, it could go into contrib/scripts.

James, are you suggesting you have a shell script that does all this channel walking and displaying, or some other kind of tool? I'm not sure that's the same idea as this script; this one is much simpler and would be a good first step.

By: Mark Spencer (markster) 2005-05-04 23:21:05

I'm okay with the script going into contrib/scripts.  Maybe we should call it something involving asterisk if its intent is for asterisk specifically?

By: Mark Spencer (markster) 2005-05-04 23:23:05

Okay, to clarify, i mean *starting* with something involving asterisk (e.g. ast_snarf_core)...  I *do* realize that snarf_asterisk_core.sh does have asterisk in the name.

By: James Golovich (jamesgolovich) 2005-05-04 23:59:18

Basically what I have is the same thing: a shell script that generates a gdb script to run on the core.  I just checked it out and it doesn't even work properly because of the changes to remove flags and go to the ast_set_flags macros.  Since this script drops the core for later inspection, it might be better to have a separate script to just parse out some useful information.

My script was originally going to be forked and executed by a monitor thread on a dying thread to snarf all the info, or when a deadlock is detected to automatically dump the bt.  I ran out of time when I was working on it before and lost half the work when my laptop was stolen a while ago, but I might start working on that again.
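
For illustration, the "shell script that generates a gdb script" idea could look roughly like the sketch below. It is not the attached ast_grab_core.sh; the function names, file names, and the particular gdb commands chosen are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical sketch: write a gdb command file that saves a core image of
# the attached process and then prints a backtrace for every thread.
write_gdb_script() {
    script="$1"      # where to write the gdb commands
    corefile="$2"    # where gdb should save the core image
    cat > "$script" <<EOF
set pagination off
gcore $corefile
info threads
thread apply all bt
detach
quit
EOF
}

# Hypothetical sketch: attach gdb to a live process in batch mode, run the
# generated commands, and print the path of the resulting text report.
grab_core() {
    binary="$1"      # path to the asterisk binary (assumed known by the caller)
    pid="$2"         # PID of the running or deadlocked process
    dumpdir="$3"     # directory for the core and the report
    stamp=$(date +%Y%m%d-%H%M%S)
    script="$dumpdir/gdb_cmds.$pid"
    report="$dumpdir/gdb_dump.$pid"
    write_gdb_script "$script" "$dumpdir/asterisk_$stamp.core.$pid"
    gdb -batch -x "$script" "$binary" "$pid" > "$report" 2>&1
    echo "$report"
}
```

A caller would run something like `grab_core /usr/sbin/asterisk "$PID" /tmp` (the binary path here is only a guess) and could then mail or parse the report separately, which keeps the core on disk for later inspection.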

By: paradise (paradise) 2005-05-05 00:14:55

well done.
but i'm thinking of writing a script that acts as a wrapper and detects both crashes and deadlocks. detecting crashes is not a hard job, but at the moment there is no way to detect deadlocks automatically.
a deadlocked box is a real nightmare when the poor admin isn't at the company.

By: Clod Patry (junky) 2005-05-05 07:05:32

for that part of code:
/usr/bin/mail -s "${HOSTNAME} core dumped at ${DUMPDIR}/asterisk_${DATE}.core.${PID}" ${ADMINEMAIL} < /tmp/gdb_dump.${PID}
is there any reason why it's here?
Because if we're running safe_asterisk, there's already NOTIFY and MACHINE to email in case of crashes, no?

By: Kevin P. Fleming (kpfleming) 2005-05-05 08:52:47

derrick: I'm ready to commit this under the name 'ast_grab_core' in contrib/scripts, if that's OK with you (you have an email address embedded in it, so if you want that changed to match please let me know).

By: derrick (derrick) 2005-05-05 11:00:24

junky:  that's so whenever one of my co-workers runs it i get an email and page and know that something is amiss and should take a look.   it's also an easy way to keep a history of your bt's.

hmmm, is there a way i could include the asterisk version in this?  if it's deadlocked i potentially couldn't -r "show version" and we won't know the build directory (if anyone does know how to figure out $cdir from gdb i'd love to know) so i'm not sure we could include the version.  that would be handy though.  is there a global var w/ the version that i haven't seen?

kpfleming:  i agree w/ the naming changes; i should have done that from the start but just used my old name.  i was hoping 'snarf' would make it in there somehow, but no biggie :)   i would appreciate you updating the email address to reflect the name too.  thank you for offering.

By: James Golovich (jamesgolovich) 2005-05-05 12:17:20

There isn't a global var containing the version, although it seems worthwhile to create one for this purpose.  ast_startuptime and ast_lastreloadtime might be good additions.

This doesn't seem the best command to locate asterisk.  On my box if I'm not running asterisk in the foreground (which I always start with asterisk -cvvvg) it doesn't locate the PID:
PID=`ps auxwf|grep asterisk|grep vv|head -1|awk '{print $2}'`
This works a bit better:
PID=`ps auxwf|grep asterisk|grep -v grep|head -1|awk '{print $2}'`

By: derrick (derrick) 2005-05-05 14:31:18

that only works if you are not also running safe_asterisk or editing a file w/ asterisk in the name. ie, anything else in the process table that might match asterisk, could be matched instead of the real process id.  do you not have an asterisk.conf and astrundir defined? the `ps` was just a last resort and i do not see a way for it to ever really be reliable.

i guess it's safe to assume the .pid will be written somewhere, and if we can't find an astrundir in asterisk.conf to just default to /var/run/asterisk.pid?  i'll  add some more checking for the pid file

By: derrick (derrick) 2005-05-05 15:04:47

i changed the parsing of asterisk.conf to handle the case where there are no spaces around =>, added a fallback to /var/run/asterisk.pid, and made it warn if that can not be found and it has to use `ps`
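
The lookup order described in the last few comments (astrundir from asterisk.conf, then /var/run/asterisk.pid, then `ps` with a warning) might be sketched as follows. This is not the committed script: the function names and the default /etc/asterisk/asterisk.conf path are assumptions for illustration, and the `ps` pipeline is the one quoted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the astrundir value from an asterisk.conf-style
# file. Accepts "astrundir => /path" as well as "astrundir=>/path".
get_astrundir() {
    conf="$1"
    [ -r "$conf" ] || return 1
    sed -n 's/^[[:space:]]*astrundir[[:space:]]*=>[[:space:]]*\([^[:space:];]*\).*$/\1/p' "$conf" | head -n 1
}

# Hypothetical helper: print the path of the pid file, preferring the
# configured run directory and falling back to /var/run/asterisk.pid.
find_asterisk_pidfile() {
    conf="${1:-/etc/asterisk/asterisk.conf}"   # assumed default location
    rundir=$(get_astrundir "$conf") || rundir=""
    if [ -n "$rundir" ] && [ -r "$rundir/asterisk.pid" ]; then
        echo "$rundir/asterisk.pid"
    elif [ -r /var/run/asterisk.pid ]; then
        echo /var/run/asterisk.pid
    else
        return 1
    fi
}

# Print the PID, using ps only as a last resort and warning when doing so,
# since ps can match safe_asterisk or an editor with "asterisk" in its args.
find_asterisk_pid() {
    if pidfile=$(find_asterisk_pidfile "${1:-}"); then
        cat "$pidfile"
    else
        echo "warning: no pid file found, guessing PID from ps" >&2
        ps auxwf | grep asterisk | grep -v grep | head -n 1 | awk '{print $2}'
    fi
}
```

Usage would be along the lines of `PID=$(find_asterisk_pid)`, optionally passing an alternate config path as the first argument.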

By: Kevin P. Fleming (kpfleming) 2005-05-15 01:02:09

Committed to CVS HEAD, thanks! (I did remove the .sh suffix, to match the other scripts)

By: Digium Subversion (svnbot) 2008-01-15 15:34:38.000-0600

Repository: asterisk
Revision: 5671

A   trunk/contrib/scripts/ast_grab_core

r5671 | kpfleming | 2008-01-15 15:34:38 -0600 (Tue, 15 Jan 2008) | 2 lines

add script for grabbing core dump from running Asterisk process (bug ASTERISK-4076)