Summary: ASTERISK-08419: Parking causing crashes
Reporter: jlimb (jlimb)
Labels:
Date Opened: 2006-12-22 01:53:21.000-0600
Date Closed: 2007-02-19 08:40:16.000-0600
Versions:
Frequency of
Environment:
Attachments:
( 0) cli_output_SVN-branch-1.2-r48584.txt
( 1) gdb_bt_output_10129.txt
( 2) gdb_bt_output_10687.txt
( 3) gdb_bt_output_3154.txt
( 4) gdb_bt_output_6998.txt
( 5) gdb_output_1.2.12.1.txt
( 6) gdb_output_1.2.14.txt
( 7) gdb_outputSVN-branch-1.2-r48584.txt
( 8) log_full_SVN-branch-1.2-r48584.txt
Description: Parking seems to crash Asterisk randomly.

I started with asterisk, snom phones, and freepbx 2.1.3

I tried to reproduce this at home on several other boxes with several versions. I would park and unpark the same call from another SIP phone over and over again using the atxfer and blindxfer codes in features.conf.

I have tried 1.2.14 and 1.4.0-beta4 and can crash them all.

One thing about the SVN-branch-1.2-r48584 version is that it will not let you repark the call using the feature code where in the you can.

I was also able to crash it using the snom's transfer button to the parkext, although it seemed easier to crash it using the feature codes.


I would really like it if someone else could reproduce this on different phones.
Comments:
By: Serge Vecher (serge-v) 2006-12-27 13:51:32.000-0600

Hmm, as per the gdb output, Asterisk is crashing all over the place. I've noticed that you use format_mp3 for moh. My production server used to crash randomly until I converted all mp3 files into native ulaw files.
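
For readers who want to try the mp3-to-ulaw conversion suggested above, here is a minimal sketch that only builds the two command lines involved (it does not run them). It assumes the `mpg123` and `sox` tools are installed; the function name and the example paths are hypothetical and not taken from this report.

```python
from pathlib import PurePosixPath
from typing import List


def moh_conversion_commands(mp3_path: str, out_dir: str) -> List[List[str]]:
    """Build the commands that turn one MP3 into a raw 8 kHz mono u-law file.

    Hypothetical helper: names and paths are illustrative assumptions.
    """
    src = PurePosixPath(mp3_path)
    wav = PurePosixPath(out_dir) / (src.stem + ".wav")
    ulaw = PurePosixPath(out_dir) / (src.stem + ".ulaw")
    return [
        # Step 1: decode the MP3 to a WAV file (mpg123 -w writes WAV output).
        ["mpg123", "-w", str(wav), str(src)],
        # Step 2: resample to 8 kHz mono and write headerless u-law with sox.
        ["sox", str(wav), "-r", "8000", "-c", "1", "-t", "ul", str(ulaw)],
    ]


if __name__ == "__main__":
    # Example paths are assumptions; adjust them to the local moh directory.
    for cmd in moh_conversion_commands("/var/lib/asterisk/mohmp3/song.mp3", "/tmp/moh"):
        print(" ".join(cmd))
```

Each inner list can be passed to `subprocess.run(cmd, check=True)` once the paths have been checked.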

By: jlimb (jlimb) 2007-01-09 10:06:25.000-0600

Over the weekend I converted the mp3s on my test box running using something like this:
and was unable to crash it by parking calls over and over, so I decided to do the same on my production server as well.
Unfortunately, twice yesterday the system became unresponsive. Previously it had been running smoothly for a few weeks before giving the park buttons and lights back to the users. Both times asterisk was still running. "show channels" showed nothing, but "sip show channels" showed invites to most of the phones. The second time it showed ringing to most of the phones, which is what the phones were stuck doing that time. Restarting asterisk fixed it both times.
Is there a better diagnostic I can run when it is in this stuck state?

Also, this morning I removed the wav files and left only the pcm files to see if it does any better.

Thanks for any help you can offer.

By: jlimb (jlimb) 2007-01-09 14:16:33.000-0600

It happened again.
Since converting the mp3s, the system hasn't crashed hard, but the end result is the same.
Is the format for the moh files I am using correct?

By: Serge Vecher (serge-v) 2007-01-09 14:21:05.000-0600

update to 1.2.14 -- had known issues ...

By: jlimb (jlimb) 2007-01-10 00:23:02.000-0600

I tried 1.2.14 on my test boxes again with my new sound files and was able to reproduce the problem at home.

One thing with 1.2.14 that is different from 1.2.12 is that it will not let you repark the call using the features.conf blindxfer setting (in my case *2).

I simply call one phone from another, then park the call using transfer 70, pick it up by dialing 71, park it again using transfer 70, and so on until it breaks. Sometimes it takes only 5 or 6 times, sometimes more like 20 to 40 times, but it will break on any box I try with any version I try. I now have one of my test boxes in this stuck state running 1.2.14.

By: jlimb (jlimb) 2007-01-10 00:27:31.000-0600

I did get this from one of my hard crashes:
*** glibc detected *** double free or corruption (!prev): 0x099a5388 ***

By: jlimb (jlimb) 2007-01-15 10:54:07.000-0600

So I completely disabled music on hold by not loading it on my production server and it has not crashed since.
Doing testing on my test boxes at home ( and 1.2.14) I wanted to figure out why sometimes I could park and unpark 50+ times no problem and sometimes 1-6 times would crash it.
If I park and unpark over and over really fast it won't crash because musiconhold does not get a chance to start playing the file.  It seems like the best way to crash it is to park and unpark right as it begins to play music.
I don't think it is a problem with musiconhold directly, because when people get put on hold the music plays fine and the system doesn't crash. I believe it is the interaction of the park feature and music on hold.
As far as I can tell, this problem is present in every version that I use.
It's almost like an unlock doesn't get called or something.
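
To illustrate the kind of failure being guessed at here (a lock that is never released on some code path), the sketch below uses plain Python threading. It is not Asterisk code and makes no claim about the actual bug; every name in it is hypothetical.

```python
import threading


def start_moh_buggy(lock: threading.Lock, fail: bool) -> None:
    """Hypothetical routine that forgets to unlock on its error path."""
    lock.acquire()
    if fail:
        return  # BUG: early return leaves the lock held forever
    lock.release()


def start_moh_safe(lock: threading.Lock, fail: bool) -> None:
    """Same routine, but the with-block releases the lock on every path."""
    with lock:
        if fail:
            return


def next_caller_hangs(start_moh) -> bool:
    """Run start_moh on its error path, then check whether the lock is stuck."""
    lock = threading.Lock()
    start_moh(lock, True)
    # A short timeout stands in for "the next channel blocks forever".
    acquired = lock.acquire(timeout=0.1)
    return not acquired


if __name__ == "__main__":
    print("buggy version hangs next caller:", next_caller_hangs(start_moh_buggy))
    print("safe version hangs next caller:", next_caller_hangs(start_moh_safe))
```

A leaked lock of this sort would show up as a hang rather than a crash, which is why it is only one possible explanation for the symptoms in this thread.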

By: Serge Vecher (serge-v) 2007-01-15 11:12:29.000-0600

There was a patch, 20060524_bug7053_trail.patch, in bug 7053 that didn't get enough testing, as the original reporter could not test it any longer. If you are able to test, can you give that patch a shot?

By: Jason Parker (jparker) 2007-01-16 15:28:40.000-0600

I reviewed all of the backtraces attached to this bug, and none of them appear to even be similar.

Could you please provide a "thread apply all bt full" for the next crash you experience?
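
As a rough sketch of how such a backtrace could be collected non-interactively, the helper below builds gdb command lines for either a core file or a still-running (hung) process. The binary path, core file name, and PID are assumptions for illustration; only the `-batch`, `-ex`, and `-p` options of gdb are relied on.

```python
from typing import List

BT_COMMAND = "thread apply all bt full"


def gdb_core_backtrace(binary: str, core: str) -> List[str]:
    """Command line that prints full backtraces of all threads from a core file."""
    return ["gdb", "-batch", "-ex", BT_COMMAND, binary, core]


def gdb_attach_backtrace(pid: int) -> List[str]:
    """Command line that attaches to a running process and prints the same."""
    return ["gdb", "-batch", "-ex", BT_COMMAND, "-p", str(pid)]


if __name__ == "__main__":
    # Hypothetical paths and PID; substitute the real ones before running.
    print(" ".join(gdb_core_backtrace("/usr/sbin/asterisk", "core.1234")))
    print(" ".join(gdb_attach_backtrace(4321)))
```

Redirecting the output of either command to a text file gives something that can be attached to the report; the attach variant may also help when the process is stuck rather than crashed.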

By: jlimb (jlimb) 2007-01-19 02:59:35.000-0600

The system has not crashed since I switched from mp3 to ulaw moh files; instead, chan_sip is hanging. So I don't know that I will get another crash. The good news is that 2 days ago I also applied the 20060524_bug7053_trail.patch from bug 7053 and turned moh back on. Since then, chan_sip has not hung (which is longer than it ever lasted with moh turned on).
I will wait to see how it goes.

By: jlimb (jlimb) 2007-01-25 02:33:58.000-0600

It has been a week without sip hanging.  It looks like 20060524_bug7053_trail.patch worked on my install.

So should I be able to repark calls using the blindxfer feature setting in 1.2.14 or later?   Does it have something to do with why you said "update to 1.2.14 -- had known issues ... "?  Do I need to start a new bug report?

By: Serge Vecher (serge-v) 2007-01-30 13:58:16.000-0600

1) No, there is no need to start a new bug report.
2) Please confirm that 1.2.14 with the 20060524_bug7053_trail.patch does not have parking issues.

By: Joshua C. Colp (jcolp) 2007-02-15 10:28:06.000-0600

Fixed in 1.2 as of revision 54622, 1.4 as of revision 54623, and trunk as of revision 54624 (or at least, from the look of the backtraces, it should be). If not, please reopen. Thanks!