Summary: ASTERISK-15750: Nested Dial()s that use U() or M() results in: '&(audiohook)->lock' freed more times than we've locked!
Reporter: Dennis DeDonatis (dennisd)
Labels:
Date Opened: 2010-03-04 21:11:04.000-0600
Date Closed: 2010-06-08 10:11:28
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) debug.zip
(1) gdb.txt
(2) stack.11253
Description:
/include/asterisk/lock.h: audiohook.c line 648 (audio_audiohook_write_list): Error obtaining mutex: Invalid argument
/include/asterisk/lock.h: audiohook.c line 662 (audio_audiohook_write_list): mutex '&(audiohook)->lock' freed more times than we've locked!
/include/asterisk/lock.h: audiohook.c line 662 (audio_audiohook_write_list): Error releasing mutex: Invalid argument


****** STEPS TO REPRODUCE ******

Call in from a SIP provider, then connect to a Dial() that includes an extension with either U() or M() that goes to a context that Dial()s out using a SIP provider.
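
One possible reading of these steps, written as a dialplan sketch. This is not the reporter's actual dialplan: the context names, the extension numbers, the `provider-out` peer, the use of a Local channel to reach the second Dial(), and the placement of U() on the inner Dial() are all assumptions made for illustration.

```
; extensions.conf: hypothetical sketch, not the reporter's actual dialplan
[from-provider]                ; assumed context for inbound SIP calls
exten => 0,1,Dial(Local/100@outbound,30)                       ; outer Dial()

[outbound]                     ; assumed context that dials back out
exten => 100,1,Dial(SIP/provider-out/5551234,30,U(on-answer))  ; nested Dial() with U()

[on-answer]                    ; Gosub routine that U() runs on the called channel
exten => s,1,NoOp(Called party answered)
exten => s,n,Return()
```

For the M() variant, the inner Dial() would presumably use M(on-answer) with the routine defined as a [macro-on-answer] context instead.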

****** ADDITIONAL INFORMATION ******

This has been happening for most of the 1.6 releases, but I could never get a core dump or reproduce it.

It does the same thing with SVN-trunk-r250730M, but the core file info and debug are from 1.6.2.6-rc1.

I can supply my dialplan if the steps to reproduce don't make sense.  :)

I'm sorry I couldn't come up with a better summary.  
Comments:
By: Leif Madsen (lmadsen) 2010-03-08 09:23:41.000-0600

Thanks for the well-filed issue (the summary is long, but it makes sense once you read the description :)). I've acknowledged this issue. Thanks!

By: jiri uncovsky (jiri uncovsky) 2010-04-29 07:09:18

Hello, I also met this bug on CentOS 5.4/x86_64 with Asterisk 1.6.1.18. I can reproduce it simply by running Asterisk on the machine again but I don't know what steps are required to reproduce it on demand.

If there is any way I can help with resolution of this bug, please let me know.

Just speculation: I am unable to reproduce the bug on other machines. Unlike the other Asterisk installations, the one that exhibits the bug is configured to communicate via two network interfaces: one is a SIP trunk to the outside world, the other is for the local telephone network.

Stack trace after the crash is attached as stack.11253.



By: Leif Madsen (lmadsen) 2010-04-30 13:50:11

Jiri: the only way to move this issue forward at the moment is to provide a patch that resolves it. I believe all the necessary information has already been provided.

By: Dennis DeDonatis (dennisd) 2010-06-01 21:02:07

I just tried the test case that ALWAYS crashed Asterisk for me, but with 1.6.2.9-rc1 and I CANNOT reproduce the problem anymore.  This may have been fixed in 1.6.2.8 (or possibly earlier).

OR this may be because I'm now using dahdi_dummy as a timing source on my x86_64 machine.  There is definitely something "funny" with res_timing_timerfd and x86_64, but I've been unable to prove it in any way, so I haven't opened another bug report.

I just tried 1.6.2.9-rc1 after commenting out noload => res_timing_timerfd.so in modules.conf and Asterisk still doesn't crash.
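
For reference, a minimal sketch of the modules.conf line in question. Only the `noload` line comes from this report; the `[modules]` section header and `autoload=yes` are assumed to match a typical default configuration.

```
; modules.conf: sketch, surrounding settings are assumptions
[modules]
autoload=yes

; With this line active, res_timing_timerfd is not loaded and another
; timing source (for example dahdi_dummy) is used instead:
;noload => res_timing_timerfd.so
; Commented out as above, res_timing_timerfd is allowed to load again.
```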

To reproduce this every time, I'd call in on a SIP DID and dial 0, which would dial out to two cell phones using U().  This crashed it every time.  I haven't had Asterisk crash in a while, which seemed odd <g>, so I revisited this issue.

If Jiri can reproduce this then definitely keep it open and I'll try some more to crash it.  Otherwise, this isn't a problem for me anymore.

By: Leif Madsen (lmadsen) 2010-06-08 10:11:28

I agree that res_timing_timerfd has some odd issues like this, as does res_timing_pthread.

Unfortunately, unless we can reproduce these consistently, we won't be able to move them forward.

But as you stated, moving to dahdi_dummy seems to resolve the issue, so if you're able to reproduce the issues you're having with res_timing_timerfd in the future, please do open another issue so we can try to catch and resolve those issues.

Thanks!