[Home]

Summary:ASTERISK-12873: asterisk random crashes: IMAP toolkit crash: Unlock when not locked
Reporter:Perry The Cynic (perry the cynic)Labels:
Date Opened:2008-10-10 18:33:22Date Closed:2008-12-18 16:00:17.000-0600
Priority:CriticalRegression?No
Status:Closed/CompleteComponents:Applications/app_voicemail/IMAP
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) asterisk-2.gdb.txt
( 1) asterisk-4.gdb.txt
( 2) asterisk-4.log
( 3) asterisk-gdb.log
( 4) asterisk-run.log
Description:Asterisk crashes randomly. The system log always contains this message:

Oct 10 13:23:47 hermod asterisk: IMAP toolkit crash: Unlock when not locked

Needless to say, this makes my wife very unhappy. :)

Asterisk-1.4.22 with zaptel-1.4.12.1. No local hacks.
Debian GNU/Linux 4.0 (fully updated); Linux kernel 2.6.21.5 (custom); on a custom VIA Esther-based mini-box.
IMAP server is Mac OS X Server 10.5.5 (Cyrus IMAP).

Please let me know what else I should be providing. This IMAP library error supposedly indicates a re-entry of threading bug in the caller, which in this case would be asterisk...


****** ADDITIONAL INFORMATION ******

Asterisk seems to be functioning otherwise normally. It doesn't do much - just PBX duty for our house. POTS interface is a fully loaded TDM400P REV I (which seems to work normally).

asterisk/voicemail.conf contents:

[general]
format=wav|gsm ; recording formats (compressed wav, also save gsm

imapserver=imap.cynic.org ; IMAP server
imapport=143 ; plain IMAP
imapflags=novalidate-cert ; don't check server's SSL authority
expungehangup=no ; don't delete marked messages when the phone drops

; authentication information to the IMAP server
authpassword=<removed>

serveremail=voicemail@cynic.org ; email From:
attach=yes ; attach sound recording

;maxmsg=100 ; limit messages per mailbox
maxmessage=240 ; maximum message length (s)
;minmessage=3 ; minimum message length (s)
;maxgreet=60 ; maximum greeting length (s)
skipms=3000 ; skip forward/back increment (ms)
maxsilence=5 ; end recording after silence (s)
silencethreshold=128 ; silence threshold (0..255)
maxlogins=3 ; max password tries
volgain=20.0       ; make recorded voicemail louder


emailsubject=Voicemail from ${VM_CALLERID}
emailbody=${VM_CALLERID}\r\n${VM_DATE} (${VM_DUR})\r\n\r\n

[vm-cynics]
main => <removed>,Cynics,,,imapuser=voicemail
--- end ---

where <removed> was elided to protect the guilty.


Comments:By: Philip Rosenberg-Watt (kc0bvu) 2008-10-10 19:44:16

No doubt whoever knows enough to be able to fix the issue will want a backtrace log attached.

By: Mark Michelson (mmichelson) 2008-10-13 15:14:06

This is a tricky issue to try to solve, from Asterisk's standpoint, mainly because I'm not convinced that it is an actual issue in Asterisk. The problem here is that the UW IMAP c-client is attempting to unlock a mutex which it thinks is locked. It isn't the responsibility of the application to lock this lock prior to making library calls to the c-client. In fact, if I were to add a mail_lock() call prior to most c-client library functions, the opposite problem would occur ("lock when already locked") when the c-client would attempt to recursively lock the mutex.

You mention in the description that "This IMAP library error supposedly indicates a re-entry of threading bug in the caller, which in this case would be asterisk.." I'm just curious where you read this since I've been hearing complaints about this for a while and would really like it if there were something that could be done in Asterisk to fix it.

Also, as kc0bvu pointed out, a backtrace of the crash may be helpful. Instructions for getting a backtrace may be found in doc/backtrace.txt in the Asterisk source directory.

By: Leif Madsen (lmadsen) 2008-10-14 11:07:49

Changing to feedback as we're looking for a backtrace if at all possible.

By: Perry The Cynic (perry the cynic) 2008-10-17 23:43:18

It turns out that the IMAP c-client kit calls abort(), which does not seem to leave a core dump, even with -g and all the trimmings. Bummer. So I ended up hand-running asterisk, attaching a gdb to it from the side, setting a breakpoint on abort(), and leaving myself lots of messages. It didn't really take all that long (three tries).

I'm trying to attach the asterisk and gdb logs, but this webpage is giving me error messages ("unable to find tag"), so I'm not sure if this is working. I'll include the salient bits below.
Possibly interesting points:
-) There are lots of unexpected-this-and-that warnings coming out of c-client even before things go noticeably bad.
-) The IMAP server (a Cyrus under Mac OS X Server 10.5.5) does not exhibit any notable problems.
-) During the test run (but minutes before the crash), at least one of the messages was deleted from the IMAP store using a mail client.

From a prior source (which, naturally, I now can't find), I understand that the "Unlock when not locked" error is often the result of accessing the c-client simultaneously from multiple threads, perhaps as the result of making c-client calls from callbacks. I'm not familiar enough with c-client to understand the details (or, for that matter, what exactly c-client's contract with its caller is). From the source code, it certainly doesn't seem likely that the locking hierarchy would get messed up from a single thread.

I have automatic mail actions configured on the server (via "sieve") that can delete mail messages from the IMAP store as soon as they're being delivered. (Part of a voice-junk filter system.) If asterisk uses IDLE or other notification hooks, it may get notifications before it's entirely done delivering the message. That's probably not a well-tested path.

If there's a particular path that you suspect may lead to a thread race with imap_send, let me know and I can retry with breakpoints in place there.

Cheers
 -- perry


So here's the relevant part of asterisk's debug output:

[hermod:~] root# asterisk -vvvvcg
Asterisk 1.4.22, Copyright (C) 1999 - 2008 Digium, Inc. and others.
Created by Mark Spencer <markster@digium.com>
Asterisk comes with ABSOLUTELY NO WARRANTY; type 'core show warranty' for details.
This is free software, with components licensed under the GNU General Public
License version 2 and other licenses; you are welcome to redistribute it under
certain conditions. Type 'core show license' for details.
=========================================================================
 == Parsing '/etc/asterisk/asterisk.conf': Found
[... lots of verbiage ...]
Asterisk Ready.
*CLI>     -- Starting simple switch on 'Zap/3-1'
   -- Executing [s@incoming-pots:1] Set("Zap/3-1", "~~EXTEN~~=s") in new stack
   -- Executing [s@incoming-pots:2] NoOp("Zap/3-1", "Incoming call CID="KIEHTREIBER AUD" <4083787798>") in new stack
   -- Executing [s@incoming-pots:3] Set("Zap/3-1", "__TRANSFER_CONTEXT=local-functions") in new stack
   -- Executing [s@incoming-pots:4] Answer("Zap/3-1", "") in new stack
   -- Executing [s@incoming-pots:5] Set("Zap/3-1", "LASTCALLER=4083787798|g") in new stack
[Oct 17 20:51:49] WARNING[14346]: pbx.c:5910 pbx_builtin_setvar: The use of the 'g' flag is deprecated.  Please use Set(GLOBAL(foo)=bar) instead
 == Setting global variable 'LASTCALLER' to '4083787798'
   -- Executing [s@incoming-pots:6] GotoIf("Zap/3-1", "0?7:9") in new stack
   -- Goto (incoming-pots,s,9)
   -- Executing [s@incoming-pots:9] NoOp("Zap/3-1", "Finish if-incoming-1") in new stack
   -- Executing [s@incoming-pots:10] GotoIf("Zap/3-1", "0?11:22") in new stack
   -- Goto (incoming-pots,s,22)
   -- Executing [s@incoming-pots:22] NoOp("Zap/3-1", "Finish if-incoming-2") in new stack
   -- Executing [s@incoming-pots:23] Set("Zap/3-1", "blstate=") in new stack
   -- Executing [s@incoming-pots:24] GotoIf("Zap/3-1", "?25:27") in new stack
   -- Goto (incoming-pots,s,27)
   -- Executing [s@incoming-pots:27] NoOp("Zap/3-1", "Finish if-incoming-4") in new stack
   -- Executing [s@incoming-pots:28] Dial("Zap/3-1", "Zap/1&Zap/2&SIP/pap2t-1&SIP/pap2t-2&SIP/xlite-loki|27|dtwM(accept-call)") in new stack
   -- Called 1
   -- Called 2
Really destroying SIP dialog '5d54cec70a9330df53d5a27821d34a4d@192.231.102.144' Method: INVITE
[Oct 17 20:51:49] WARNING[14346]: app_dial.c:1242 dial_exec_full: Unable to create channel of type 'SIP' (cause 20 - Unknown)
Really destroying SIP dialog '6770df0c0911962c629184dc13e767fc@192.231.102.144' Method: INVITE
[Oct 17 20:51:49] WARNING[14346]: app_dial.c:1242 dial_exec_full: Unable to create channel of type 'SIP' (cause 20 - Unknown)
Really destroying SIP dialog '0815dc20506c14da39244970158e2734@192.231.102.144' Method: INVITE
[Oct 17 20:51:49] WARNING[14346]: app_dial.c:1242 dial_exec_full: Unable to create channel of type 'SIP' (cause 20 - Unknown)
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Nobody picked up in 27000 ms
   -- Hungup 'Zap/2-1'
   -- Hungup 'Zap/1-1'
   -- Executing [s@incoming-pots:29] NoOp("Zap/3-1", "Local ring => NOANSWER") in new stack
   -- Executing [s@incoming-pots:30] Goto("Zap/3-1", "sw-6-NOANSWER|10") in new stack
   -- Goto (incoming-pots,sw-6-NOANSWER,10)
   -- Executing [sw-6-NOANSWER@incoming-pots:10] Macro("Zap/3-1", "voicemail") in new stack
   -- Executing [s@macro-voicemail:1] Wait("Zap/3-1", "0.5") in new stack
   -- Executing [s@macro-voicemail:2] Playback("Zap/3-1", "sorry2") in new stack
   -- <Zap/3-1> Playing 'sorry2' (language 'en')
   -- Executing [s@macro-voicemail:3] VoiceMail("Zap/3-1", "main@vm-cynics") in new stack
   -- <Zap/3-1> Playing 'vm-intro' (language 'en')
   -- <Zap/3-1> Playing 'beep' (language 'en')
   -- Recording the message
   -- x=0, open writing:  /var/spool/asterisk/voicemail/vm-cynics/main/tmp/NsDsxc format: wav, 0x81e7650
   -- x=1, open writing:  /var/spool/asterisk/voicemail/vm-cynics/main/tmp/NsDsxc format: gsm, 0x81d9410
   -- Recording automatically stopped after a silence of 5 seconds
   -- <Zap/3-1> Playing 'auth-thankyou' (language 'en')
   -- Executing [s@macro-voicemail:4] NoOp("Zap/3-1", "VM Result=SUCCESS") in new stack
   -- Executing [s@macro-voicemail:5] Hangup("Zap/3-1", "") in new stack
 == Spawn extension (macro-voicemail, s, 5) exited non-zero on 'Zap/3-1'
   -- Hungup 'Zap/3-1'
[Oct 17 20:53:02] WARNING[14339]: app_voicemail.c:1710 mm_log: IMAP Warning: Unknown message data: 1 FETCH
[Oct 17 20:53:11] WARNING[14339]: app_voicemail.c:1710 mm_log: IMAP Warning: Unknown message data: 1 FETCH
   -- Starting simple switch on 'Zap/3-1'
   -- Executing [s@incoming-pots:1] Set("Zap/3-1", "~~EXTEN~~=s") in new stack
   -- Executing [s@incoming-pots:2] NoOp("Zap/3-1", "Incoming call CID="KIEHTREIBER AUD" <4083787798>") in new stack
   -- Executing [s@incoming-pots:3] Set("Zap/3-1", "__TRANSFER_CONTEXT=local-functions") in new stack
   -- Executing [s@incoming-pots:4] Answer("Zap/3-1", "") in new stack
   -- Executing [s@incoming-pots:5] Set("Zap/3-1", "LASTCALLER=4083787798|g") in new stack
[Oct 17 20:53:33] WARNING[14348]: pbx.c:5910 pbx_builtin_setvar: The use of the 'g' flag is deprecated.  Please use Set(GLOBAL(foo)=bar) instead
 == Setting global variable 'LASTCALLER' to '4083787798'
   -- Executing [s@incoming-pots:6] GotoIf("Zap/3-1", "0?7:9") in new stack
   -- Goto (incoming-pots,s,9)
   -- Executing [s@incoming-pots:9] NoOp("Zap/3-1", "Finish if-incoming-1") in new stack
   -- Executing [s@incoming-pots:10] GotoIf("Zap/3-1", "0?11:22") in new stack
   -- Goto (incoming-pots,s,22)
   -- Executing [s@incoming-pots:22] NoOp("Zap/3-1", "Finish if-incoming-2") in new stack
   -- Executing [s@incoming-pots:23] Set("Zap/3-1", "blstate=") in new stack
   -- Executing [s@incoming-pots:24] GotoIf("Zap/3-1", "?25:27") in new stack
   -- Goto (incoming-pots,s,27)
   -- Executing [s@incoming-pots:27] NoOp("Zap/3-1", "Finish if-incoming-4") in new stack
   -- Executing [s@incoming-pots:28] Dial("Zap/3-1", "Zap/1&Zap/2&SIP/pap2t-1&SIP/pap2t-2&SIP/xlite-loki|27|dtwM(accept-call)") in new stack
   -- Called 1
   -- Called 2
Really destroying SIP dialog '0743c6e6264ea8f92147de6c376926f0@192.231.102.144' Method: INVITE
[Oct 17 20:53:33] WARNING[14348]: app_dial.c:1242 dial_exec_full: Unable to create channel of type 'SIP' (cause 20 - Unknown)
Really destroying SIP dialog '6d56235e2e5d5f3b085f7caa2ee4321d@192.231.102.144' Method: INVITE
[Oct 17 20:53:33] WARNING[14348]: app_dial.c:1242 dial_exec_full: Unable to create channel of type 'SIP' (cause 20 - Unknown)
Really destroying SIP dialog '0ac8129265c0eb8d6b3617426f6ee19f@192.231.102.144' Method: INVITE
[Oct 17 20:53:33] WARNING[14348]: app_dial.c:1242 dial_exec_full: Unable to create channel of type 'SIP' (cause 20 - Unknown)
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Nobody picked up in 27000 ms
   -- Hungup 'Zap/2-1'
   -- Hungup 'Zap/1-1'
   -- Executing [s@incoming-pots:29] NoOp("Zap/3-1", "Local ring => NOANSWER") in new stack
   -- Executing [s@incoming-pots:30] Goto("Zap/3-1", "sw-6-NOANSWER|10") in new stack
   -- Goto (incoming-pots,sw-6-NOANSWER,10)
   -- Executing [sw-6-NOANSWER@incoming-pots:10] Macro("Zap/3-1", "voicemail") in new stack
   -- Executing [s@macro-voicemail:1] Wait("Zap/3-1", "0.5") in new stack
   -- Executing [s@macro-voicemail:2] Playback("Zap/3-1", "sorry2") in new stack
   -- <Zap/3-1> Playing 'sorry2' (language 'en')
   -- Executing [s@macro-voicemail:3] VoiceMail("Zap/3-1", "main@vm-cynics") in new stack
   -- <Zap/3-1> Playing 'vm-intro' (language 'en')
   -- <Zap/3-1> Playing 'beep' (language 'en')
   -- Recording the message
   -- x=0, open writing:  /var/spool/asterisk/voicemail/vm-cynics/main/tmp/dd6Pvp format: wav, 0x81d8670
   -- x=1, open writing:  /var/spool/asterisk/voicemail/vm-cynics/main/tmp/dd6Pvp format: gsm, 0x81d89f0
   -- Recording automatically stopped after a silence of 5 seconds
   -- <Zap/3-1> Playing 'auth-thankyou' (language 'en')
   -- Executing [s@macro-voicemail:4] NoOp("Zap/3-1", "VM Result=SUCCESS") in new stack
   -- Executing [s@macro-voicemail:5] Hangup("Zap/3-1", "") in new stack
 == Spawn extension (macro-voicemail, s, 5) exited non-zero on 'Zap/3-1'
   -- Hungup 'Zap/3-1'
   -- Starting simple switch on 'Zap/3-1'
   -- Executing [s@incoming-pots:1] Set("Zap/3-1", "~~EXTEN~~=s") in new stack
   -- Executing [s@incoming-pots:2] NoOp("Zap/3-1", "Incoming call CID="KIEHTREIBER AUD" <4083787798>") in new stack
   -- Executing [s@incoming-pots:3] Set("Zap/3-1", "__TRANSFER_CONTEXT=local-functions") in new stack
   -- Executing [s@incoming-pots:4] Answer("Zap/3-1", "") in new stack
   -- Executing [s@incoming-pots:5] Set("Zap/3-1", "LASTCALLER=4083787798|g") in new stack
[Oct 17 20:55:55] WARNING[14350]: pbx.c:5910 pbx_builtin_setvar: The use of the 'g' flag is deprecated.  Please use Set(GLOBAL(foo)=bar) instead
 == Setting global variable 'LASTCALLER' to '4083787798'
   -- Executing [s@incoming-pots:6] GotoIf("Zap/3-1", "0?7:9") in new stack
   -- Goto (incoming-pots,s,9)
   -- Executing [s@incoming-pots:9] NoOp("Zap/3-1", "Finish if-incoming-1") in new stack
   -- Executing [s@incoming-pots:10] GotoIf("Zap/3-1", "0?11:22") in new stack
   -- Goto (incoming-pots,s,22)
   -- Executing [s@incoming-pots:22] NoOp("Zap/3-1", "Finish if-incoming-2") in new stack
   -- Executing [s@incoming-pots:23] Set("Zap/3-1", "blstate=") in new stack
   -- Executing [s@incoming-pots:24] GotoIf("Zap/3-1", "?25:27") in new stack
   -- Goto (incoming-pots,s,27)
   -- Executing [s@incoming-pots:27] NoOp("Zap/3-1", "Finish if-incoming-4") in new stack
   -- Executing [s@incoming-pots:28] Dial("Zap/3-1", "Zap/1&Zap/2&SIP/pap2t-1&SIP/pap2t-2&SIP/xlite-loki|27|dtwM(accept-call)") in new stack
   -- Called 1
   -- Called 2
Really destroying SIP dialog '6e4e3baf7bf6a3250065f6095dbac4d0@192.231.102.144' Method: INVITE
[Oct 17 20:55:55] WARNING[14350]: app_dial.c:1242 dial_exec_full: Unable to create channel of type 'SIP' (cause 20 - Unknown)
Really destroying SIP dialog '023055d0486d6f8a6fb731e6572810f2@192.231.102.144' Method: INVITE
[Oct 17 20:55:55] WARNING[14350]: app_dial.c:1242 dial_exec_full: Unable to create channel of type 'SIP' (cause 20 - Unknown)
Really destroying SIP dialog '21a019bb0f905d8c30d436ce38ea12ee@192.231.102.144' Method: INVITE
[Oct 17 20:55:55] WARNING[14350]: app_dial.c:1242 dial_exec_full: Unable to create channel of type 'SIP' (cause 20 - Unknown)
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Zap/1-1 is ringing
   -- Zap/2-1 is ringing
   -- Nobody picked up in 27000 ms
   -- Hungup 'Zap/2-1'
   -- Hungup 'Zap/1-1'
   -- Executing [s@incoming-pots:29] NoOp("Zap/3-1", "Local ring => NOANSWER") in new stack
   -- Executing [s@incoming-pots:30] Goto("Zap/3-1", "sw-6-NOANSWER|10") in new stack
   -- Goto (incoming-pots,sw-6-NOANSWER,10)
   -- Executing [sw-6-NOANSWER@incoming-pots:10] Macro("Zap/3-1", "voicemail") in new stack
   -- Executing [s@macro-voicemail:1] Wait("Zap/3-1", "0.5") in new stack
   -- Executing [s@macro-voicemail:2] Playback("Zap/3-1", "sorry2") in new stack
   -- <Zap/3-1> Playing 'sorry2' (language 'en')
   -- Executing [s@macro-voicemail:3] VoiceMail("Zap/3-1", "main@vm-cynics") in new stack
   -- <Zap/3-1> Playing 'vm-intro' (language 'en')
   -- <Zap/3-1> Playing 'beep' (language 'en')
   -- Recording the message
   -- x=0, open writing:  /var/spool/asterisk/voicemail/vm-cynics/main/tmp/rjKRjM format: wav, 0x81d96f8
   -- x=1, open writing:  /var/spool/asterisk/voicemail/vm-cynics/main/tmp/rjKRjM format: gsm, 0x81e7c78
   -- Recording automatically stopped after a silence of 5 seconds
   -- <Zap/3-1> Playing 'auth-thankyou' (language 'en')
[Oct 17 20:59:00] WARNING[14350]: app_voicemail.c:1710 mm_log: IMAP Warning: Unknown message data: 2 FETCH
[Oct 17 20:59:00] WARNING[14350]: app_voicemail.c:1710 mm_log: IMAP Warning: Unexpected tagged response: Voicemail FROM "KIEHTREIBER AUD" <4083787798> (("Asterisk PBX" NIL "voicemail" "cynic.org")) ((
[Oct 17 20:59:00] WARNING[14350]: app_voicemail.c:1710 mm_log: IMAP Warning: IMAP server sent a blank line
[Oct 17 20:59:00] WARNING[14350]: app_voicemail.c:1710 mm_log: IMAP Warning: Missing IMAP reply key: )
[Oct 17 20:59:00] WARNING[14350]: app_voicemail.c:1710 mm_log: IMAP Warning: Unexpected tagged response: 00000d2e OK Completed (0.000 sec)
   -- Executing [s@macro-voicemail:4] NoOp("Zap/3-1", "VM Result=SUCCESS") in new stack
   -- Executing [s@macro-voicemail:5] Hangup("Zap/3-1", "") in new stack
 == Spawn extension (macro-voicemail, s, 5) exited non-zero on 'Zap/3-1'
   -- Hungup 'Zap/3-1'
[Oct 17 21:00:00] ERROR[14339]: app_voicemail.c:1773 mm_fatal: IMAP access FATAL error: Unlock when not locked


And here is the gdb traceback you've been waiting for. Note that c-client is still optimized, so some tail-call optimizations happened there:

[hermod:/files/src/asterisk-1.4.22] root# gdb
GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i486-linux-gnu".

(gdb) attach 14329
Attaching to process 14329
Reading symbols from /usr/sbin/asterisk...done.
Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
Reading symbols from /lib/tls/i686/cmov/libdl.so.2...done.
Loaded symbols for /lib/tls/i686/cmov/libdl.so.2
Reading symbols from /lib/tls/i686/cmov/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread -1210258592 (LWP 14329)]
[New Thread -1222538320 (LWP 14340)]
[New Thread -1221047376 (LWP 14339)]
[New Thread -1217889360 (LWP 14338)]
[New Thread -1217234000 (LWP 14337)]
[New Thread -1216386128 (LWP 14336)]
[New Thread -1214084176 (LWP 14335)]
[New Thread -1210999888 (LWP 14334)]
[New Thread -1210754128 (LWP 14333)]
[New Thread -1210508368 (LWP 14332)]
[New Thread -1210262608 (LWP 14331)]
Loaded symbols for /lib/tls/i686/cmov/libpthread.so.0
Reading symbols from /lib/libncurses.so.5...done.
[... lots more "reading symbols" messages ...]
(gdb) b abort
Breakpoint 1 at 0xb7df8eb6
(gdb) b exit
Breakpoint 2 at 0xb7dfa2e6
(gdb) b _exit
Breakpoint 3 at 0xb7e5cdf4
(gdb) c
Continuing.
[New Thread -1222829136 (LWP 14346)]
[Thread -1222829136 (LWP 14346) exited]
[New Thread -1222829136 (LWP 14348)]
[Thread -1222829136 (LWP 14348) exited]
[New Thread -1222829136 (LWP 14350)]
[Switching to Thread -1221047376 (LWP 14339)]

Breakpoint 1, 0xb7df8eb6 in abort () from /lib/tls/i686/cmov/libc.so.6
(gdb) info threads
 14 Thread -1222829136 (LWP 14350)  0xffffe410 in __kernel_vsyscall ()
 11 Thread -1210262608 (LWP 14331)  0xffffe410 in __kernel_vsyscall ()
 10 Thread -1210508368 (LWP 14332)  0xffffe410 in __kernel_vsyscall ()
 9 Thread -1210754128 (LWP 14333)  0xffffe410 in __kernel_vsyscall ()
 8 Thread -1210999888 (LWP 14334)  0xffffe410 in __kernel_vsyscall ()
 7 Thread -1214084176 (LWP 14335)  0xffffe410 in __kernel_vsyscall ()
 6 Thread -1216386128 (LWP 14336)  0xffffe410 in __kernel_vsyscall ()
 5 Thread -1217234000 (LWP 14337)  0xffffe410 in __kernel_vsyscall ()
 4 Thread -1217889360 (LWP 14338)  0xffffe410 in __kernel_vsyscall ()
* 3 Thread -1221047376 (LWP 14339)  0xb7df8eb6 in abort ()
  from /lib/tls/i686/cmov/libc.so.6
 2 Thread -1222538320 (LWP 14340)  0xffffe410 in __kernel_vsyscall ()
 1 Thread -1210258592 (LWP 14329)  0xffffe410 in __kernel_vsyscall ()
(gdb) bt
#0  0xb7df8eb6 in abort () from /lib/tls/i686/cmov/libc.so.6
#1  0xb7440ee0 in fatal (string=0xb74b4ce8 "Unlock when not locked") at ftl_unix.c:38
#2  0xb7477362 in imap_send (stream=0x81cead8, cmd=0xb7515df0 "FETCH", args=0xb73837a8)
   at imap4r1.c:3115
#3  0xb74776f0 in imap_fetch (stream=0x81cead8, sequence=0x81e7860 "2", flags=128)
   at imap4r1.c:5563
#4  0xb747a377 in imap_search (stream=0x81cead8, charset=0x0, pgm=0x81e7870,
   flags=<value optimized out>) at imap4r1.c:2089
ASTERISK-1  0xb744d21e in mail_search_full (stream=0x81cead8, charset=0x0, pgm=0x81e7870, flags=0)
   at mail.c:2351
ASTERISK-2  0xb7423d4a in messagecount (context=0xb7383e6d "vm-cynics",
   mailbox=0xb7383e68 "main", folder=0x0) at app_voicemail.c:1223
ASTERISK-3  0xb74248c0 in has_voicemail (mailbox=0x81b79e8 "main@vm-cynics", folder=0x0)
   at app_voicemail.c:1393
ASTERISK-4  0x0805eaf2 in ast_app_has_voicemail (mailbox=0x81b79e8 "main@vm-cynics", folder=0x0)
   at app.c:172
ASTERISK-5  0xb73d282c in do_monitor (data=0x0) at chan_dahdi.c:7024
ASTERISK-6 0x0810055f in dummy_start (data=0x819b040) at utils.c:912
ASTERISK-7 0xb7f7f240 in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
ASTERISK-8 0xb7e9a49e in clone () from /lib/tls/i686/cmov/libc.so.6

By: Perry The Cynic (perry the cynic) 2008-10-19 15:47:06

And another IMAP abort, with a somewhat different traceback. This was leaving an about 2:10 long message and letting it time out (5 seconds quiet). See asterisk-2.gdb. Note that I just switched from imap-2006i to imap-2007a (the new abort is with the 2007a version). Sadly, that doesn't seem to have helped.

The debug log leading up to this was

   -- Zap/2-1 is ringing
   -- Nobody picked up in 27000 ms
   -- Hungup 'Zap/2-1'
   -- Hungup 'Zap/1-1'
   -- Executing [s@incoming-pots:29] NoOp("Zap/3-1", "Local ring => NOANSWER") in new stack
   -- Executing [s@incoming-pots:30] Goto("Zap/3-1", "sw-6-NOANSWER|10") in new stack
   -- Goto (incoming-pots,sw-6-NOANSWER,10)
   -- Executing [sw-6-NOANSWER@incoming-pots:10] Macro("Zap/3-1", "voicemail") in new stack
   -- Executing [s@macro-voicemail:1] Wait("Zap/3-1", "0.5") in new stack
   -- Executing [s@macro-voicemail:2] Playback("Zap/3-1", "sorry2") in new stack
   -- <Zap/3-1> Playing 'sorry2' (language 'en')
   -- Executing [s@macro-voicemail:3] VoiceMail("Zap/3-1", "main@vm-cynics") in new stack
   -- <Zap/3-1> Playing 'vm-intro' (language 'en')
   -- <Zap/3-1> Playing 'beep' (language 'en')
   -- Recording the message
   -- x=0, open writing:  /var/spool/asterisk/voicemail/vm-cynics/main/tmp/z5Wp56 format: wav, 0x81e8508
   -- x=1, open writing:  /var/spool/asterisk/voicemail/vm-cynics/main/tmp/z5Wp56 format: gsm, 0x81e8720
   -- Recording automatically stopped after a silence of 5 seconds
   -- <Zap/3-1> Playing 'auth-thankyou' (language 'en')
[Oct 19 13:49:59] WARNING[25759]: app_voicemail.c:1710 mm_log: IMAP Warning: Unexpected tagged response: 00005a2b OK [APPENDUID 1218413773 86] Completed
[Oct 19 13:50:58] ERROR[25778]: app_voicemail.c:1773 mm_fatal: IMAP access FATAL error: Unlock when not locked

By: Perry The Cynic (perry the cynic) 2008-10-19 17:30:59

Okay, I spent another two hours messing around with this. At this point I'll declare the problem fully reproducible (as far as my system is concerned, at least).
The procedure (works each time) is to leave a 2:00 long message (I play Enya into the phone :-). Short messages (<15 seconds) work fine every time.
Check out the new files asterisk-4.log and asterisk-4.gdb.txt. This shows a 2:00+ message being left. When I stop talking, asterisk tries to deliver the message. Something goes wrong (Unexpected tagged response), and asterisk gets hung up in IMAP code - the line is still busy and connected (we haven't hung up). We sat like this for at least 10 seconds. So I broke in the debugger and pulled debug logs.
Once I told gdb to continue, it hit the abort() call almost immediately. This is reproducible; the hang during delivery can be turned into the IMAP abort() just by breaking in gdb and immediately continuing. (Some thread reordering effect, probably.)
I pulled tracebacks for the moment-of-death, too.

At this point, IMAP voicemail delivery is effectively useless for me; I'll have to turn it off to survive - with IMAP enabled, anyone who leaves a long voice mail crashes asterisk (sometimes after leaving the line up for a few hours as the IMAP delivery hangs itself senseless, which of course makes the system useless as well, since our POTS line is busied out).

I gather IMAP mailboxes actually *work* for people, mostly? I wonder what's different in my setup...

By: Mark Michelson (mmichelson) 2008-10-20 14:41:00

Thanks for all your hard work in trying to locate the issue here. I will attempt a reproduction of the long message test as I don't think I've ever done that when testing IMAP voicemail features.

In one of your notes, you mentioned about making calls to c-client library functions from within callbacks. I've made the mistake of trying that locally, but that actually usually gives the "lock when already locked" error instead of the "unlock when not locked" error. The reason is that the c-client in both cases calls mail_lock() as part of the library call and does not use recursive mutexes, which leads to a crash.

There's another issue open right now regarding IMAP storage, issue ASTERISK-12855. In it, the reporter suggests using mutex locks around all calls to the c-client mail_* calls. If you wouldn't mind trying it, the second patch attached to that issue adds these locks. If you wouldn't mind, I would be very interested in hearing test results from trying this patch out.

By: Perry The Cynic (perry the cynic) 2008-10-20 22:48:59

Okay, I grabbed  asterisk-1.4.21.2-appvoicemail-sharedimap-lock.patch, patched it into the 1.4.21.2 I used before switching to 1.4.22, and tried that. The IMAP hangs and crashes SEEM TO BE GONE with that code; I can leave 4:00 messages without trouble. I'll leave that version installed and run with it for a day or two to see if our luck is holding up.
I agree that re-entering the c-client code ought to get us "lock when already locked" rather than "unlock when not locked" errors. I have no explanation of why we're getting the latter. I have *never* seem "lock when already locked" errors, ever.

Note that the various dangerous-looking c-client warning messages also seem to have disappeared. I'm still seeing
     WARNING[12838]: app_voicemail.c:8765 mm_log: IMAP Warning: Unknown message data: 5 FETCH
whenever a message is marked for deletion, and
     WARNING[12838]: app_voicemail.c:8765 mm_log: IMAP Warning: Unknown message data: 1 EXPUNGE
whenever I expunge a deleted message. I don't know whether these are worrisome or expected; they don't seem to bother asterisk's operation.

Thanks, and tentative cheers
 -- perry

By: Leif Madsen (lmadsen) 2008-10-21 14:30:18

Thanks Perry so much for the hard work on this. Switching back to Acknowledged while waiting for the a developer to move this forward.

By: Tilghman Lesher (tilghman) 2008-12-18 16:00:17.000-0600

I am closing this issue out as the patch which fixed the crashes here has been committed to 1.4 and up.