[Home]

Summary:ASTERISK-10111: IMAP integration with MS Exchange crashes Asterisk-1.4.10.1
Reporter:Enrique1 (ewilhelmsen)Labels:
Date Opened:2007-08-17 12:59:51Date Closed:2008-03-27 12:13:05
Priority:CriticalRegression?No
Status:Closed/CompleteComponents:Applications/app_voicemail
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 10487.patch
Description:When setting voicemail.conf to MS Exchange IMAP server I am able to record a message and it will send to the users Exchange Inbox but when MS Exchange sends an IMAP response to Asterisk it does not know the tagged response for SEARCH and NOOP. After this Warning:

[Aug 17 10:59:19] WARNING[26191]: app_voicemail.c:8572 mm_log: IMAP Warning: Unexpected tagged response: 0000003a OK SEARCH completed.
[Aug 17 10:59:19] WARNING[26204]: app_voicemail.c:8572 mm_log: IMAP Warning: Unexpected tagged response: 0000003e OK NOOP
completed.

Asterisk crashes and needs to be Ctrl-C to exit and restarted.
Comments:By: Mark Michelson (mmichelson) 2007-08-17 15:08:09

Hmmm, this is interesting. It's the IMAP c-client that generates that message you're seeing, and Asterisk relays it to the console. Could you please provide a backtrace and bt full for when Asterisk becomes unresponsive?

Thanks for bringing this issue to our attention.

By: Enrique1 (ewilhelmsen) 2007-08-17 15:22:04

I will do this as soon as I can, hopefully in a few hours.  I want to try and help you resolve this.  
Thanks,
Erik

By: Enrique1 (ewilhelmsen) 2007-08-17 15:49:55

I just tried to get a dump. The issue is that it never dumps. I just hangs.  I have to Ctrl-C to stop Asterisk.  I looked in /tmp and there are no dump files.  Is there a way to pull a backtrace without a dump?

By: Mark Michelson (mmichelson) 2007-08-17 15:52:17

Yes. You can use the kill command to send Asterisk a SIGSEGV signal. This will segfault Asterisk and cause it to generate a core file.

By: rayjay (rayjay) 2007-08-19 03:27:43

I am running Asterisk 1.4.10.1 with an IMAP backend for VM and have experienced exactly the same issue using Courier-IMAP as my backend.  Asterisk just hangs.  You connect up to the console and see nothing like something has been locked somewhere.  The only way to get calls going again is to restart Asterisk.  I haven't managed to track down what is happening just yet and my generated core file simply said:

Program terminated with signal 6, Aborted.
#0  0xb7d55947 in raise () from /lib/tls/libc.so.6                                                                                                                                

This is a pretty serious problem for us which has happened twice now in the last week.  It is worse than a crash as safe_asterisk restarts asterisk in that case, whereas in this case it requires manual intervention to get things going again!

By: Mark Michelson (mmichelson) 2007-08-20 08:30:55

rayjay, see if you can get a better core file by sending signal 11 instead.

kill -11 <asterisk's pid>

By: rayjay (rayjay) 2007-08-20 20:19:40

Ok, I will kill with -11 flag next time and let you know the results.  By the way the hanging has reoccured 3 times now and that was after going back to 1.4.9 code, so the bug may be older than just 1.4.10.1.  By the way, I am compiling against UW_IMAP 2006j - is this okay or do you recommend an earlier version?

By: Mark Michelson (mmichelson) 2007-08-21 08:22:48

Well, I can't necessarily recommend any specific version of UW IMAP, but if you have the ability to, could you see if 2006i or 2006h also cause the crash?

Thanks very much.

By: Enrique1 (ewilhelmsen) 2007-08-22 13:03:58

I just rebuilt my asterisk box with the newest version 1.4.11 and c-client version imap-2006h and I still get asterisk hanging intermittently and I have to force an exit and restart to get it to respond again. So it is not the C-Client version

By: Olivier Krief (okrief) 2007-08-22 13:56:07

With 1.4.11, we ran into a problem which looks similar : Asterisk seems to be frozen and wouldn't even switch audio calls.

When using a different IMAP server, things ran perfectly.

Differences between both IMAP servers are :
1- one is hosted and managed by 1&1 in Germany (we are located in France)
2- one is hosted on the same LAN

The good one is the 2nd one.

On the "bad" side:
When a caller tries to leave a voicemail, * sends the voicemail to the imap server.
Using "netstat", the number of packets in the out-q of the imap connection grows... but the in-q of the SIP port grows even faster!

By: rayjay (rayjay) 2007-08-22 22:35:30

Okay, after 2 more days I got a hang again... The last lines in the console I see before the hang were:

[Aug 23 15:38:51] WARNING[7379]: app_voicemail.c:8572 mm_log: IMAP Warning: Unexpected tagged response: 00004715 OK NOOP completed
[Aug 23 15:38:51] WARNING[18852]: app_voicemail.c:8572 mm_log: IMAP Warning: Unexpected tagged response: 00004719 OK SEARCH done.

So, similar to the above debug!  The console again was responsive but all my debug and SIP processing stopped and I believe the in-progress calls stayed up fine.  A kill happily then tore down all the calls as normal with BYE messages to the clients, so Asterisk didn't crash as such - he just got locked up somehow?

I'm running courier-imap.  Does anyone think moving to Dovecot might help?

I did a kill -11 on Asterisk but still my core debug only shows:

Core was generated by `/usr/sbin/asterisk -U asterisk -G asterisk -vvvg -c'.
Program terminated with signal 11, Segmentation fault.
#0  0xb7ef1401 in __read_nocancel () from /lib/tls/libpthread.so.0

so no clues in the core, probably as Asterisk didn't actually crash it just got stuck in a bad state?

By: rayjay (rayjay) 2007-08-23 06:09:14

@okrief - can I ask what Mail server you are running on your LAN which does not exhibit any problems?  I am considering migrating from courier-imap to dovecot to see if that helps resolve the issue...

By: rayjay (rayjay) 2007-08-23 23:31:52

I've had a little look into the problem and it appears to be related to Asterisk not being able to handle an 'Unexpected Tagged response' event in the UW-IMAP code (imap4r1.c).  It seems that the reply->tag is not the same as the tag passed to imap_reply, so UW-IMAP reports an error, sets the stream to unhealthy and returns a "[CLOSED] IMAP connection broken" event.

I always seem to get Asterisk to hang when this happens according to the log.  It seems Asterisk can't handle the unexpected here and gets stuck.  My guess is that it is happening via a 'mail_search_full' call in the app_voicemail.c code.  I will keep investigating and try to replicate the issue.

By: Olivier Krief (okrief) 2007-08-27 01:27:30

rayray :

courier-imap v0.58-4 (debian) with Maildir and qmail-ldap-1.03 (ldap v20060201 controls v20060403b)

By: Mark Michelson (mmichelson) 2007-08-30 15:43:19

Hey, everyone. Sorry that progress isn't being made on this issue, but since I am unable to reproduce the issue locally, it's not possible to fix this without more output from a failure case.

First off, to those who have attempted killing Asterisk with the -11 signal, do you have the DONT_OPTIMIZE compiler flag selected in menuselect? Selecting this flag will, in general, generate more useful backtraces. Even in a non-crash situation, a backtrace (more specifically a "thread apply all bt full") can be extremely helpful so that it can be seen which threads are waiting for locks.

Also, for those of you running version 1.4.11, if you compile using the DEBUG_THREADS compiler flag in menuselect, then when the lockup happens, you can issue the command "core show locks" on the CLI and give the output. I'm not confident it will help a whole lot in this issue, but in general, deadlocks can be debugged somewhat easier with this output.

Now, if those two methods don't generate any better data, then the best we can do is to use debug and verbose log information from the time around the crash. So far, I've seen the two lines about the unexpected tagged response, but it would help to see other debug messages from around the time so that I can see exactly which part of voicemail code is causing that message to occur.

I appreciate everyone's input on this, and hope we can get this settled soon!

By: Olivier Krief (okrief) 2007-08-31 01:16:22

Maybe the crash we met comes from a different issue than the one ewilhelmsen opened.

Anyway, our was easy to reproduce and work around when using or not an hosted IMAP server.
Would you like me set the machine up again and give someone an ssh to access the box ?

By: Olivier Krief (okrief) 2007-08-31 01:20:49

To further underline I may be hijacking a bug report, I'm using no Exchange server at all.
Should I open another bug report now or shall we use this one up to the moment we've found something ?

By: Mark Michelson (mmichelson) 2007-08-31 08:58:22

okrief: Does your problem exhibit the same symptoms as the original reported problem (i.e. warnings about unexpected tagged responses, followed by a lock-up)? Given what rayjay has reported, he's seeing the same issue using courier-imap, and so I'm willing to bet this isn't an Exchange-only problem. I would not recommend opening a new bug report unless your issue is much different from the original reported issue.

By: Enrique1 (ewilhelmsen) 2007-08-31 09:31:11

This is the same issue on both Exchange and Courier IMAP.  I have tried both servers and have gotten the same Warnings about unexpected tagged responses followed by a lockup of asterisk.
I will be working on this more today and try to get a dump.

By: rayjay (rayjay) 2007-09-03 06:26:59

I concur that this is not an "Exchange only" problem.  We have had several hangs with Courier IMAP now.  Although none for the last 5 or so days.  I'm not sure if this is due to the fact that Asterisk has been restarted manually a couple of times during that time or whether it is due to changes on the IMAP server to increase idle timeouts etc.

I'm not sure if it is related or not but in my testing when configuring up a Dovecot server I can get Asterisk to crash every time by getting Dovecot to allow both PLAIN and LOGIN authentication.  When I do this Asterisk core dumps immediately when trying to connect to the IMAP server immediately after a 'Security Problem' warning line from UW-IMAP auth_pla.c code:

[Sep  3 23:35:52] WARNING[31741]: app_voicemail.c:8594 mm_log: IMAP Warning: SECURITY PROBLEM: insecure server advertised AUTH=PLAIN
myhost*CLI>
Disconnected from Asterisk server
/usr/sbin/safe_asterisk: line 111: 31515 Segmentation fault      (core dumped) nice -n $PRIORITY ${ASTSBINDIR}/asterisk ${CLIARGS} ${ASTARGS} >&/dev/${TTY} </dev/${TTY}
Asterisk ended with exit status 139
Asterisk exited on signal 11.

Again, I wouldn't expect Asterisk to hang/core dump on the back of a IMAP warning response from UW-IMAP which appears to be the case with the 'Unexpected tagged response' too?  I would expect Asterisk rather to simply fail the Voicemail connection.

By: rayjay (rayjay) 2007-09-06 04:33:49

Here is a backtrace from the latest crash.  I think the key part is the 'Lock when already locked' message which seems to stem from the 'mail_search_full' call in 'messagecount' function (called by inboxcount).  I hope this helps track down the problem?

I've removed the hostname and user name for privacy reasons...

# 0  0xb7cb2947 in raise () from /lib/tls/libc.so.6
# 1  0xb7cb40c9 in abort () from /lib/tls/libc.so.6
# 2  0xb71d9aa0 in fatal (string=0xb6e853f4 "Lock when already locked, mbx={myhost:143/imap/novalidate-cert/user=\"myuser\"}INBOX")
   at ftl_unix.c:38
# 3  0xb71e619d in mail_lock (stream=0xb6cdb360) at mail.c:3177
# 4  0xb720fcd2 in imap_send (stream=0xb6cdb360, cmd=0xb72adf07 "SEARCH", args=0xb6e85d20) at imap4r1.c:2902
# 5  0xb72135f2 in imap_search (stream=0xb6cdb360, charset=0x0, pgm=0xb65776b8, flags=0) at imap4r1.c:2029
# 6  0xb71e64ee in mail_search_full (stream=0xb6cdb360, charset=0x0, pgm=0xb65776b8, flags=0) at mail.c:2351
# 7  0xb71c8f9b in messagecount (context=0xb72466de "default", mailbox=0xb65c8690 "myuser", folder=0xb724d1a7 "INBOX") at app_voicemail.c:2457
# 8  0xb71c9c4f in inboxcount (mailbox_context=0xb65c8690 "myuser", newmsgs=0xb6e88318, oldmsgs=0xb6e88314) at app_voicemail.c:2520
# 9  0xb6ec305b in sip_send_mwi_to_peer (peer=0xb65c80a0) at chan_sip.c:15164
# 10 0xb6ecf960 in do_monitor (data=0x0) at chan_sip.c:15370
# 11 0x08132651 in dummy_start (data=0x821ca50) at utils.c:775
# 12 0xb7f2b0bd in start_thread () from /lib/tls/libpthread.so.0
# 13 0xb7d559ee in clone () from /lib/tls/libc.so.6



By: Mark Michelson (mmichelson) 2007-09-06 11:31:28

Thanks very much for providing a backtrace.

Interesting. So we're looking at a problem where mail_lock attempts to lock the same stream twice. A possible solution to this would be to add some local locking in app_voicemail before any calls to c-client mail_* functions so that there's no possibility of such locking failures. I'll work on getting this implemented and tested locally and will have a patch posted when it's ready.

By: Mark Michelson (mmichelson) 2007-09-06 13:24:36

I've uploaded a patch which adds locking in voicemail around calls to c-client's mail functions. See if this clears up any of the problems experienced.

The patch uploaded is against the latest SVN checkout of 1.4, so it will most likely not apply cleanly if you attempt to patch a version that is out of date.



By: rayjay (rayjay) 2007-09-10 05:30:17

Some feedback on the patch so far...

The good news is that I haven't had any crashes or need to restart Asterisk since applying the patch yet (although it is still fairly early days).  One thing I have noticed though is the locking between  mailbox queries sometimes seems to last for several seconds - during which time no signalling/debug occurs in my console.  

What I think is happening is that some of my mailboxes are fairly large (100+ voice messages) and since Courier isn't the fastest mail system at retrieving messages from larger mailboxes, the mail_search queries often can take several seconds particularly if not cached by the mail file system.

Is there a way of not locking everything up (including SIP signalling) while awaiting a response from the mail system?  I am also doing MWI queries regularly  which adds more mailbox 'queries' to the mix.  One other thing that I noticed:  If you have a failed login between Asterisk and the IMAP server, typically the IMAP server will 'sleep' for 5 seconds before returning an authentication failure (I think this is standard practice with IMAP servers to avoid dictionary attacks etc.)  When this auth failure occurs Asterisk also seems to lock up and all other call attempts etc. seem to hang waiting for the lock to release.  I think there might be a more fundamental issue in using locking when relying on an external resource like an IMAP mail server which may have long search times, failed auth problems etc.

I realise this may have gone wildly off-topic.  If you think my IMAP/locking concerns should be taken elsewhere then let me know.  For now though your patch appears to be achieving *its* job of stopping the crashes/hangs.

By: Enrique1 (ewilhelmsen) 2007-09-10 07:40:13

I tried this patch for the past several days and I also noticed that asterisk locks up SIP during queries to the IMAP server.  I had to go back to dovecot because this was unusable to make a receive calls even in my 5 user test environment.  I think it is getting closer though as Asterisk did not lockup at all after voicemails were left and checked several times by multiple SIP accounts.  Thanks again for working on this putnopvut.

By: Mark Michelson (mmichelson) 2007-09-10 09:16:01

Thank you very much for the feedback.

I'm glad to hear that the patch appears to be doing what it's supposed to, but those delays are unacceptable. I want to work on optimizing the patch so that it does not take as long, but first I would like to hear more confirmation that this patch is actually fixing the issue at hand.

By: rayjay (rayjay) 2007-09-10 16:26:26

Actually I think the locking tends to occur more during the 'checkmwi' events when the mailboxes get checked for new messages and the IMAP server receives lots of queries all at once.  I might turn off MWI checking until this is resolved.  It would be nice to have MWI as a push event to Asterisk rather than a polling/pull process to take unnecessary load off IMAP/Asterisk.  Alternatively I might setup an IMAP proxy/cache to speed things up a little.

Thanks for the feedback!

By: Enrique1 (ewilhelmsen) 2007-09-13 12:30:54

Now I am starting to get lockups of Asterisk using Dovecot IMAP.  Before I applied this patch I never got lockups with Dovecot and now Dovecot is locking up.  I think this is a IMAP Asterisk issue and not certain IMAP servers (IE. Dovecot, Courier, Exchange) I now have IMAP turned off and will continue to test patches but will not go back to IMAP until these lockups are resolved.  Thanks for the continued work on this.
Erik

By: rayjay (rayjay) 2007-09-13 16:21:22

I noticed some fairly big lockups and then a couple of days ago I saw a big lockup which forced me to kill -11 asterisk and restart the daemon as I had live customers trying to make calls.  Backtrace in core shows something different this time:

Program terminated with signal 11, Segmentation fault.
#0  __ast_pthread_mutex_unlock (filename=0xb6f5a2e0 "chan_sip.c", lineno=13498, func=0xb6f5f812 "handle_request_invite",
   mutex_name=0xb6f58228 "&p->refer->refer_call->owner->lock", t=0x78) at /usr/src/asterisk/include/asterisk/lock.h:371
371             if (t->reentrancy && (t->thread[t->reentrancy-1] != pthread_self())) {
(gdb) bt
#0  __ast_pthread_mutex_unlock (filename=0xb6f5a2e0 "chan_sip.c", lineno=13498, func=0xb6f5f812 "handle_request_invite",
   mutex_name=0xb6f58228 "&p->refer->refer_call->owner->lock", t=0x78) at /usr/src/asterisk/include/asterisk/lock.h:371
#1  0xb6f46e5b in handle_request_invite (p=0xb65684f0, req=0xb6ee409c, debug=1, seqno=1, sin=0xb6ee52c0, recount=0xb6ee52d0,
   e=0xb6ee42bf "sip:xxxx@xxx.xxx.xxx.xxx", nounlock=0xb6ee52d4) at chan_sip.c:13498
#2  0xb6f4f262 in handle_request (p=0xb65684f0, req=0xb6ee409c, sin=0xb6ee52c0, recount=0xb6ee52d0, nounlock=0xb6ee52d4) at chan_sip.c:14988
#3  0xb6f5149d in sipsock_read (id=0x81c35f8, fd=22, events=1, ignore=0x0) at chan_sip.c:15136
#4  0x080c6cf0 in ast_io_wait (ioc=0x824def0, howlong=1) at io.c:279
ASTERISK-1  0xb6f2c75e in do_monitor (data=0x0) at chan_sip.c:15342
ASTERISK-2  0x08132651 in dummy_start (data=0x821c6b8) at utils.c:775
ASTERISK-3  0xb7fc80bd in start_thread () from /lib/tls/libpthread.so.0
ASTERISK-4  0xb7df29ee in clone () from /lib/tls/libc.so.6

Again, I have protected the user's privacy involved.  I have now set my checkmwi to 3600000 (1000 hours), so I am no longer getting any check MWI polling against the mail server and users must SUBSCRIBE for MWI events now.  Since I effectively disabled checkmwi I have noticed the platform is much more responsive and stable - I have had no locks/hangs occur for over 2 days and the load on the IMAP server has gone from 1.5 to 0.0, so I think that the constant polling of mailboxes by the checkmwi is causing most of the instability.  It doesn't explain the bug away - but disabling checkmwi does seem to have a big impact on how frequently the system hangs or locks up because of a massive reduction in IMAP traffic?  I think the other problem is that checkmwi seems to poll all the mailboxes in one big hit rather than spread over a larger timeframe.

For stability reasons I'm going to have to leave checkmwi off for now, so I'm not sure if I'll get any more crashes/core dumps now...

By: Mark Michelson (mmichelson) 2007-09-13 17:30:03

ewilhelmsen:

Are you saying that Dovecot is crashing with this patch applied, or without?

rayjay:

The backtrace from that lockup indicates some problem with SIP. I'd recommend searching the tracker to see if a similar issue has already been opened, and if not, open a new one. That single backtrace might help some, but a "thread apply all bt" would be more helpful since it appears you are dealing with a deadlock. Also, if you are running Asterisk with DEBUG_THREADS enabled, and you have another lockup, you can issue the command "core show locks" and upload it to the bug report since it can be very useful in debugging deadlocks.

Another question, are you still running the patch I provided, or did you only turn checkmwi off?

By: rayjay (rayjay) 2007-09-13 18:00:02

Yes, I am still running with your patch applied (still on 1.4.10.1) but now checkmwi is effectively disabled.  I will try and get some additional info as requested.  The backtrace did look a lot different to previous hangs/locks which might indicate a separate problem.  Is there any way the 2 could be related though?  i.e. the locking applied around the IMAP connections is affecting the generic SIP locking issues?

I will let you know if I see any further issues in our environment.

By: Mark Michelson (mmichelson) 2007-09-14 09:03:57

I find it highly unlikely that the segfault you saw is related to this voicemail patch, but I've been known to be wrong before :)

I took another look through the code, and it appears that the MWI function will ALWAYS search the entire mailbox for new messages. There should be a method for telling mwi not to do this if there isn't anything new.

Older versions of 1.4 had a way that tried to do this, but since the behavior was broken, I just removed it altogether. I'm going to work on a way of reimplementing this since it seems like it would help to clear up this issue.

By: Enrique1 (ewilhelmsen) 2007-09-14 17:30:45

Dovecot is locking up with the patch removed.  I added the patch and the lockups stopped but the delays curing calls were not tolerable so I removed the patch and now dovecot locks.  Before I applied the patch dovecot never locked up but now after applying the patch and then removing it again Dovecot now locks as often as other IMAP servers.  Weird?  I don't know why I recompiled and installed Asterisk clean and it still locks now.

By: Enrique1 (ewilhelmsen) 2007-09-20 19:23:15

putnopvut, have you had a chance to work on the patch anymore?  I appreciate your hard work in getting as far as we've come.  Just wanted to know if you have had a chance to look at the locking between calls that delays asterisk?  Thanks again!

By: Mark Michelson (mmichelson) 2007-09-21 08:55:47

I worked for a few hours on this Monday, but since then, I've been pulled away from coding in order to prepare presentations for Astricon next week.

Unfortunately, I haven't yet come up with a way that is simple and clean for finding a new message count. The IMAP c-client provides a mail_ping function which is made for such a purpose, but its use is geared more towards an application which keeps a continuous connection to an IMAP mailbox and "pings" it occasionally to see if there are new messages. Because of the way Asterisk opens and closes connections as needed, pinging the box doesn't actually do anything. What I figured out from my tests on Monday was that simply opening a mailbox could be enough to figure out if the mailbox has new messages, but it will take some testing to figure out what exactly to look for to indicate that there are new messages, plus more testing to figure out if my changes work.

I will update this issue with a note this afternoon with my progress. Sorry for not leaving any notes for a while.

By: Mark Michelson (mmichelson) 2007-09-22 00:44:34

Thanks for bearing with me on this. I didn't make much progress on the issue today. Since I am spending next week at Astricon, I most likely will not get an opportunity to work on this while I am there.

By: Mark Michelson (mmichelson) 2007-10-01 11:01:10

I've isolated a way to figure out how many messages are present by simply opening the mailbox (without having to perform a search). Recoding this will be a bit tricky and will require some testing. When I have a patch I will post it. This issue, by the way, is my top priority at this point, so you don't have to worry about me working on other issues.



By: Mark Michelson (mmichelson) 2007-10-02 15:43:45

Bad news. My last note I sent out is simply not true. I was misinterpreting debug output on the console, and when I followed up on that, I found that I was completely wrong about what I was seeing.

I spent all day yesterday and today working on this, and I've come to the conclusion that it's just not possible to accurately get MWI information without searching the entire mailbox (or doing a similarly expensive operation). The reason for this is that in the case of IMAP, changes to the mailbox can be made outside of Asterisk, and aside from searching the mailbox, there's no way for Asterisk to know what changes have been made.

I came up with a few ideas for speeding things up, but the problem is that none of these local fixes could detect changes that occur due to a user checking voicemail via an e-mail client.

As such, I really have no option but to keep the code as is, but recommend that either 1) you disable MWI or 2) you set your MWI interval to be longer than the default. I don't think you need to go to quite the same lengths that rayjay did, but maybe set MWI to be checked every 2 or 3 minutes and things should hopefully not be as slow moving.

I hate having to compromise like this, but aside from a total rewrite of how Asterisk's IMAP storage works, this is the only solution I can offer.

I'm going to leave this issue open for a while longer in case the MWI configuration changes recommended don't fix things.

By: rayjay (rayjay) 2007-10-02 16:43:32

I think one of the main problems with the MWI checking was that it seemed to all happen in bursts (i.e. all the mailboxes checked at the same time).  I might be wrong but that was what it looked like when I did a trace.  It would be better if the MWI checking was done at different times for different mailboxes. i.e. exactly 'x' seconds after that user registered?  Sorry, if I am misinterpreting things and this is already what happens?

Since I disabled MWI checking I haven't had a single crash, so I can't really add anything to my initial feedback.  Would it be possible to do MWI checking external to Asterisk (i.e. have a separate process that checks mailboxes and then sends a command to Asterisk to send out a MWI event?  Better still if we could control the sending of MWI events from the dialplan then whenever somebody exits the voicemail app then it would be nice to trigger an MWI event then, so that MWI events are only sent out when somebody has left a voicemail?

By: Mark Michelson (mmichelson) 2007-10-02 18:34:42

rayjay

When I was exploring my options, I found several ways that I could only transmit MWI when someone left a new message. That was easy enough. The tricky part with IMAP is that people generally use an e-mail client to check their messages as opposed to using the phone. Because of that, Asterisk has no means of knowing that the state of messages has changed except by searching the mailbox and checking the flags of the messages. It's this precise behavior which is causing the slowdown and potentially the crashes with MWI enabled.

By: Olivier Krief (okrief) 2007-10-03 22:48:27

I don't know much about IMAP, but the IMAP server must have the capability to detect changes in mailboxes at it centralizes in real time, demands from IMAP clients.
So asking IMAP storage server to somehow trigger MWI, seems to me a good option to optimize things. Asterisk would then have to wait for such notifications from IMAP server and trigger MWI.

Should we dig deeper into this ?

By: Mark Michelson (mmichelson) 2007-10-10 11:31:16

I understand where you're coming from. The problem is that Asterisk's implementation of its IMAP storage is somewhat non-traditional.

Typically, a client will connect to a mailbox at startup and ping the mailbox every so often to detect changes in the mailbox. If Asterisk were implemented this way, then the task of detecting changes in a mailbox would be as easy as pinging the mailbox and interpreting callbacks.

Asterisk, however, opens connections to mailboxes as they are needed. So as a result, it has no point of reference on the server to detect changes from. Because of this, the method of searching the mailboxes is in place to determine the number of old and new messages.

When trying to find a workaround, I found several ways to use to detect when a new message had arrived. I could NOT, however, find any way to detect when a user had checked their messages via an e-mail client. With regards to MWI, this means that the phone would never be able to tell that there were no new messages waiting.

As far as an explanation as to why Asterisk is implemented this way: I simply don't know. The original author of the IMAP code may have had some valid reasons, but I'm not sure what those were. I am considering rewriting the IMAP-specific voicemail code at some point, but even when that is finished, a change that invasive would not be merged into 1.4, but would be in trunk.

By: Olivier Krief (okrief) 2007-10-10 12:10:42

Thanks for taking time to reply : your explanation is crystal clear.

Then, what if we tried to get input from original IMAP code developer ? Using asterisk-dev mailing could help.

I agree that new code shouldn't come into 1.4 but it seems that today, IMAP 1.4 code is not usable at all (we step back from IMAP to non-unified messaging as we always reached deadlocks and issues).

Having a way to poll users and check if anyone is really using IMAP today would help to include new code in 1.4 (along trunk).

Does anyone agree-disagree with this (this = IMAP not usable today) ?

By: Olivier Krief (okrief) 2007-10-22 13:04:52

A quick poll using Asterisk user list shows that none of seven users could make IMAP storage work with Asterisk 1.4.
If this number is relevant, then it would imply that a code rewrite that breaks today's implementation could be qualified as bug fix not as feature improvement, as no regression is possible (as nobody uses IMAP successfully).

By: Olivier Krief (okrief) 2007-10-30 12:26:27

Is anybody watching ?

By: Louis (louis33) 2007-11-26 00:28:19.000-0600

I just spent the day upgrading to 1.4 in order to take advantage of the shiny new imap feature, but I ran into similar issues as the other posters.

A number of "IMAP Warning: Unexpected tagged response..." log entries for and followed by asterisk hangs.

Imap is unusable in its current state.

By: Louis (louis33) 2007-11-26 00:29:36.000-0600

p.s. I was trying to connect to an Imap mailbox on Fastmail.fm which uses Courier IMAP, if I'm not mistaken.

By: James Rothenberger (jaroth) 2007-12-06 14:24:55.000-0600

I have had a similar experience with one of the problems addressed in this report, specifically the following crash:

#3  0x006e1f4d in fatal (
   string=0x774b3a0 "Lock when already locked, mbx={XXX}INBOX")
   at ftl_unix.c:38
#4  0x006f0893 in mail_lock (stream=0x902b928) at mail.c:3165
ASTERISK-1  0x0071396b in imap_send (stream=0x902b928, cmd=0x7a4630 "SELECT", args=0x774bd50) at imap4r1.c:2902
ASTERISK-2  0x0071a2e1 in imap_open (stream=0x902b928) at imap4r1.c:970
ASTERISK-3  0x006f39fc in mail_open_work (d=0x7b43a0, stream=0x902b928,
   name=0xb521470 "{XXXC}INBOX", options=Variable "options" is not available.
) at mail.c:1308
ASTERISK-4  0x006f8f71 in mail_open (stream=0x902b928,
   name=0x774d900 "{XXXX}INBOX", options=Variable "options" is not available.
) at mail.c:1230
ASTERISK-5  0x006d141f in init_mailstream (vms=0x90263f0, box=0) at app_voicemail.c:5293
ASTERISK-6 0x006c7850 in inboxcount (mailbox=0x774eaf0 "67165@default", newmsgs=0x774ebfc, oldmsgs=0x774ebf8)

This seems to have happened when the mailbox was being accessed (leaving a message) and at the same time the VMCOUNT method was called on the same mailbox from the dialplan.  I can't seem to reproduce this problem, but it seems from the logs that both the VMCOUNT method and voicemail were both trying to open a new mailstream to the same mailbox around the same time (in mail_open).  It might be possible to use the portion of this patch that locks the mail_open code for exclusive access without the need to lock all of the mail access methods.  This might solve the problems with slow access to other mailbox functions.

By: jmls (jmls) 2008-02-17 12:37:45.000-0600

can you try with the latest 1.4 svn ? There have been several releases and patches since this was raised.

Thanks.

By: Mark Michelson (mmichelson) 2008-02-18 15:05:42.000-0600

Actually the one thing that's been added to Asterisk since the creation of this issue that would be most likely to help is the addition of configurable timeouts on IMAP connections. This is a trunk feature that was added in revision 96934. I know of a couple of people who used to have hangs on their machines when using IMAP who now no longer have it happen since implementing the timeouts.

By: Digium Subversion (svnbot) 2008-03-26 14:18:16

Repository: asterisk
Revision: 111049

U   branches/1.4/apps/app_voicemail.c

------------------------------------------------------------------------
r111049 | mmichelson | 2008-03-26 14:18:12 -0500 (Wed, 26 Mar 2008) | 9 lines

Add a lock to the vm_state structure and use the lock around mail_open calls
to prevent concurrent access of the same mailstream. This, along with trunk's
ability to configure TCP timeouts for IMAP storage will help to prevent
crashes and hangs when using voicemail with IMAP storage.

(closes issue ASTERISK-10111)
Reported by: ewilhelmsen


------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=111049

By: Digium Subversion (svnbot) 2008-03-26 14:22:39

Repository: asterisk
Revision: 111067

_U  trunk/
U   trunk/apps/app_voicemail.c

------------------------------------------------------------------------
r111067 | mmichelson | 2008-03-26 14:22:36 -0500 (Wed, 26 Mar 2008) | 17 lines

Merged revisions 111049 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r111049 | mmichelson | 2008-03-26 14:22:16 -0500 (Wed, 26 Mar 2008) | 9 lines

Add a lock to the vm_state structure and use the lock around mail_open calls
to prevent concurrent access of the same mailstream. This, along with trunk's
ability to configure TCP timeouts for IMAP storage will help to prevent
crashes and hangs when using voicemail with IMAP storage.

(closes issue ASTERISK-10111)
Reported by: ewilhelmsen


........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=111067

By: Digium Subversion (svnbot) 2008-03-26 14:23:28

Repository: asterisk
Revision: 111072

_U  branches/1.6.0/
U   branches/1.6.0/apps/app_voicemail.c

------------------------------------------------------------------------
r111072 | mmichelson | 2008-03-26 14:23:26 -0500 (Wed, 26 Mar 2008) | 25 lines

Merged revisions 111067 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
r111067 | mmichelson | 2008-03-26 14:26:23 -0500 (Wed, 26 Mar 2008) | 17 lines

Merged revisions 111049 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r111049 | mmichelson | 2008-03-26 14:22:16 -0500 (Wed, 26 Mar 2008) | 9 lines

Add a lock to the vm_state structure and use the lock around mail_open calls
to prevent concurrent access of the same mailstream. This, along with trunk's
ability to configure TCP timeouts for IMAP storage will help to prevent
crashes and hangs when using voicemail with IMAP storage.

(closes issue ASTERISK-10111)
Reported by: ewilhelmsen


........

................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=111072

By: Digium Subversion (svnbot) 2008-03-27 12:13:05

Repository: asterisk
Revision: 111440

_U  team/murf/bug11210/
U   team/murf/bug11210/UPGRADE.txt
U   team/murf/bug11210/apps/app_playback.c
U   team/murf/bug11210/apps/app_voicemail.c
U   team/murf/bug11210/channels/chan_sip.c
U   team/murf/bug11210/channels/misdn_config.c
U   team/murf/bug11210/contrib/scripts/autosupport
U   team/murf/bug11210/include/asterisk/ael_structs.h
U   team/murf/bug11210/include/asterisk/linkedlists.h
U   team/murf/bug11210/include/asterisk/pval.h
U   team/murf/bug11210/main/config.c
U   team/murf/bug11210/main/frame.c
U   team/murf/bug11210/main/http.c
U   team/murf/bug11210/main/pbx.c
U   team/murf/bug11210/pbx/pbx_ael.c
U   team/murf/bug11210/res/ael/pval.c

------------------------------------------------------------------------
r111440 | murf | 2008-03-27 12:13:02 -0500 (Thu, 27 Mar 2008) | 182 lines

Merged revisions 111067,111083,111123,111127,111130,111132,111185,111213,111246,111285,111295,111360,111410 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
r111067 | mmichelson | 2008-03-26 13:26:23 -0600 (Wed, 26 Mar 2008) | 17 lines

Merged revisions 111049 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r111049 | mmichelson | 2008-03-26 14:22:16 -0500 (Wed, 26 Mar 2008) | 9 lines

Add a lock to the vm_state structure and use the lock around mail_open calls
to prevent concurrent access of the same mailstream. This, along with trunk's
ability to configure TCP timeouts for IMAP storage will help to prevent
crashes and hangs when using voicemail with IMAP storage.

(closes issue ASTERISK-10111)
Reported by: ewilhelmsen


........

................
r111083 | file | 2008-03-26 13:29:26 -0600 (Wed, 26 Mar 2008) | 4 lines

Add expiry value to the sip show subscriptions CLI command.
(closes issue ASTERISK-11470)
Reported by: agx

................
r111123 | mmichelson | 2008-03-26 13:39:23 -0600 (Wed, 26 Mar 2008) | 12 lines

Merged revisions 111121 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r111121 | mmichelson | 2008-03-26 14:37:36 -0500 (Wed, 26 Mar 2008) | 4 lines

This code change is made just for clarification. It does exactly
the same thing as before. It just doesn't look as wrong.


........

................
r111127 | kpfleming | 2008-03-26 13:52:27 -0600 (Wed, 26 Mar 2008) | 18 lines

Merged revisions 111126 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

................
r111126 | kpfleming | 2008-03-26 14:51:24 -0500 (Wed, 26 Mar 2008) | 10 lines

Merged revisions 111125 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.2

........
r111125 | kpfleming | 2008-03-26 14:49:30 -0500 (Wed, 26 Mar 2008) | 2 lines

update UPGRADE notes to document usage of the script

........

................

................
r111130 | file | 2008-03-26 13:56:40 -0600 (Wed, 26 Mar 2008) | 14 lines

Merged revisions 111129 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r111129 | file | 2008-03-26 16:55:08 -0300 (Wed, 26 Mar 2008) | 6 lines

Update autosupport script.
(closes issue ASTERISK-10279)
Reported by: angler
Patches:
     autosupport.diff uploaded by angler (license 106)

........

................
r111132 | tilghman | 2008-03-26 13:58:09 -0600 (Wed, 26 Mar 2008) | 2 lines

Simplify new macro, simplify configfile logic, now that list is sorted

................
r111185 | tilghman | 2008-03-26 14:34:05 -0600 (Wed, 26 Mar 2008) | 2 lines

Oops, missed one

................
r111213 | twilson | 2008-03-26 15:23:29 -0600 (Wed, 26 Mar 2008) | 2 lines

Stupid strcasecmp function :-)

................
r111246 | qwell | 2008-03-26 17:27:33 -0600 (Wed, 26 Mar 2008) | 17 lines

Merged revisions 111245 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r111245 | qwell | 2008-03-26 18:26:33 -0500 (Wed, 26 Mar 2008) | 9 lines

Remove excessive smoother optimization that was causing audio glitches (small "pops")
after (about 200ms later) an "incorrectly" sized frame was received.

While it would be very nice to keep this as optimized as possible, it makes no sense
for the smoother to be dropping random bits of audio like this.  Isn't that the
whole point of a smoother?

Closes issue ASTERISK-11533.

........

................
r111285 | qwell | 2008-03-26 18:25:56 -0600 (Wed, 26 Mar 2008) | 9 lines

Merged revisions 111280 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r111280 | qwell | 2008-03-26 19:25:13 -0500 (Wed, 26 Mar 2008) | 1 line

Put this flag back so we don't change the API.
........

................
r111295 | qwell | 2008-03-26 18:27:35 -0600 (Wed, 26 Mar 2008) | 1 line

But we can change the API here.
................
r111360 | murf | 2008-03-26 22:47:12 -0600 (Wed, 26 Mar 2008) | 23 lines

Merged revisions 111341 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r111341 | murf | 2008-03-26 21:21:05 -0600 (Wed, 26 Mar 2008) | 15 lines


(closes issue ASTERISK-11725)
Reported by: pj
Tested by: murf

These changes will set a channel variable ~~EXTEN~~ just before generating code
for a switch, with the value of ${EXTEN}. The exten is marked as having a switch,
and ever after that, till the end of the exten, we substitute any ${EXTEN}
with ${~~EXTEN~~} instead in application arguments; (and the ${EXTEN: also).
The reason for this, is that because switches are coded using
separate extensions to provide pattern matching, and
jumping to/from these switch extensions messes up the ${EXTEN} value,
which blows the minds of users.


........

................
r111410 | murf | 2008-03-27 07:29:41 -0600 (Thu, 27 Mar 2008) | 17 lines

Merged revisions 111391 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r111391 | murf | 2008-03-27 07:03:28 -0600 (Thu, 27 Mar 2008) | 9 lines

These small documentation updates made in response to a query in
asterisk-users, where a user was using Playback, but needed the
features of Background, and had no idea that Background existed,
or that it might provide the features he needed. I thought the
best way to avert these kinds of queries was to provide "See Also"
references in all three of "Background", "Playback", "WaitExten".
Perhaps a project to do this with all related apps is in order.


........

................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=111440