[Home]

Summary:ASTERISK-15162: [patch] message limit (maxmsg) can be exceeded in 1.6.x creating orphan voicemail
Reporter:sohosys (sohosys)Labels:
Date Opened:2009-11-18 08:27:01.000-0600Date Closed:2010-02-24 08:33:17.000-0600
Priority:MajorRegression?No
Status:Closed/CompleteComponents:Applications/app_voicemail
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 20100107__issue16271.diff.txt
( 1) 20100108__issue16271__1.6.0.diff.txt
( 2) 20100108__issue16271__trunk.diff.txt
( 3) 20100108__issue16271.diff.txt
( 4) 20100115__issue16271__1.6.2.x.diff.txt
( 5) patch_16271.txt
Description:If two are more users are in the process of leaving a voicemail message, and the count of messages in the mailbox was maxmsg-1 when they started leaving the message, the application will record all new messages.

Messages will be numbered in the order they were completed, and all messages over the limit will be inaccessible to the end user. We have tested up to 5 callers in voicemail resulting in maxmsg+4 messages and 4 inaccessible messages. I assume there is no limit based on the code.

We run some high volume asterisk servers where this has become common. In the 1.6 series the message is not reported and not accessible by the end user, although the caller that left the last messages(s) had no indication of the full mailbox.

I guess this could be classified as minor or major depending on the content of the messages in the black hole and the job title of the mailbox owner.

In the 1.2.x series a crash is the result when the end user tries to access the last messages in the mailbox, and we fixed that by simply adding 10 to our desired maxmsg count in voicemail.conf and changing the code to test for msgcount <= maxmsg - 10 before accepting a new message, but the 1.6 code is more complex, and our 1.2 solution is too primitive for a current version, although transparently buffering the variables might be a good simple solution.

****** ADDITIONAL INFORMATION ******

create a mailbox, set maxmsg = 10 in vociemail.conf, record 9 messages int the mailbox, have 1 caller call the mailbox and start leaving a message, have 1 or more additional callers call the mailbox and start leaving messages before the first caller hangs up. try to listen to the last messages left.

I believe that can also be duplicated while transferrring a message to a mailbox while a caller is leaving a message.
Comments:By: Leif Madsen (lmadsen) 2009-11-18 09:25:48.000-0600

Can you confirm you've tested this on the latest release candidates? This bug sounds familiar to something else that was closed a few weeks ago. If so, then I'll mark this as Acknowledged. Thanks!

By: sohosys (sohosys) 2009-11-18 11:42:02.000-0600

i have not tried a release candidate, because searching the issue tracker did not reveal a previous resolved report of this issue. if you can point me to a related issue or code changes that have been made to app_voicemail.c i will be happy to test.

By: Leif Madsen (lmadsen) 2009-11-18 13:34:04.000-0600

Could you provide some console output and sip debugging for the issue in order to move this forward? After that I will acknowledge the issue. Thanks!

By: sohosys (sohosys) 2009-11-18 13:46:50.000-0600

we will set the test back up and capture console/debug output. prefer to not post console or debug in a public bug report as it often contains information useful to hackers. would you like it emailed or would you prefer to mark this issue private?

By: Leif Madsen (lmadsen) 2009-11-18 13:55:42.000-0600

I can mark the issue private as I won't be the one reviewing likely.

By: sohosys (sohosys) 2009-11-19 08:15:30.000-0600

the only unusual output on the console and ebug log is this;

[Nov 18 16:37:27] WARNING[11723] app_voicemail.c: No message attribute file?!! (/mnt/repl/var/spool/asterisk/voicemail/default/5555551234/INBOX/msg0000.txt)

this happens after you delete all messages that you can play (under the limit) and the only message left is numbered above the limit. at that time the msgcount in the mailbox is > 0 but there is no msg0000.

this is due to the renumbering routine being limited to messages within the maxmsg limit.

i belive the best solution would be to start the renumbering routine at maxmsg+10



By: sohosys (sohosys) 2009-11-19 10:32:07.000-0600

it looks to me like the issue is with the function resequence_mailbox, which assumes there are no more messages than maxmsg=, but should really loop until the count of messages in the directory has been exceeded. this will also address https://issues.asterisk.org/view.php?id=15117 which we have confirmed to be a real issue, if you lower the maxmsg on a productuon system you either get a crash (pre 1.6.0.12 versions), or orphan messages as described in this issue (current verions).

By: sohosys (sohosys) 2009-11-19 10:33:03.000-0600

you can make this public again, since we did not post anything sensitive.

By: Matthias Nick (mnick) 2009-11-23 15:46:18.000-0600

added a patch for trunk.
Can you please test it whether it works properly

By: Leif Madsen (lmadsen) 2009-12-02 14:22:12.000-0600

Pinging reporter for testing feedback! :)

By: sohosys (sohosys) 2009-12-02 16:47:08.000-0600

will try to test in the next few days.

By: Leif Madsen (lmadsen) 2009-12-22 11:16:53.000-0600

Pinging, it's been a few days :)

By: sohosys (sohosys) 2009-12-22 11:19:33.000-0600

yes, it was tested and does work. thank you. sorry for the late response.

By: Leif Madsen (lmadsen) 2009-12-22 13:45:30.000-0600

Marking this as Ready for Review as the reporter has stated it resolves the issue for him. Thanks for reporting back!

By: Jason Parker (jparker) 2010-01-04 12:03:52.000-0600

I agree that this is a bug.  However, I don't agree with the proposed fix.

I think the correct way to fix this, would be to check maxmsg when a recording is finished, and *discard it* if the maximum has been reached.  I am of the opinion that maxmsg should be an absolute hard limit.  If a message were discarded due to this, what would need to happen?  What happens for a message that legitimately exceeds maxmsg (ie; msg count is already at/above maxmsg when a user calls)?

I recognize that this answer might not be popular, but I think it's the only way to go here.  Thoughts?

By: sohosys (sohosys) 2010-01-04 12:22:13.000-0600

I strongly disagree. This fix will rarely allow more than the configured max, as the conditions to duplicate the problem are rare. No commercial voicemail system will ever allow a caller to record a message and then "discard it". The “full mailbox” notification is ALWAYS before the caller is allowed to record anything, discarding or trying to notify the user after they hang up (which callers do when they are done speaking, without waiting for additional confirmation) will not work, it will lead caller to believe they have left a message that was not actually saved, which will result in miscommunication and potential bad will. Our Asterisk clusters are configured with 2000 voicemail boxes per cluster, and we have only ever had a very small number of instances of max messages +1, and never an instance of max messages +2. Having a hard limit that can be exceeded only by the number of callers actively leaving a message when the full condition is reached seems like the only viable option. In other words, once the system invites the caller to record a message it needs to be recorded for delivery to the intended recipient, without exception.

By: Leif Madsen (lmadsen) 2010-01-04 14:36:46.000-0600

I have to agree with sohosys here. Callers should be alerted prior to recording a message, and not be left with the impression a voicemail was left, but was then discarded.

I'd prefer there to be some elasticity with leaving the voicemails in these rare circumstances that you can only leave maxmsg + current_active_recordings in the cases where the recording has been started prior to current_msg being evaluated to maxmsg >= current_msg counts.

If the patch does not already do so, it should document this elastic situation, or perhaps have a new option added for trunk:

;maxmsg=50,hard    ; hard mode will cause messages to be discarded after
                  ; being recorded in the situation where multiple active
                  ; recordings would cause the max message limit to be
                  ; exceeded. Note that the user will not be prompted if
                  ; their message is to be discarded.

;maxmsg=50,elastic ; elastic mode allows the situation where multiple
                  ; messages are being left, causing any of the active
                  ; recordings to exceed the max message limit, that those
                  ; recordings will be allowed to be saved beyond the maxmsg
                  ; limit. The elastic setting is enabled by default.

By: Leif Madsen (lmadsen) 2010-01-04 14:46:33.000-0600

From IRC:

<Qwell> blitzrage: ping!  re: 16271
<blitzrage> Qwell: pong!
<blitzrage> that was quick :)
<Qwell> What if any time a caller starts recording a message, it updates a counter to be added to maxmsg?
<Qwell> maxmsg = 10, there are 9 messages written to disk, and 1 caller currently recording a message.  another caller hits VM.  They would be rejected as though maxmsg was hit
<blitzrage> yes, that would make sense to me -- I'm more of the impression active recordings should not be discarded silently.
<Qwell> I never said silently rejected :D
<Qwell> but, consider the following case...
<blitzrage> you said discarded
<blitzrage> meaning, to me, that the msg was recorded, and then discarded
<blitzrage> (presumably at hang up)
<Qwell> maxmsg = 10, 9 messages on disk.  400 callers hit voicemail
<Qwell> what do you do?
<blitzrage> hence my new option
<Qwell> but then after N of those 400, they would be silently dropped
<blitzrage> give the operator of the system the choice to disgard or not
<blitzrage> huh?
<Qwell> if you allow N above maxmsg, you still run into the same problem if you have > N calls
<blitzrage> I'm saying only allowing N above maxmsg until there are no more active voicemails being recorded
<blitzrage> if I start recording a voicemail, it better damn well get delivered.
<blitzrage> as soon as you allow the caller to the recording voicemail part, it needs to be allowed and saved. You can do evaluations prior to recording and incrementing counters (reserving spots) and then reject prior to the recording, but not after
<Qwell> fair
<Qwell> so what about reserving spots?  can we safely assert that most people that hit VM (and stay there for any period of time) would be recording a message?
<blitzrage> lets say I have 8 msgs, and 10 is max, and 3 callers hit the voicemail. Just prior to hitting the recording part, the counter is updated and evaluated, so that the first 2 callers to either hit # (to skip listening to the msg) or to finish listening to the msg get to record and save their voicemail, and the 3rd is then provided with the "too many messages" prompt
<Qwell> correct
<blitzrage> I don't know how voicemail works to determine if the counter can be increased there, or if it needs to be increased as soon as Voicemail() is invoked
<Qwell> that's an implementation detail :D
<blitzrage> I'd prefer it to be increased as close to starting of the recording as possible
<blitzrage> righ
<blitzrage> but I think we agree on the approach
<Qwell> but sure, I agree
<blitzrage> ok

By: Leif Madsen (lmadsen) 2010-01-04 14:47:58.000-0600

And after some further discussion, Qwell and I are in agreement that what is stated above is logical and expected behaviour, and there would be no need to complicate matters by adding an additional feature to allow or disallow the reservation scheme.

By: sohosys (sohosys) 2010-01-08 09:32:57.000-0600

i do not believe the most recent patch eliminates the bug, but rather makes it more unlikely that it will surface. In a very heavily loaded system with a shared voicemail box it is still very possible to duplicate the bug by exceeding the maxmsg count by 11 messages.

why not derive the loop counter from the actual count of message files in the file structure rather than using a fudge factor of 10?

By: Tilghman Lesher (tilghman) 2010-01-08 12:58:56.000-0600

sohosys:  because the count can change immediately after you calculate that number.

lmadsen:  the reason we cannot add a counter as you and Qwell discussed is that the central structure for users is duplicated all over the place, in order to prevent race conditions.  Therefore, it would be necessary to either convert the central structure to astobj2 or to implement a separate container for the count.

By: Tilghman Lesher (tilghman) 2010-01-12 14:50:04.000-0600

I have uploaded a separate patch that tracks the count with a separate container, as specified above.  It needs testing.

By: Jered Sutton (jsutton) 2010-01-15 13:30:04.000-0600

I am working on this issue with sohosys. The original patch failed to apply to 1.6.0.13, 1.6.0.20 the output from the patch command is provided below.

Failed 1.6.0.20
patching file apps/app_voicemail.c
Hunk #1 FAILED at 102.
Hunk #2 succeeded at 438 with fuzz 2 (offset 29 lines).
Hunk #3 FAILED at 3891.
Hunk #4 FAILED at 4282.
Hunk ASTERISK-1 FAILED at 4292.
Hunk ASTERISK-2 succeeded at 4646 with fuzz 1 (offset 356 lines).
Hunk ASTERISK-3 FAILED at 4695.
Hunk ASTERISK-4 FAILED at 4704.
Hunk ASTERISK-5 succeeded at 4415 with fuzz 2 (offset 42 lines).
Hunk ASTERISK-6 succeeded at 4777 (offset 371 lines).
Hunk ASTERISK-7 FAILED at 4810.
Hunk ASTERISK-8 FAILED at 7932.
Hunk ASTERISK-9 FAILED at 9340.
Hunk ASTERISK-10 FAILED at 9365.
10 out of 14 hunks FAILED -- saving rejects to file apps/app_voicemail.c.rej

Failed 1.6.0.13
patching file apps/app_voicemail.c
Hunk #1 FAILED at 102.
Hunk #2 succeeded at 433 with fuzz 2 (offset 24 lines).
Hunk #3 FAILED at 3886.
Hunk #4 FAILED at 4277.
Hunk ASTERISK-1 FAILED at 4287.
Hunk ASTERISK-2 FAILED at 4314.
Hunk ASTERISK-3 FAILED at 4363.
Hunk ASTERISK-4 FAILED at 4372.
Hunk ASTERISK-5 FAILED at 4397.
Hunk ASTERISK-6 succeeded at 4369 with fuzz 2 (offset -37 lines).
Hunk ASTERISK-7 FAILED at 4402.
Hunk ASTERISK-8 FAILED at 7524.
Hunk ASTERISK-9 FAILED at 8932.
Hunk ASTERISK-10 FAILED at 8957.
12 out of 14 hunks FAILED -- saving rejects to file apps/app_voicemail.c.rej

I decided to try the patch against 1.6.2 after determining that the issue is still present. note: the crash issue is present in 1.2.x and 1.6.2.x, but 1.6.0 seems to handle the bug more gracefully. I was able to adapt most of the patch to 1.6.2, with the following exceptions listed below. After testing the modified patch on 1.6.2 we were able to determine that the issue no longer exists. The patch did however induce a secondary issue in which the max messages is effectively maxmsg-1. This may be due to our adaptation of the patch to the 1.6.2 code.

Not compatible
@@ -4202,7 +4253,7 @@
}
if (option_debug > 2)
ast_log(LOG_DEBUG, "Checking message number quota - mailbox has %d messages, maximum is set to %d\n",msgnum,vmu->maxmsg);
- if (msgnum >= vmu->maxmsg) {
+ if (msgnum >= vmu->maxmsg - inprocess_count(vmu->mailbox, vmu->context, 0)) {
res = ast_streamfile(chan, "vm-mailboxfull", chan->language);
if (!res)
res = ast_waitstream(chan, "");


@@ -7501,9 +7561,11 @@
#endif
if (!(vms.deleted = ast_calloc(vmu->maxmsg, sizeof(int)))) {
/* TODO: Handle memory allocation failure */
+ goto out;
}
if (!(vms.heard = ast_calloc(vmu->maxmsg, sizeof(int)))) {
/* TODO: Handle memory allocation failure */
+ goto out;
}

/* Set language from config to override channel language */

Not sure about

@@ -8931,6 +8994,10 @@
return AST_MODULE_LOAD_DECLINE;
}

+ if (!(inprocess_container = ao2_container_alloc(573, inprocess_hash_fn, inprocess_cmp_fn))) {
+ return AST_MODULE_LOAD_DECLINE;
+ }
+
my_umask = umask(0);
umask(my_umask);
res = ast_register_application(app, vm_exec, synopsis_vm, descrip_vm);

I have provided the adapted patch for 1.6.2.x in case anyone finds it useful.

By: Tilghman Lesher (tilghman) 2010-01-15 13:54:07.000-0600

Ah, it appears my patch was against 1.4, which the earliest version in which this issue can occur.  When bugs are reported, we generally look to see which version can be affected and move issues forward from there, although it does appear that the 1.6 versions are significantly different than 1.4.

By: Tilghman Lesher (tilghman) 2010-01-15 14:41:07.000-0600

Also uploaded separate patches for trunk and 1.6.0.

By: Digium Subversion (svnbot) 2010-01-15 14:52:29.000-0600

Repository: asterisk
Revision: 240414

U   branches/1.4/apps/app_voicemail.c

------------------------------------------------------------------------
r240414 | tilghman | 2010-01-15 14:52:28 -0600 (Fri, 15 Jan 2010) | 15 lines

Disallow leaving more than maxmsg voicemails.
This is a possibility because our previous method assumed that no messages are
left in parallel, which is not a safe assumption.  Due to the vmu structure
duplication, it was necessary to track in-process messages via a separate
structure.  If at some point, we switch vmu to an ao2-reference-counted
structure, which would eliminate the prior noted duplication of structures,
then we could incorporate this new in-process structure directly into vmu.
(closes issue ASTERISK-15162)
Reported by: sohosys
Patches:
      20100108__issue16271.diff.txt uploaded by tilghman (license 14)
      20100108__issue16271__trunk.diff.txt uploaded by tilghman (license 14)
      20100108__issue16271__1.6.0.diff.txt uploaded by tilghman (license 14)
Tested by: jsutton

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=240414

By: Digium Subversion (svnbot) 2010-01-15 14:54:25.000-0600

Repository: asterisk
Revision: 240415

_U  trunk/
U   trunk/apps/app_voicemail.c

------------------------------------------------------------------------
r240415 | tilghman | 2010-01-15 14:54:24 -0600 (Fri, 15 Jan 2010) | 22 lines

Merged revisions 240414 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
 r240414 | tilghman | 2010-01-15 14:52:27 -0600 (Fri, 15 Jan 2010) | 15 lines
 
 Disallow leaving more than maxmsg voicemails.
 This is a possibility because our previous method assumed that no messages are
 left in parallel, which is not a safe assumption.  Due to the vmu structure
 duplication, it was necessary to track in-process messages via a separate
 structure.  If at some point, we switch vmu to an ao2-reference-counted
 structure, which would eliminate the prior noted duplication of structures,
 then we could incorporate this new in-process structure directly into vmu.
 (closes issue ASTERISK-15162)
  Reported by: sohosys
  Patches:
        20100108__issue16271.diff.txt uploaded by tilghman (license 14)
        20100108__issue16271__trunk.diff.txt uploaded by tilghman (license 14)
        20100108__issue16271__1.6.0.diff.txt uploaded by tilghman (license 14)
  Tested by: jsutton
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=240415

By: Digium Subversion (svnbot) 2010-01-15 14:56:34.000-0600

Repository: asterisk
Revision: 240416

_U  branches/1.6.0/
U   branches/1.6.0/apps/app_voicemail.c

------------------------------------------------------------------------
r240416 | tilghman | 2010-01-15 14:56:33 -0600 (Fri, 15 Jan 2010) | 29 lines

Merged revisions 240415 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r240415 | tilghman | 2010-01-15 14:54:24 -0600 (Fri, 15 Jan 2010) | 22 lines
 
 Merged revisions 240414 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r240414 | tilghman | 2010-01-15 14:52:27 -0600 (Fri, 15 Jan 2010) | 15 lines
   
   Disallow leaving more than maxmsg voicemails.
   This is a possibility because our previous method assumed that no messages are
   left in parallel, which is not a safe assumption.  Due to the vmu structure
   duplication, it was necessary to track in-process messages via a separate
   structure.  If at some point, we switch vmu to an ao2-reference-counted
   structure, which would eliminate the prior noted duplication of structures,
   then we could incorporate this new in-process structure directly into vmu.
   (closes issue ASTERISK-15162)
    Reported by: sohosys
    Patches:
          20100108__issue16271.diff.txt uploaded by tilghman (license 14)
          20100108__issue16271__trunk.diff.txt uploaded by tilghman (license 14)
          20100108__issue16271__1.6.0.diff.txt uploaded by tilghman (license 14)
    Tested by: jsutton
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=240416

By: Digium Subversion (svnbot) 2010-01-15 14:57:07.000-0600

Repository: asterisk
Revision: 240417

_U  branches/1.6.1/
U   branches/1.6.1/apps/app_voicemail.c

------------------------------------------------------------------------
r240417 | tilghman | 2010-01-15 14:57:07 -0600 (Fri, 15 Jan 2010) | 29 lines

Merged revisions 240415 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r240415 | tilghman | 2010-01-15 14:54:24 -0600 (Fri, 15 Jan 2010) | 22 lines
 
 Merged revisions 240414 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r240414 | tilghman | 2010-01-15 14:52:27 -0600 (Fri, 15 Jan 2010) | 15 lines
   
   Disallow leaving more than maxmsg voicemails.
   This is a possibility because our previous method assumed that no messages are
   left in parallel, which is not a safe assumption.  Due to the vmu structure
   duplication, it was necessary to track in-process messages via a separate
   structure.  If at some point, we switch vmu to an ao2-reference-counted
   structure, which would eliminate the prior noted duplication of structures,
   then we could incorporate this new in-process structure directly into vmu.
   (closes issue ASTERISK-15162)
    Reported by: sohosys
    Patches:
          20100108__issue16271.diff.txt uploaded by tilghman (license 14)
          20100108__issue16271__trunk.diff.txt uploaded by tilghman (license 14)
          20100108__issue16271__1.6.0.diff.txt uploaded by tilghman (license 14)
    Tested by: jsutton
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=240417

By: Digium Subversion (svnbot) 2010-01-15 14:57:31.000-0600

Repository: asterisk
Revision: 240418

_U  branches/1.6.2/
U   branches/1.6.2/apps/app_voicemail.c

------------------------------------------------------------------------
r240418 | tilghman | 2010-01-15 14:57:31 -0600 (Fri, 15 Jan 2010) | 29 lines

Merged revisions 240415 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r240415 | tilghman | 2010-01-15 14:54:24 -0600 (Fri, 15 Jan 2010) | 22 lines
 
 Merged revisions 240414 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r240414 | tilghman | 2010-01-15 14:52:27 -0600 (Fri, 15 Jan 2010) | 15 lines
   
   Disallow leaving more than maxmsg voicemails.
   This is a possibility because our previous method assumed that no messages are
   left in parallel, which is not a safe assumption.  Due to the vmu structure
   duplication, it was necessary to track in-process messages via a separate
   structure.  If at some point, we switch vmu to an ao2-reference-counted
   structure, which would eliminate the prior noted duplication of structures,
   then we could incorporate this new in-process structure directly into vmu.
   (closes issue ASTERISK-15162)
    Reported by: sohosys
    Patches:
          20100108__issue16271.diff.txt uploaded by tilghman (license 14)
          20100108__issue16271__trunk.diff.txt uploaded by tilghman (license 14)
          20100108__issue16271__1.6.0.diff.txt uploaded by tilghman (license 14)
    Tested by: jsutton
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=240418

By: Olle Johansson (oej) 2010-02-24 08:33:17.000-0600

This patch causes a segmentation fault on FreeBSD. I'll try to get a crash backtrace.