[Home]

Summary:ASTERISK-01181: [post-1.0] [patch] Modify mailbox path on filesystem to allow for MANY boxes per context
Reporter:Rob Gagnon (rgagnon)Labels:
Date Opened:2004-03-09 14:08:28.000-0600Date Closed:2011-06-07 14:10:43
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Applications/app_queue
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) asterisk.conf.sample
( 1) patch20041116.txt
( 2) asterisk.conf.sample
( 3) patch20050516.txt
Description:If you manage to get many many mailboxes per context, the filesystem will eventually slow down due to the number of files in a single directory.

This patch solves this future problem.

****** ADDITIONAL INFORMATION ******

Code changes filesystem path pattern from:

Example:
mailbox "mailbox" inside of context "context" would change from:
/var/spool/asterisk/voicemail/context/mailbox/....
to:
/var/spool/asterisk/voicemail/context/m/a/mailbox/...

by using the first 2 characters of the mailbox name as more directories under the context.
Comments:By: Rob Gagnon (rgagnon) 2004-03-09 14:12:20.000-0600

If you have the special case, where the mailbox is only 1 character long, that is handled by using an underscore as the first directory.

Examples:

Mailbox:  "A"
Context:  "demo"
Path:     /var/spool/asterisk/voicemail/demo/_/A/A/....

Mailbox:  "RG"
Context:  "hello"
Path:     /var/spool/asterisk/voicemail/hello/R/G/RG/....

Mailbox:  "1234567890"
Context:  "longext"
Path:     /var/spool/asterisk/voicemail/longext/1/2/1234567890/....

By: James Golovich (jamesgolovich) 2004-03-09 14:18:35.000-0600

I think this is a good start.  Seems like a good option to have, but why not have a configurable number of characters to hash on? as well as the option of a choice of distribution method.  A better method for this hashing is going to be based on the reverse of the name.  So 12345 would be 5/4/12345  This provides a better distribution because names (and numbers) would tend to be grouped in the same dirs anyways.

This also doesn't take care of the mwi aspect, which is mostly handled in app.c

By: Rob Gagnon (rgagnon) 2004-03-09 14:46:02.000-0600

A configurable number of digits to hash on would make for a more complicated make_dir() function, but might be worth it.

I see your point about the reverse making it more distributed, but for any technician to track issues at the unix prompt, it is easier to see ext: 1234, and know the path will be /1/2/1234.

Also note, to implement this change, a lot of snprintf() calls that created directory or filename paths now use make_dir() (which is cleaner of course to have that filepath in one spot instead of all over)

By: Rob Gagnon (rgagnon) 2004-03-12 00:46:51.000-0600

In the meantime... I modified the /asterisk/contrib/scripts/addmailbox script so it aligns itself with the code in the patch included here.

Also modified app.c to use a similar make_dir() function to the one in app_voicemail.c for the mwi stuff you mentioned.

edited on: 03-12-04 00:58

By: Brian West (bkw918) 2004-04-17 23:59:17

you accually don't need addmailbox anymore just modify app_voicemail where it creates the dir.

By: Mark Spencer (markster) 2004-04-29 09:55:57

Am I missing something or does this totally break backwards compatibility?

By: Rob Gagnon (rgagnon) 2004-04-29 10:02:21

The patch is now quite old..

I do think something like this is needed still for a larger system.

However, two things I would still need to put in this:

- The path creation function would need to be public to all of asterisk due to other programs that interact with voicemail
- Backward compatibility is an issue since existing voicemails would be left hanging:  To solve this, I would imagine a conversion program, or script could be created to move existing files into the new path system--- Maybe even the load_module function of app_voicemail.c could run a function to convert the system if needed.

By: twisted (twisted) 2004-04-29 10:16:33

I can see this being useful... Is there a way we can:

1) update the patch to current cvs
2) allow this option to be enabled in the config, so that we can maintain backwards compataibility
and
3) agree on a directory structure ;)

Thanks.

By: Rob Gagnon (rgagnon) 2004-04-29 10:57:23

Is asterisk.c the best place for a common function like "ast_make_vm_dir()" or something along that line?

After writing the first patch, I found that the path needs to be used for MWI in different channel handlers, among other places.

By: Mark Spencer (markster) 2004-04-29 12:13:12

Also, I think the vmdir should be based on the *last* two digits not the first two, since (for example here at Digium) everyone's extension starts with 6 so you lose part of the advantage.  In real, large systems, they might be based on NPX's eg 721XXXX, 850XXXX and again there would be very little isolation from the top two.

By: Rob Gagnon (rgagnon) 2004-04-29 13:20:07

True..  Jamegolovich mentioned to make it configurable.   Will keep that in mind.

edited on: 04-29-04 12:14

By: James Golovich (jamesgolovich) 2004-04-29 13:32:44

If you need an idea on how to do this the right way look at procmail and cucipop (both use the same code).  You simply set a number to hash on.  Default is 0 which is the current behavior.  I prefer doing the hashing starting at the end because it gives you a much better distribution (in names or in numbers).

I think we will have a util.[ch] or utils.[ch] that this and other functions can go into.

By: Rob Gagnon (rgagnon) 2004-05-13 18:26:05

Step 1:

The directory name hashing function up for inclusion into utils.c as well as the changes for utils.h are in utils.patch.txt

If you would please look that over, after it is accepted, the files that need the voicemail path names (and there are a bunch) can then be changed to use this function

By: Rob Gagnon (rgagnon) 2004-05-14 12:01:46

utils.patch2.txt new patch to match today's cvs change in utils.c

All files except "utils.patch2.txt" can be removed from this bug now.

edited on: 05-14-04 11:00

By: Rob Gagnon (rgagnon) 2004-05-20 10:34:04

utils.patch3.txt new patch to match today's cvs change in utils.c

By: Mark Spencer (markster) 2004-05-20 10:37:58

Do you want me to merge this patch on its own?

By: Rob Gagnon (rgagnon) 2004-05-20 10:46:08

The idea was to get the function for directory creation accepted.. and then modify all areas where the function would be needed.

(IE: read config for number of characters to hash, other places where voicemails are checked for MWI, etc.... and the HasVoicemail() app etc...)

I guess I could do all in one, but thought that smaller pieces would be helpful

By: Tilghman Lesher (tilghman) 2004-05-20 12:40:03

Wouldn't it make more sense just to use a filesystem that supports many files in a single directory?  (i.e. reiser, xfs, jfs [but not ext2/ext3])

By: Rob Gagnon (rgagnon) 2004-05-20 13:08:54

but then that makes a filesystem type a requirement for running asterisk.

The system should adapt to the filesystem as needed.

By: Rob Gagnon (rgagnon) 2004-05-26 17:12:49

OK.  Fully implemented this this time around.

The files uploaded that are required now:
patch.txt
asterisk.conf.sample

The others can be removed.

Files in the patch that are modified:
- app.c
- apps/app_directory.c
- apps/app_hasnewvoicemail.c
- apps/app_voicemail.c
- astconf.h
- asterisk.c
- utils.c
- include/asterisk/utils.h

New file:
- asterisk.conf.sample
Read this for the documentation on how to use/set the hash length.  It is in asterisk.conf and not voicemail.conf due to the use of the voicemail path in many places.

By: Tilghman Lesher (tilghman) 2004-05-26 17:34:27

Actually, filesystem shouldn't even be a problem.  You can stick 300 subdirectories on an ext2 filesystem with no slowdown (10,000 subdirectories is a different matter).  Of course, if you have 10,000 submailboxes, you might want to rethink your classification system...

By: Rob Gagnon (rgagnon) 2004-05-26 17:38:04

10,000 per context is a lot?

If you are partitioning the system with the context basically equivalent to a customer number, and that customer has 10,000 or more mailboxes.  How would you re-classify things?

By: Tilghman Lesher (tilghman) 2004-05-26 18:14:08

Different contexts?  I would be surprised that you would have a single system that runs multiple customers, where each customer is more than 10,000 users.  Unless you're checking voicemail via the web, that's a heckuva lot of channel contention that that box is going to have, with people competing to check their voicemail.

In such a situation, it would make sense there to have a separate machine for each customer (having 10,000 mailboxes).

edited on: 05-26-04 17:14

By: Mark Spencer (markster) 2004-05-26 18:17:45

Any chance of someone actually measuring this with ext3 to see if there really is a performance difference?

By: Rob Gagnon (rgagnon) 2004-05-26 18:21:27

Well either way, the patch works, and it centralizes the subroutine for the generation of the path to an email box.  That, in itself, is an advantage.

As well, if the user does not configure VMHashLen in asterisk.conf, then there is no functional difference.

By: Rob Gagnon (rgagnon) 2004-06-10 22:29:53

Updated the patch file to keep it current with CVS.  The current list of active files above should be:

- patch20040610.txt
- asterisk.conf.sample

By: Rob Gagnon (rgagnon) 2004-06-24 14:22:09

Updated the patch file to keep it current with CVS. The current list of active files above should be:

- patch20040624.txt
- asterisk.conf.sample

By: Olle Johansson (oej) 2004-07-16 14:08:13

Does this apply to current CVS?
Sent mail to mailing list to get community feedback.

By: adam (adam) 2004-07-17 00:54:26

We actually came across this problem with the Firefly network. It wasn't performance, it's that EXT3 has a hard limit of 30k files/folders per folder. Our fix was to take the last digit of their number and use it as a subdirectory. Very similar to what this patch does. We also discovered that browsing/backing up this directory seemed to kill asterisk but I thought that was due to the old version of asterisk we are running (upgrading soon :))

I was going to raise a bug or post on the mailing list but I assumed I'd get abused because it's a filesystem problem, not asterisk.

When we upgrade, I'll remove my changes from voicemail and try this patch instead, should be a good stress test (we're approaching 100k users rapidly)

edited on: 07-17-04 00:38

By: Rob Gagnon (rgagnon) 2004-07-17 01:23:19

I think I may need to bring this patch up to speed.  Last change to it was 2004-06-24, so I bet it won't apply clean anymore...

By: Rob Gagnon (rgagnon) 2004-07-17 02:06:53

Patch "patch20040717.txt" now current with cvs files as shown below:

app.c v1.24
astconf.h v1.1
asterisk.c v1.104
utils.c v1.15
app_directory.c v1.25
app_hasnewvoicemail.c v1.7
app_voicemail.c v1.134
utils.h v1.6

edited on: 07-17-04 01:50

By: Rob Gagnon (rgagnon) 2004-07-18 03:10:00

patch20040718.txt is now latest due to changes in app_voicemail.c recently

By: Rob Gagnon (rgagnon) 2004-08-02 21:32:03

Patch20040802.txt is now the latest patch.  Just keeping this up to date.  No other changes.

Please delete other patchxxxxx.txt files.

By: twisted (twisted) 2004-08-02 21:33:22

Done

By: Rob Gagnon (rgagnon) 2004-08-17 17:24:43

Just keeping the patch current, repairing conflicts with current CVS... no other changes

By: Rob Gagnon (rgagnon) 2004-08-30 11:18:23

Just keeping the patch current, repairing conflicts with current CVS... no other changes.

Most recent file is now "patch20040830.txt (17,371 bytes) 08-30-04 11:17"

By: Rob Gagnon (rgagnon) 2004-09-20 10:23:20

Just keeping the patch current, repairing conflicts with current CVS... no other changes.

Most recent file is now "patch20040920.txt (17,354 bytes) 09-20-04 10:22"

By: Rob Gagnon (rgagnon) 2004-09-30 12:20:36

Just keeping the patch current, repairing conflicts with current CVS... no other changes.

Most recent file is now "patch20040930.txt (17,355 bytes) 09-30-04 12:20"

By: Rob Gagnon (rgagnon) 2004-10-04 12:40:50

Just keeping the patch current, repairing conflicts with current CVS... no other changes.

Most recent file is now "patch20041004.txt (17,435 bytes) 10-04-04 12:40"

By: twisted (twisted) 2004-10-27 16:33:35

Can we get an update on this?  Is this cvs ready?

By: Rob Gagnon (rgagnon) 2004-10-28 14:01:09

I will look at this very soon, and get an up to date patch for you.  Thanks

By: Rob Gagnon (rgagnon) 2004-10-28 16:06:06

Latest files are:

patch20041028.txt (17,328 bytes) 10-28-04 15:26
asterisk.conf.sample (5,420 bytes) 10-28-04 16:04

NOTE:  I documented all of the "options" section in asterisk.conf for you as well since I was there to do VMHashLen anyhow.

Note that I moved "VMHashLen" from the [Directories] section to the [Options] section, as that seems to make more sense.

By: twisted (twisted) 2004-11-14 21:34:25.000-0600

Great, now we just need to get the attention of the big kahuna again.  

/* poke poke mark */

--Housekpeeing

By: Rob Gagnon (rgagnon) 2004-11-15 02:06:29.000-0600

what type of comment needed from me now?  I got an email requesting comment

By: Rob Gagnon (rgagnon) 2004-11-16 10:39:32.000-0600

Current files are:
asterisk.conf.sample (5,410 bytes) 11-16-04 10:34
patch20041116.txt (17,329 bytes) 11-16-04 10:37

The rest of them can be dropped.

Updates:
- patch20041116.patch.txt: Just to bring things to current CVS.
- asterisk.conf.sample:  Fixed some typos in the doc

By: twisted (twisted) 2004-12-01 00:58:04.000-0600

Dropped all files except for the two you listed.  Let's see if we can get this one some more attention.  Try posting to the -dev list as a "call for testers" again, and, maybe catch markster on IRC to re-review it if it still applies to current cvs.   I'm going to mark this acknowledged instead of feedback required, pending testing.

By: twisted (twisted) 2004-12-15 20:37:40.000-0600

Two weeks ago i changed the status from feedback to acknowledged to allow for testing, however, i see that nothing has happened here since then, so it's going back into feedback status (meaning that this bug must have real activity or it will be closed).

--Housekeeping

By: twisted (twisted) 2004-12-28 10:39:23.000-0600

Last request for activity/status/etc.

By: adam (adam) 2005-01-09 16:49:54.000-0600

I've tested this patch and it works fine - the only issue is when it says you've reach extension <extension number> - it says the subdirectory numbers also. (eg extension of 123 would say 3.2.1.2.3) This is because it's a saydigit on the full directory name (which is dodge regardless)

Good patch, thanks

By: twisted (twisted) 2005-01-28 19:58:24.000-0600

Gonna wake this one back up.... rgagnon, is this patch going to be fixed to address the issue of voicemail playing back the entire directory path when stating the mailbox number?  Does this even still apply to cvs head?

By: Rob Gagnon (rgagnon) 2005-01-29 00:46:51.000-0600

I will check into it on Monday (2005-01-31) ... just to make sure you know i am still alive :-)

By: Rob Gagnon (rgagnon) 2005-01-31 13:25:36.000-0600

adam:

Can you tell me which function inside app_voicemail.c where you think it is playing the example "3.2.1.2.3" ?

OR:  if you know the prompt you are hearing  IE:

"The person at extension .... is on the phone" or
"The person at extension .... is unavailable"



I have the code brought up to date here locally now, and I want to be sure to close out your issue as well.

Thanks

edited on: 01-31-05 13:26

By: Olle Johansson (oej) 2005-02-13 13:25:16.000-0600

...moving this to "experimental features" for now.

By: Mark Spencer (markster) 2005-03-01 23:48:13.000-0600

Has this been tested?

By: rgagnon_ (rgagnon_) 2005-03-02 09:58:23.000-0600

I have been waiting for "adam" to respond positively as to whether he is still experiencing the problem.

ALSO... How do I retrieve a lost password?  I had to make this other account in order to respond

By: adam (adam) 2005-03-02 16:18:27.000-0600

I've modified my version to use a mysql to fetch a mailbox number so I'm not sure. Honestly, there's a chance it was a mistake on my end because looking at the code again, it looks good. I recall it was do a say_digits on the entire directory. The code is in use in production so I'm happy

By: Brian West (bkw918) 2005-03-17 21:16:54.000-0600

Updates.. updates.... is it updated?

/b

By: Kevin P. Fleming (kpfleming) 2005-03-20 09:33:41.000-0600

This has another side-effect not previously mentioned... it will break any systems using ODBC storage, unless you provide a script to rename the rows in the ODCB vm storage table (since they are named using the filesystem path they would have been stored in, which is certainly not a great design, but that's how it is).

So, for this to go into place, users will need a way to migrate filesystem-based VM storage and ODBC-based VM storage to the hashed directory names.

By: Clod Patry (junky) 2005-04-09 11:36:29

Is there any updates in here?
After kpfleming's note, i see no patch added.

By: Michael Jerris (mikej) 2005-05-15 21:16:22

Kevin, reviewing this, it does NOT have any backwards compatability issues by default.  This would only be an issue if the users changes the VMHashLen value in asterisk.conf.  There is a very clear note to not change this on a system that has voicemail boxes already created.  Can this go in?

By: Michael Jerris (mikej) 2005-05-15 21:18:50

Can we get one more update on this.  A diff on the sample conf would be nice as well, removing the comments from the top of the file about how your setting is the most interesting one in the file would probably be better too ;)

By: Rob Gagnon (rgagnon) 2005-05-15 21:41:55

ok. Just to let y'all know... I am still alive.

I just finally got around to tracking down my password to the site so I could log in.

It has been kinda frustrating not being able to respond to requests.

I will get a look at this probably tomorrow (monday 5/16/05)

By: Rob Gagnon (rgagnon) 2005-05-16 11:18:19

patch20050516.txt patches the following current CVS head files:

astconf.h, v1.1
asterisk.c, v1.153
utils.c, v1.44
apps/app_directory.c, v1.36
apps/app_hasnewvoicemail.c, v1.12
apps/app_voicemail.c, v1.214
include/asterisk/utils.h, v1.30

By: Michael Jerris (mikej) 2005-05-16 11:51:43

asterisk.conf.samples patch please, without the lines from the top of the file.

By: Rob Gagnon (rgagnon) 2005-05-16 12:13:01

Well that one is a new file someone can just edit when doing a CVS add, but...

if you insist

By: Michael Jerris (mikej) 2005-05-16 12:19:15

I missed that it was a new file.  No problem then. Thanks.



By: Michael Jerris (mikej) 2005-05-26 01:35:22

Can somone please code review this and provide testing feedback.

By: Michael Jerris (mikej) 2005-06-19 12:32:51

Can you please review this?

By: Kevin P. Fleming (kpfleming) 2005-07-11 23:44:08

This looks pretty good, except for a few things:

1) I _really_ don't want this configuration option to be in asterisk.conf. app_directory already knows how to look into voicemail.conf for other things, so it makes sense to me for it to read the hash length from there and pass it to the path-making function. app_hasnewvoicemail can do the same thing.

2) ast_util_vm_dir returns the length of the string it built, but none of its callers seem to care. This is wasteful; just make it a return success/failure, and the callers can use strlen() if they want to.

app_voicemail has been changed quite a bit by tonight's activity, so you'll probably need some time to get your patch merged in :-)

By: Michael Jerris (mikej) 2005-07-31 22:55:44

I hate to do this, but suspended due to no activity.  Can you please update this with the required updates and re-open when ready.  Thanks.