Summary: ASTERISK-04856: [patch] [post 1.2] Japanese SayNumber functionality
Reporter: Takeshi Hiragimoto (hfritz)
Labels:
Date Opened: 2005-08-18 17:46:57
Date Closed: 2011-06-07 14:03:09
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Core/Internationalization
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) asterisk-japanese_say_functionality_2005Aug28.patch
( 1) asterisk-saynumber_jpn-1.0.patch
( 2) asterisk-saynumberjp.patch
( 3) asterisk-saynumberjp.patch.tgz
( 4) asterisk-saynumberjp-2006Jan13.patch
Description: I have added Japanese SayNumber functionality to say.c.
I have tested it and it works fine.

The Japanese speaking rules are as follows:

0 to 9 are the same as in English
11    = ten + one
12    = ten + two
13    = ten + three
20    = two + ten
21    = two + ten + one
121   = hundred + two + ten + one
1243  = thousand + two + hundred + four + ten + three
10000   = we say "man", which means "ten thousand"
12534   = one + "man" + two + thousand + five + hundred + three + ten + four
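
As a rough illustration of the rule listed above, the following C sketch spells a number as a sequence of words. It is not taken from the attached patch: the names emit, spell_below_man and spell_jp are hypothetical, and the range (0 to 99,999,999) and the handling of "man" are assumptions based only on the examples given here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: append one word to buf, separated by " + ". */
static void emit(char *buf, size_t len, const char *word)
{
    if (buf[0] != '\0')
        strncat(buf, " + ", len - strlen(buf) - 1);
    strncat(buf, word, len - strlen(buf) - 1);
}

/* Spell 1..9999. As in the examples above, no "one" is spoken
 * before ten, hundred or thousand. */
static void spell_below_man(int n, char *buf, size_t len)
{
    static const char *digit[] = { "zero", "one", "two", "three", "four",
                                   "five", "six", "seven", "eight", "nine" };
    static const struct { int value; const char *name; } unit[] = {
        { 1000, "thousand" }, { 100, "hundred" }, { 10, "ten" },
    };
    size_t i;

    for (i = 0; i < sizeof(unit) / sizeof(unit[0]); i++) {
        int d = n / unit[i].value;

        n %= unit[i].value;
        if (d == 0)
            continue;
        if (d > 1)
            emit(buf, len, digit[d]);
        emit(buf, len, unit[i].name);
    }
    if (n > 0)
        emit(buf, len, digit[n]);
}

/* Spell 0..99999999 into buf as "word + word + ...".
 * Assumption: the count of "man" (ten thousands) is always spoken,
 * including "one", as in the 12534 example. */
static void spell_jp(int n, char *buf, size_t len)
{
    buf[0] = '\0';
    if (n == 0) {
        emit(buf, len, "zero");
        return;
    }
    if (n >= 10000) {
        spell_below_man(n / 10000, buf, len);
        emit(buf, len, "man");
        n %= 10000;
    }
    spell_below_man(n, buf, len);
}
```

With this sketch, spell_jp(1243, buf, sizeof(buf)) produces "thousand + two + hundred + four + ten + three". In a real implementation each word would map to a sound file rather than a string.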


Comments:By: Clod Patry (junky) 2005-08-18 18:46:36

Please provide patch for CVS-HEAD, not for stable.
Thanks.

By: Takeshi Hiragimoto (hfritz) 2005-08-19 14:43:48

OK, I will work on that.
Thank you

By: Takeshi Hiragimoto (hfritz) 2005-08-20 13:03:01

I am almost done writing the code.
I am now testing it, which will take a few more days.
I will upload the new patch as soon as I finish testing.
Thank you

By: Takeshi Hiragimoto (hfritz) 2005-08-27 13:56:05

I have just finished testing my patch.
Compared to the original patch "asterisk-saynumber_jpn-1.0.patch",
the new patch "asterisk-japanese_say_functionality.patch"
covers not only SayNumber but also
the following functions:

- ast_say_number_full_jp
- ast_say_enumeration_full_jp
- ast_say_date_jp
- ast_say_date_with_format_jp
- ast_say_time_jp
- ast_say_datetime_jp
- ast_say_datetime_from_now_jp

The latest patch requires the following sound files:
jp-day-1.gsm
jp-day-10.gsm
jp-day-11.gsm
jp-day-12.gsm
jp-day-13.gsm
jp-day-14.gsm
jp-day-15.gsm
jp-day-16.gsm
jp-day-17.gsm
jp-day-18.gsm
jp-day-19.gsm
jp-day-2.gsm
jp-day-20.gsm
jp-day-21.gsm
jp-day-22.gsm
jp-day-23.gsm
jp-day-24.gsm
jp-day-25.gsm
jp-day-26.gsm
jp-day-27.gsm
jp-day-28.gsm
jp-day-29.gsm
jp-day-3.gsm
jp-day-30.gsm
jp-day-31.gsm
jp-day-4.gsm
jp-day-5.gsm
jp-day-6.gsm
jp-day-7.gsm
jp-day-8.gsm
jp-day-9.gsm
jp-hour.gsm
jp-min.gsm
jp-nen.gsm
jp-sec.gsm
jp-zero.gsm
jpn-banme.gsm
100.gsm
1000.gsm
10000.gsm
100000000.gsm

Thank you

By: Takeshi Hiragimoto (hfritz) 2005-08-27 14:18:51

I created asterisk-japanese_say_functionality.patch in the
wrong format, sorry.
Please use asterisk-japanese_say_functionality_2005Aug28.patch instead,
and please remove asterisk-japanese_say_functionality.patch.
Thank you

By: Takeshi Hiragimoto (hfritz) 2005-09-01 06:45:20

My latest patch "asterisk-japanese_say_functionality_2005Aug28.patch" is not for the "1.0.9" branch.
I made it against the Asterisk source code retrieved from the latest CVS repository.
In this case, should I close the current issue and open a new one for the 1.2.0 branch?
Please advise.

By: Michael Jerris (mikej) 2005-09-01 06:57:14

You have already missed feature freeze for the 1.2 branch.  This will have to be developed for head and wait for post 1.2 release.

By: Olle Johansson (oej) 2005-11-22 10:29:06.000-0600

1. Please add the comments in doxygen format

2. I don't like prefixing sound files with "jp-"; they will be stored in the jp directory if needed for Japanese. Give them all English names so we do not get any duplicates in any language set.

"Secs" should be named "secs" in any language, we can't have "se-secs", "jp-secs" and "secs" - that is not the way it works.

Thank you for contributing and thank you for updating your patch.

/O

By: Takeshi Hiragimoto (hfritz) 2005-11-22 18:28:29.000-0600

Thank you for your comment,
ok, I will try to modify my code as soon as possible.

By: Igor Goncharovsky (igorg) 2005-11-22 22:23:55.000-0600

It may be useful to watch bug ASTERISK-5527: http://bugs.digium.com/view.php?id=5675
It is about rewriting say.c to make the code simpler and smaller.

By: Takeshi Hiragimoto (hfritz) 2005-11-23 00:13:10.000-0600

Thank you very much for your information.
fritz

By: Takeshi Hiragimoto (hfritz) 2005-12-02 20:02:55.000-0600

I have just finished modifying my say number functions
to fit the new say number source code from "http://bugs.digium.com/view.php?id=5675"
and have successfully made a patch file,
"asterisk-saynumberjp.patch.tgz".
I know the say number source code from
"http://bugs.digium.com/view.php?id=5675"
is not a final version yet, so I will
keep watching its status and adapt
my code accordingly.

By: Olle Johansson (oej) 2005-12-03 13:55:30.000-0600

Please do not compress patches into archives, it makes it hard to work with the issue tracker.

By: Takeshi Hiragimoto (hfritz) 2005-12-04 03:50:46.000-0600

OK, sorry, I will re-send my patch
Fritz

By: benjk (benjk) 2006-01-06 03:14:38.000-0600

The given description of counting rules is far too simplistic even for limited use in a PBX/IVR system.

There are many counting suffixes and irregularities to take into account, far beyond the scope of this note, but I have tried to list the ones I feel are most relevant to Asterisk.

First of all, there are two sets of numbers: native Japanese numbers and Sino-Japanese (Chinese-originated) numbers. In the range from 1 to 10, either set is used depending on context.

Native Japanese numbers from 1-10 ...

hitotsu, futatsu, mittsu, yottsu, itsutsu, muttsu, nanatsu, yattsu, kokonotsu, toh.

Sino-Japanese numbers from 1-10 ...

ichi, ni, san, shi but mostly yon, go, roku, shichi but mostly nana, hachi, kyu, juu.

and from 11 on ...

juu-ichi, juu-ni, juu-san etc etc etc

Sometimes, the two systems are mixed.

For example, the phone numbers for emergency services in Japan are 110 (Police) and 119 (Fire/Ambulance). The number 110 would normally be hyaku-juu, which is Sino-Japanese 100 (hyaku) and Sino-Japanese 10 (juu). However, the emergency phone number 110 is hyaku-toh-ban, which is Sino-Japanese 100 (hyaku) and native Japanese ten (toh) plus a suffix for number (ban).

The majority of uses of native Japanese numbers can probably be ignored for the purpose of a PBX system, but the 110 emergency number is significant enough to warrant proper handling in the code. However, there are several other exceptions which would be relevant.

Another difficulty is that in Japanese, numbers are seldom used on their own. Instead, a number is followed by a counting suffix which depends on the object that is being counted. In some cases the suffix may be omitted, but in most cases it is required.

Examples:

1) counting voicemail messages

if suffix is to be omitted, then the native Japanese numbers would be used from 1-9 ...

hitotsu, futatsu, mittsu, yottsu, itsutsu, muttsu, nanatsu, yattsu, kokonotsu

above 10, Sino-Japanese numbers would be used ...

juu, juu-ichi, juu-ni, juu-san, etc etc

if a suffix is to be used, then the most suitable suffix would be "ken" (= item) and Sino-Japanese numbers would be used ...

ikken, niken, sanken, yonken, goken, rokken, nanaken or shichiken (unusual though), hachiken, kyuken, jukken, juu-ikken, juu-niken, juu-sanken etc etc

Please note that there is a contraction of ichi (one) + ken (item) to ikken in this case

Also, note that more often than not, the count word for seven is based on the native Japanese seven (nanatsu) instead of the Sino-Japanese seven (shichi), especially over the telephone, in order to avoid confusion with the similar-sounding Sino-Japanese one (ichi).

Likewise, the count word for four is almost always based on the native Japanese four (yottsu) in order to avoid confusion between the Sino-Japanese four (shi) and seven (shichi).

2) counting people (such as participants in a telephone conference)

from 1 to 2 there are native Japanese words ...

1 person = hitori
2 persons = futari

from 3 on, Sino-Japanese numbers with suffix "nin" (person) are used ...

san-nin, yo-nin, go-nin, roku-nin, nana-nin or shichi-nin, hachi-nin, kyu-nin, juu-nin, juu-ichi-nin, juu-ni-nin, juu-san-nin etc etc etc

Note that there is no contraction when using the suffix "nin".

Note again the use of "yo"/"yon" in the count word for four instead of "shi".

3) counting seconds

the counting suffix for seconds is "byo" (second) and the line is regular without contraction ...

ichi-byo, ni-byo, san-byo, yon-byo, go-byo, roku-byo, nana-byo or shichi-byo (but unusual), hachi-byo, kyu-byo, juu-byo, juu-ichi-byo, juu-ni-byo, juu-san-byo etc etc

4) counting minutes

the counting suffix for minutes is "fun" (minute) and the line is irregular with contraction ...

ichi + fun => ippun
ni + fun => nifun
san + fun => sanpun
yon + fun => yonpun
go + fun => gofun
roku + fun => roppun
nana + fun => nanafun
hachi + fun => happun
kyu + fun => kyufun
juu + fun => juppun or jippun
juu-ichi + fun => juu-ippun
etc etc
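
Because the contractions are irregular, a lookup table is one simple way to handle them. The sketch below is only an illustration: minutes_reading and its tables are hypothetical names, the readings for 1 to 10 are copied from the list above, and the forms for 20, 30, 40 and 50 (for example "ni-juppun") are an extrapolation that is not spelled out in this note.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Readings for 1..10 minutes, copied from the list above (romanized). */
static const char *minute_word[] = {
    NULL, "ippun", "nifun", "sanpun", "yonpun", "gofun",
    "roppun", "nanafun", "happun", "kyufun", "juppun",
};

/* Write the reading for 1..59 minutes into buf.
 * Returns 0 on success, -1 if n is out of range. */
static int minutes_reading(int n, char *buf, size_t len)
{
    /* Prefix spoken before the final digit, e.g. "juu-" + "ippun". */
    static const char *tens[] = { "", "juu-", "ni-juu-", "san-juu-",
                                  "yon-juu-", "go-juu-" };
    /* Assumed multiplier spoken before "juppun" for 20, 30, 40, 50. */
    static const char *mult[] = { "", "", "ni-", "san-", "yon-", "go-" };

    if (n < 1 || n > 59)
        return -1;
    if (n % 10 == 0)
        snprintf(buf, len, "%s%s", mult[n / 10], minute_word[10]);
    else
        snprintf(buf, len, "%s%s", tens[n / 10], minute_word[n % 10]);
    return 0;
}
```

In a sound-file based system the table entries would be recording names rather than romanized strings, but the lookup would have the same shape.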

5) counting hours

the counting suffix for hours is "jikan" (hour) and the line is regular without contraction ...

ichi-jikan, ni-jikan, san-jikan, yo-jikan, go-jikan, roku-jikan, nana-jikan or shichi-jikan, hachi-jikan, kyu-jikan, juu-jikan, juu-ichi-jikan, juu-ni-jikan, juu-san-jikan etc etc

6) counting days

for counting days, the Sino-Japanese numbers and suffix "nichi" (day) are used for 1, 11-19 and above 20, the native Japanese numbers and suffix "ka" (day) are used for 2-7, 9, 10 while 8 and 20 are irregular ...

ichi-nichi, futsuka, mikka, yokka, itsuka, muika, nanoka, yohka, kokonoka, tohka, juu-ichi-nichi, juu-ni-nichi, juu-san-nichi, juu-yo-nichi, juu-go-nichi, juu-roku-nichi, juu-shichi-nichi, juu-hachi-nichi, juu-kyu-nichi, hatsuka, ni-juu-ichi-nichi, ni-juu-ni-nichi, ni-juu-san-nichi etc etc

7) counting weeks

the counting suffix for week is "shukan" (week) and the line is mostly regular (other than for 1, 11, 21 etc) without contraction ...

ichi + shukan = isshukan (1 week)
nishukan (2 weeks), sanshukan (3 weeks) etc etc etc

but ...

juu-isshukan (11 weeks), ni-juu-isshukan (21 weeks) etc etc

8) counting months

the counting suffix for month is "kagetsu" (number of months) and the line is mostly regular ...

ichi + kagetsu => ikkagetsu (1 month, contraction)

then regular without contraction ...
ni-kagetsu, san-kagetsu etc etc etc

21 months could be either ni-juu-ichi-kagetsu (no contraction) or ni-juu-ikkagetsu (contraction), likewise for 31 months, 41 months etc etc

9) counting years

the counting suffix for year is "nen" (year) and the line is regular without contraction, the duration suffix "kan" is further appended if a duration of n years is referred to ...

ichi-nen/ichi-nenkan, ni-nen/ni-nenkan, san-nen/san-nenkan, yo-nen/yonenkan etc etc

10) counting money

the counting suffix for money is "en" (yen) and the line is regular without contraction ...

ichi-en, ni-en, san-en, yon-en, go-en,

juu-en (10 yen)


but ...

hyaku-en (100 yen) not ichi-hyaku (Japanese always uses hundred, never one-hundred)

and ...

issen-en (1000 yen) not ichi-sen (contraction)

then again regular without contraction ...

ichi-man-en (10.000 yen), juu-man-en (100.000 yen), hyaku-man-en (1.000.000 yen), issen-man-en (10.000.000 yen), ichi-oku-en (100.000.000 yen), juu-oku-en (1.000.000.000 yen) etc etc

11) Telephone numbers

telephone numbers should generally be suffixed with "ban" (number) ...

most numbers would be spoken digit by digit ...

1234 => ichi-ni-san-yon-ban

but ...

it is very common that two-digit numbers ending in 0 would be spoken in full ...

50 => go-juu-ban (number fifty instead of five-zero)

likewise three- and four-digit numbers ending in 00 would be spoken as hundreds ...

500 => go-hyaku-ban (number five-hundred instead of five-zero-zero)

1500 => sen-go-hyaku-ban (number fifteen-hundred instead of one-five-zero-zero)

and four digit numbers ending in 000 would be spoken as thousands ...

1000 => issen-ban (number one-thousand instead of one-zero-zero-zero)

while five-digit and longer numbers are usually spoken digit by digit

Remember the exception of the 110 as an emergency telephone number ...

110 => hyaku-toh-ban (irregular mixed use of Sino-Japanese and native Japanese numbers)
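
The rules above for short numbers amount to a small decision procedure. The following C sketch only classifies how a number would be spoken and does not produce the reading itself. The enum and the function classify_extension are hypothetical names, and treating numbers with a leading zero as digit-by-digit is an assumption not stated in this note.

```c
#include <assert.h>
#include <string.h>

enum ext_style {
    EXT_DIGIT_BY_DIGIT,   /* 1234 => ichi-ni-san-yon-ban */
    EXT_FULL_NUMBER,      /* 50 => go-juu-ban, 1500 => sen-go-hyaku-ban */
    EXT_EMERGENCY_110     /* 110 => hyaku-toh-ban */
};

/* Decide how a short number (given as a digit string) is spoken. */
static enum ext_style classify_extension(const char *ext)
{
    size_t n = strlen(ext);

    if (strcmp(ext, "110") == 0)
        return EXT_EMERGENCY_110;
    if (n < 2 || n > 4 || ext[0] == '0')
        return EXT_DIGIT_BY_DIGIT;      /* includes five digits and longer */
    if (n == 2 && ext[1] == '0')
        return EXT_FULL_NUMBER;         /* two digits ending in 0 */
    if (n >= 3 && strcmp(ext + n - 2, "00") == 0)
        return EXT_FULL_NUMBER;         /* three or four digits ending in 00 */
    return EXT_DIGIT_BY_DIGIT;
}
```

The number is passed as a string rather than an integer so that leading zeros are preserved.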


PSTN telephone numbers are usually spoken in groups, where the area code is the first group and the number is split into two groups of n and four digits, i.e. eight digits are four and four, seven digits are three and four, six digits are two and four. The suffix "no" (possessive particle) is used as a group separator ...

03-7654-3210 => zero-san no nana-roku-go-yon no san-ni-ichi-zero-ban

052-7654-3210 => zero-go-ni no nana-roku-go-yon no san-ni-ichi-zero-ban

0412-654-3210 => zero-yon-ichi-ni no roku-go-yon no san-ni-ichi-zero-ban

a full international phone number is typically spoken with the word kuni-bango (country-number) before the country code followed by the country code spoken as a full number (not digit by digit) then followed by suffix "no" then followed by the remaining telephone number ...

+44 207 7654-3210 =>
kuni-bango yon-juu-yon no ni-zero-nana no nana-roku-go-yon no san-ni-ichi-zero-ban
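
The grouping rule for domestic PSTN numbers can be sketched as below. This is an illustration only: phone_reading is a hypothetical name, the input is assumed to be already split into groups with "-", and the international form with kuni-bango is not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Write the digit-by-digit reading of a grouped number such as
 * "03-7654-3210" into buf: digits within a group are joined with "-",
 * groups are joined with " no ", and "-ban" is appended at the end.
 * Returns 0 on success, -1 on invalid input or if buf is too small. */
static int phone_reading(const char *number, char *buf, size_t len)
{
    static const char *digit[] = { "zero", "ichi", "ni", "san", "yon",
                                   "go", "roku", "nana", "hachi", "kyu" };
    size_t used = 0;
    int group_start = 1;
    int w;
    const char *p;

    if (len == 0)
        return -1;
    buf[0] = '\0';
    for (p = number; *p != '\0'; p++) {
        if (*p == '-') {
            w = snprintf(buf + used, len - used, " no ");
            group_start = 1;
        } else if (isdigit((unsigned char)*p)) {
            w = snprintf(buf + used, len - used, "%s%s",
                         group_start ? "" : "-", digit[*p - '0']);
            group_start = 0;
        } else {
            return -1;
        }
        if (w < 0 || (size_t)w >= len - used)
            return -1;
        used += (size_t)w;
    }
    w = snprintf(buf + used, len - used, "-ban");
    if (w < 0 || (size_t)w >= len - used)
        return -1;
    return 0;
}
```

Calling phone_reading("03-7654-3210", buf, sizeof(buf)) yields "zero-san no nana-roku-go-yon no san-ni-ichi-zero-ban", matching the first example above.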

12) IVR choices

single digit IVR choices are usually pronounced as Sino-Japanese numbers without any suffix ...

ichi, ni, san, yon, go, roku, nana, hachi, kyu, juu

Note the use of yon for four and nana for seven instead of shi and shichi to avoid confusion.

example: for Japanese press 1, for English press 2 would be ...

Nihongo no kata ha _ichi_ *pause* eigo no kata ha _ni_ wo oshite kudasai

Note the use of ichi instead of ichi-ban and ni instead of ni-ban in this context.

Further note that it will also require further patching to make the voice prompts follow Japanese grammar, which is mostly RPN (reverse Polish notation, like those HP calculators - no kidding!)

Take a look at the above example broken down into its components and you will recognise the RPN stack ...

Nihon = Japan
go = postfix operator to indicate language
no = postfix operator to indicate possession/association
kata = person
ha = postfix operator to indicate context
ichi = one
wo = postfix operator to indicate object
oshite = press
kudasai = postfix operator to indicate a request

... which can be mapped to something akin to the English "As far as Japanese speaking persons are concerned, 1 is to be pressed, please." whereas in most other languages the phrase "please press" would appear before the object "one".

Thus for use in pre-coded IVR menus there will need to be further patching to get the order right.

hope this helps
benjk



By: benjk (benjk) 2006-01-06 06:26:18.000-0600

in response to oej's note ...

> "Secs" should be named "secs" in any language, we can't have "se-secs", "jp-secs" and "secs"
> - that is not the way it works.


If you want to get proper Japanese counting into Asterisk you will have to change the way Asterisk works - otherwise you'd have to get the Japanese language changed which is of course out of the question.

The situation with "seconds" isn't even problematic because it is fairly regular. But how do you want to deal with "minutes", which is irregular depending on the preceding number?

one minute: ichi + fun = ippun
two minutes: ni + fun = nifun
three minutes: san + fun = sanpun

The phrase "x minutes" cannot easily be constructed from number + minutes and in fact many Japanese linguists consider "x minutes" to be a single word, not a compound that consists of two separate words.

Most likely Asterisk will need to be changed to accommodate different recordings for "one-minute", "two-minutes", "three-minutes" etc etc instead of "one" + "minute", "two" + "minutes", "three" + "minutes" etc etc.

Counting days is even more challenging because not only are there two different sets of numbers (native Japanese and Sino-Japanese) but even the word "day" changes depending on the number between Chinese (ON) and Japanese (KUN) readings. You could not possibly construct "x days" from the recordings of "x" and "days" - outright impossible.

one day:
ichi + nichi = ichi-nichi (Sino-Japanese number + ON reading for suffix "day", regular)

two days:
futatsu + ka = futsu-ka (native Japanese number + KUN reading for suffix "day", irregular)

three days:
mittsu + ka = mikka (native Japanese number + KUN reading for suffix "day", regular contraction)

four days:
yottsu + ka = yokka (native Japanese number + KUN reading for suffix "day", regular contraction)

five days:
itsutsu + ka = itsuka (native Japanese number + KUN reading for suffix "day", irregular)

six days:
muttsu + ka = muika (native Japanese number + KUN reading for suffix "day", irregular)

seven days:
nanatsu + ka = nanoka (native Japanese number + KUN reading for suffix "day", irregular)

eight days:
yattsu + ka = yohka (native Japanese number + KUN reading for suffix "day", irregular)

nine days:
kokonotsu + ka = kokonoka (native Japanese number + KUN reading for suffix "day", irregular)

ten days:
toh + ka = tohka (native Japanese number + KUN reading for suffix "day", regular)

eleven days:
juu + ichi + nichi = juu-ichi-nichi (Sino Japanese numbers + ON reading for suffix "day", regular)

twenty days:
hatsuka (special case number with KUN reading for suffix "day", irregular)

twenty-one days:
ni + juu + ichi + nichi = ni-juu-ichi-nichi (Sino Japanese number + ON reading for suffix "day", regular)

even though the line will continue regular with Sino-Japanese numbers and the ON reading for suffix "day", there is still the odd case of "n days" for all n ending in four ...

twenty-four:
ni + juu + yon = ni-juu-yon

but

twenty-four days:
ni + juu + yon + nichi = ni-juu-yo-nichi

Further still, when counting days, numbers ending in seven use the Sino-Japanese word for seven "shichi" instead of the native Japanese word for seven "nana". The compounds "juu-shichi" and "juu-nana" both mean seventeen, but "juu-nana" is generally used to avoid confusion with the Sino-Japanese word for one "ichi". However, when counting days, the pure Sino-Japanese form "juu-shichi-nichi" is commonly used.

The only way to accommodate all this is to treat "x days" as a single recording for all cases from "one day" to "10 days" and the irregular "20 days", further to have single recordings for 14, 24, 34, 44 etc, as well as two separate sets of 17, 27, 37, 47 etc or again single recordings for "17 days", "27 days", "37 days" etc etc.
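
A minimal sketch of that approach for 1 to 31 days follows. The function name days_reading is hypothetical, the readings for 1 to 10 and 20 are whole entries taken from the list above, and the remaining values are built from a tens prefix plus a ones word that already uses "yo" for four and "shichi" for seven. The forms for 30 and 31 are extrapolated from the regular pattern and are not listed in this note.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the reading for "n days" (1..31) into buf.
 * Returns 0 on success, -1 if n is out of range. */
static int days_reading(int n, char *buf, size_t len)
{
    /* Whole-word entries: these cannot be built from number + suffix. */
    static const char *whole[] = {
        NULL, "ichi-nichi", "futsuka", "mikka", "yokka", "itsuka",
        "muika", "nanoka", "yohka", "kokonoka", "tohka",
    };
    /* Ones words as used before "nichi": note "yo" and "shichi". */
    static const char *ones[] = { "", "ichi-", "ni-", "san-", "yo-", "go-",
                                  "roku-", "shichi-", "hachi-", "kyu-" };
    static const char *tens[] = { "", "juu-", "ni-juu-", "san-juu-" };

    if (n < 1 || n > 31)
        return -1;
    if (n <= 10)
        snprintf(buf, len, "%s", whole[n]);
    else if (n == 20)
        snprintf(buf, len, "%s", "hatsuka");   /* irregular */
    else
        snprintf(buf, len, "%s%snichi", tens[n / 10], ones[n % 10]);
    return 0;
}
```

For example, days_reading(24, buf, sizeof(buf)) gives "ni-juu-yo-nichi" rather than "ni-juu-yon-nichi", which is the special case for numbers ending in four described above.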

And before anybody starts questioning the sanity of the way the Japanese counting works, please keep in mind that most European languages have similar oddities stemming from the fact that native numbers as well as Latin and Greek numbers are used, it's just on a smaller scale and we don't usually notice it anymore because we're used to it ...

uni-lateral, mono-lingual, bi-directional, tripod, tetrahedron, pentagram, hexagon, septet, octothorpe, hecto-litre, kilo-gram, mega-watt, giga-byte, milli-meter, micro-farad, nano-technology etc

The equivalent of Latin and Greek in Asia is Chinese, and many Asian languages therefore have Chinese-originated loanwords which are used in just the same way we use Latin and Greek prefixes and suffixes, right alongside native equivalents, depending only on context.

By: Toshihiko Okubo (keister) 2006-01-08 11:15:13.000-0600

I would like to pay my respects to benjk's fine work in specifying the various Japanese expressions for numbers. But I still consider hfritz's patch practical
and useful as a basic say number system.

His system is of the same quality as those commonly used in PBX/IVR systems and
screen reader systems for blind persons in Japan.

I think some application software using a dictionary could refine the great variety of Japanese numerical expressions.

By: Matt O'Gorman (mogorman) 2006-01-09 14:54:37.000-0600

Can anyone confirm for me whether this is correct enough to go into trunk as is? I don't speak Japanese.

By: benjk (benjk) 2006-01-10 01:04:22.000-0600

in reply to Mogorman asking ...

"can anyone confirm for me if this is correct enough to go into trunk as is?"

Think of this as equivalent to pidgin English or Chinglish. It is improper Japanese and every Japanese speaker will instantly realise that it is incorrect, but they will understand what is implied.

For somebody with a need where bad Japanese is better than no Japanese at all, it's acceptable.

The problem is though that people who don't know that this is bad Japanese will only read "Japanese IVR support" and turn this on in the belief that it is proper and their customers may then end up with an embarrassment.

Further, there is a danger that once this patch is official, the developer community will take an attitude along the lines of "we've already got Japanese support, there is no need to change anything" and as a consequence you would never get proper Japanese support for Asterisk.

The idea of using a dictionary as suggested by Keister won't work for this because SayNumber() has no knowledge about the context in which a number is spoken. It doesn't know what was spoken before and what will be spoken after the number. Without that information it is impossible to properly determine which recording should be picked.

A possible solution would be to break up the service SayNumber() provides into narrower usage cases and provide one function for each of those usage cases. For example ...

SayNumber() => default, isolated numbers

SayNumberOfSeconds() => speak the number of seconds appending the word "second" or "seconds"

likewise there would be: SayNumberOfMinutes(), SayNumberOfHours(), SayNumberOfDays(), SayNumberOfMonths(), SayNumberOfYears(), SayNumberOfPersons(), SayNumberOfMessages(), SayCurrencyAmount(), SayTelephoneNumber(), SayExtensionNumber() etc

whereby the behaviour of those functions would depend on the language which has been set.

In most languages those functions would just call SayNumber() and then speak the name of the counted object afterwards. However, in languages with more complex rules such as Japanese, these functions would be enabled to apply proper speaking rules as determined by the context.

This would be a very clean solution and still be rather simple whilst covering the overwhelming majority of cases.

Moreover, it can be introduced first using the simplistic speaking rule <isolated-number> + <counted-object> and the proper rules can then be added in on a function by function basis without having any impact on the rest of the code base and other languages.

Let me close with a few comments on the subject of how acceptable a "bad Japanese" solution would be. My brother-in-law is a salesman for NEC selling PBXes, IVRs etc. He says that the products NEC sells today all have proper speaking rules. I also know from my own customers that they generally do not want anything that talks pidgin Japanese. The example quoted by Keister, screen readers for blind people, is quite different because a blind person who has no other alternative for reading will be happy with whatever is available. For an Asterisk deployment the alternative is and would remain custom patching to make it speak properly, and that is what the majority of Japanese customers will choose if the built-in default is bad Japanese. So I guess if you want to add this, you may as well add something that people are going to use.

By: Olle Johansson (oej) 2006-01-10 01:28:49.000-0600

benjk: Trying to summarize what you say, you do not recommend that we include this patch into Asterisk since it's not usable?

By: benjk (benjk) 2006-01-10 01:42:15.000-0600

in response to OEJ ...

My recommendation is to introduce the patch STRICTLY as a means of providing the basis for a context aware number speaking system that can speak numbers depending on counted object.

As far as usability is concerned think of using a telex machine as a word processor for writing a business letter in a European language. Even without accents and capitalisation, it is legible but in a world where everybody else uses wordprocessors which support those things, only a desperate person with no access to a modern wordprocessor would _want_ to use it.

By: Toshihiko Okubo (keister) 2006-01-11 08:56:07.000-0600

I would like to emphasize again that hfritz's patch is practical and useful enough as a say number system for giving audible numeric information in Japanese.
The arguments of other gentlemen are on general complexity of Japanese numerical expressions which we encounter only when we read Japanese sentences.



By: benjk (benjk) 2006-01-11 12:48:32.000-0600

in response to Keister ...

you seem to misunderstand what SayNumber() is being used for. Of course it is being used as part of reading whole sentences. There is another dialplan application called SayDigits() which is being used for isolated digits as the name implies and as far as that application is concerned your comment is applicable. Yet, as far as SayNumber() is concerned it is not insofar that SayNumber() is being used by Asterisk internally when reading back entire sentences and then CONTEXT MUST BE KNOWN in order to speak the number.

The scenarios I have given represent most (if not all the) scenarios in which Asterisk uses SayNumber() to form entire sentences. Whilst it is possible that somebody will use SayNumber() in a dialplan to speak isolated numbers, you cannot have one without the other and prevent Asterisk from using SayNumber() for speaking entire sentences.

As a result, the practicality and usefulness of the patch as it is, that is without taking context into account, would be offset and diminished by the fact that all those internal IVR functions will speak pidgin Japanese. It would thus be useful and practical only for somebody whom ordinary Japanese would describe as a 'denwa ohtaku' (roughly equivalent to telephone geek) desperate enough to want to use it, but not useful and not practical in a business environment where modern Japanese products do the same in a proper way.

By: Toshihiko Okubo (keister) 2006-01-11 19:00:59.000-0600

In response to benjk ...

I consider SayNumber() a number(multi-digits) reading system,
while SayDigits() is a single digit reading system.

In the circumstances of Japanese sentences where CONTEXT MUST BE KNOWN to number, NUMBER ALSO MUST BE KNOWN to context, as context itself varies according to the preceding number (e.g., 1-ppon, 2-hon, 3-bon) and such function is, in my understanding, nothing but a Japanese reading system using some dictionary data base.



By: Takeshi Hiragimoto (hfritz) 2006-01-11 19:46:56.000-0600

Guys
Thank you for your comments on my SayNumber patch.
I understand what benjk and keister are talking about.

but please let me try to make it clear.

benjk is concerned that my SayNumber patch cannot cover "all" Japanese counting rules,
but I still think my patch can be used in many other cases.

My opinion is as follows:

My SayNumber, SayDigits and SayUnixTime fit well with rizzo's SayNumber coding style.
I made no modifications to rizzo's code; I just added the Japanese SayNumber functionality.

On the other hand, benjk's functions such as SayNumberOfPersons(), SayNumberOfMessages(), SayCurrencyAmount(), SayTelephoneNumber() and SayExtensionNumber() need changes in many places in the code,
and in that case we need more time and much more discussion, because it is not only a Japanese language problem.

So I would suggest putting my patch in trunk for now and then starting to discuss how to implement a complex language like Japanese in Asterisk's "Say" functions.

By: Toshihiko Okubo (keister) 2006-01-11 20:02:56.000-0600

I agree with hfritz's suggestion to put his patch in trunk for now and to start a discussion on how to implement complex languages in Asterisk's "Say" functions.



By: Tilghman Lesher (tilghman) 2006-01-11 22:58:22.000-0600

Technical changes needed:

1) the formatting of the patch is inconsistent.  In some places, there is code indented in different ways in the same block.  There is also a mixture of tabs and spaces for indentation; the patch needs to have code indented only by tabs.

2) The implementation of ast_say_date_with_format_jp() is missing the 'q' and 'Q' formats.  These should be the natural way of saying a date, as a native speaker would say it, with any special way of saying a date for recent dates (for example, in English, we might say "yesterday" instead of saying "Wednesday, January 11th, 2006").  These formats are mandatory.

By: Takeshi Hiragimoto (hfritz) 2006-01-11 23:00:52.000-0600

Corydon76
Thank you for your comment
I will work on this tonight.

By: benjk (benjk) 2006-01-12 00:27:19.000-0600

The danger here is this:

You get this patch accepted now and it will only be useful for isolated numbers, that is a fraction of what is really needed to call it "Japanese IVR support".

Next, you or somebody else starts work on doing it properly and nobody will be listening, nobody will be willing to accept any changes, they will say "We have already got Japanese support".

Meanwhile Asterisk talks pidgin Japanese in those cases where it makes internal use of SayNumber() to construct numbers spoken in context. Voicemail IVR will be off, conferencing IVR will be off, speaking minutes will be off, speaking days will be absolutely horrifying.

The wise approach here is to use this issue in order to get other developers to understand and accept that changes must be made throughout Asterisk to the way numbers are spoken.

And I would like to reiterate that this can be done without any dictionary.

If the approach I suggested is used, then the impact will be minimal.

The compound functions would be implemented as function pointers. For each compound there would be one default function which is assigned to the function pointer. The default functions would simply call SayNumber() and then speak the counted object. Languages such as Japanese can implement their own functions to deal with the more complex speaking rules properly.

When SetLanguage() is called it will check for every function pointer if the target language has implemented its own function. If it has, the function will be assigned to the function pointer. If it doesn't, the default function will be assigned to the function pointer.

for example ...

struct ast_saynum_delegates {
  int (*sayNumOfItems)(int);
  int (*sayNumOfPersons)(int);
  int (*sayNumOfMessages)(int);
  int (*sayNumOfSeconds)(int);
  int (*sayNumOfMinutes)(int);
  int (*sayNumOfHours)(int);
  int (*sayNumOfDays)(int);
  int (*sayNumOfWeeks)(int);
  int (*sayNumOfMonths)(int);
  int (*sayNumOfYears)(int);
  int (*sayCurrencyAmount)(float);
  int (*sayExtension)(int);
  int (*sayTelephoneNumber)(int);
};

each of these will be set to a default function ...

delegate->sayNumOfItems = &DefaultSayNumOfItems;

and it may be changed by SetLanguage() ...

if (this language implements delegate) {
  delegate->sayNumOfItems = &thisLanguageSayNumOfItems;
  ...
}

every part, app/module that speaks compounds will then use the currently set delegate to speak ...

// speak number of voice mail messages
delegate->sayNumOfItems(num);

... instead of ...
SayNumber(num);
Say(recording of items/messages in the current language);

The above is fairly straightforward and the best time to push for this is NOW, before the patch is accepted, because the inability to speak numbers in Japanese properly with the current design carries enough weight to get the decision makers to agree to such changes.

Last but not least I would like to clear up a misunderstanding. I do not advocate that we cover all possible scenarios, there are just too many of them in Japanese. However, I advocate that we cover all the scenarios which are already covered by Asterisk today plus one additional case: currency amount.
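
For illustration, the delegate scheme outlined in this comment could look roughly like the following self-contained C sketch, reduced to a single delegate. None of this is existing Asterisk code: play() stands in for audio playback by recording the names of the prompts it would play, and set_language(), the "jp" language code and all other identifiers are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for playback: logs the recordings "played". */
static char played[256];

static void play(const char *recording)
{
    if (played[0] != '\0')
        strncat(played, " ", sizeof(played) - strlen(played) - 1);
    strncat(played, recording, sizeof(played) - strlen(played) - 1);
}

/* Isolated number, as SayNumber() would speak it. */
static int say_number(int num)
{
    char name[32];

    snprintf(name, sizeof(name), "%d", num);
    play(name);
    return 0;
}

struct saynum_delegates {
    int (*say_num_of_minutes)(int);
};

/* Default rule: <isolated-number> + <counted-object>. */
static int default_say_num_of_minutes(int num)
{
    say_number(num);
    play(num == 1 ? "minute" : "minutes");
    return 0;
}

/* Japanese rule: one recording per compound for 1..10 minutes;
 * larger values fall back to the default in this sketch. */
static int jp_say_num_of_minutes(int num)
{
    static const char *compound[] = {
        NULL, "ippun", "nifun", "sanpun", "yonpun", "gofun",
        "roppun", "nanafun", "happun", "kyufun", "juppun",
    };

    if (num >= 1 && num <= 10) {
        play(compound[num]);
        return 0;
    }
    return default_say_num_of_minutes(num);
}

static struct saynum_delegates delegates = { default_say_num_of_minutes };

/* Install the language-specific delegate if one exists. */
static void set_language(const char *lang)
{
    if (strcmp(lang, "jp") == 0)
        delegates.say_num_of_minutes = jp_say_num_of_minutes;
    else
        delegates.say_num_of_minutes = default_say_num_of_minutes;
}
```

A caller would then use delegates.say_num_of_minutes(3) without knowing which language is active: after set_language("en") the log contains "3 minutes", and after set_language("jp") it contains "sanpun".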



By: Tilghman Lesher (tilghman) 2006-01-12 00:53:14.000-0600

benjk:  if you want to create specific functions for Japanese usage and declare that certain functions should not be used with the Japanese language, that's fine, but I don't think we need to radically change the interface, since most languages do not need it.  Please submit a disclaimed implementation for the dates and times, as said in Japanese, and we'll worry about the specific implementation for numbers at a later time.

By: Takeshi Hiragimoto (hfritz) 2006-01-13 21:57:01.000-0600

I have corrected the following:
1) indentation
2) added the 'q' and 'Q' formats

To implement SayPersons(), SayMessages(), etc.,
I think the best way is to create a new application
that can be used in extensions.conf.
I will work on this shortly.

By: crich (crich) 2006-01-18 13:12:48.000-0600

Sorry for butting in. I've been working on an app_saynumber which can be configured for different languages by reading a generic number-language-definition file. You might like to have a look at it: ASTERISK-6050

By: Takeshi Hiragimoto (hfritz) 2006-01-18 16:59:17.000-0600

crich-san, thank you very much for your comment.
OK, I will take a look at your solution and try
to write the Japanese counting rules using your script.
Thanks

By: crich (crich) 2006-01-18 17:03:07.000-0600

Thanks hfritz-san, i hope it works for you.

Probably we can have such scripts for the date/time and all the other stuff also. I'll think about it.

By: Takeshi Hiragimoto (hfritz) 2006-01-21 19:34:50.000-0600

crich-san
hello
I have just finished writing the say_number script
and now want to test it. However, it looks like
your patch is not for the current CVS-HEAD.
I got the error messages below.

[root@sss2 asterisk_say]# patch -p1 < ../app_say_number.patch
patching file app_say_number.c
patching file Makefile
Hunk #1 FAILED at 25.
1 out of 1 hunk FAILED -- saving rejects to file Makefile.rej
patching file README.saynum
patching file README.alllanguage
patching file say_number.conf.sample

could you please double check ?
Thank you

By: crich (crich) 2006-01-22 02:44:54.000-0600

hfritz-san, I have uploaded a new patch (-2), which applies against revision 8431 of asterisk trunk.

By: Takeshi Hiragimoto (hfritz) 2006-01-22 04:41:02.000-0600

crich-san
Thank you very much ! :)

By: crich (crich) 2006-01-22 07:13:14.000-0600

Please add some info to ASTERISK-6050 if it works in your environment. Bug reports are welcome, patches even more ;)

crich

By: Takeshi Hiragimoto (hfritz) 2006-01-22 22:37:56.000-0600

crich-san
OK, I will.
Thank you

By: Olle Johansson (oej) 2006-03-09 14:21:59.000-0600

This is now discussed on asterisk-dev - please join that discussion!

By: Takeshi Hiragimoto (hfritz) 2006-03-09 15:10:06.000-0600

ok I will.

By: Clod Patry (junky) 2006-04-11 22:45:51

let's move this to ASTERISK-6050