Using Australian English Text to Speech

The Tellme Voice Application Network supports both a female voice (hayley) and a male voice (james) for Australian English Text to Speech (TTS) processing. This article demonstrates how to access this functionality.

To access this TTS functionality, set the name attribute of the voice element to "hayley" or "james" as shown in the following example.

<?xml version="1.0" encoding="iso-8859-1"?>
<vxml version="2.1"
  xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        <voice name="hayley">
        Welcome to Tellme.
        </voice>
      </prompt>
      <exit/>
    </block>
  </form>
</vxml>
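
The male voice is selected in the same way. The following is a minimal sketch that uses both voices in one prompt; it assumes that the platform allows more than one voice element inside a single prompt, and the second sentence of prompt text is illustrative only.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<vxml version="2.1"
  xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        <!-- Female Australian English voice -->
        <voice name="hayley">
        Welcome to Tellme.
        </voice>
        <!-- Male Australian English voice -->
        <voice name="james">
        How can I help you today?
        </voice>
      </prompt>
      <exit/>
    </block>
  </form>
</vxml>
```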

For information about the Speech Synthesis Markup Language (SSML) elements that the TTS engine supports, see the Speech Synthesis Markup Element Reference.

This section covers how phone numbers and mailing addresses should be formatted and how they are read by the TTS engine.

Australian telephone numbers are 10 digits long and can be written 0A BBBB BBBB or, for mobile telephone numbers, 04MM MBB BBB, where 0A is the optional "area code" and BBBB BBBB is the subscriber number. The 04MM M prefixes are allocated per mobile network. When the number is intended for an international audience, it is written +61 A BBBB BBBB or +61 4MM MBB BBB.

  • Brief pauses occur between number segments.
  • Phone numbers are not pronounced in pairs, as regular numbers are; digits are read individually.

    Text | Pronunciation
    (07) 3847 9222 | "zero seven (pause) three eight four seven (pause) nine triple two"
    0738479222 | "oh seven three eight four seven nine two two two"
    07 3847 9222 | "Area code Oh seven three eight four seven (pause) nine triple two"

  • Phone number delimiters are not pronounced.
  • You can use the SSML say-as element to ensure that the TTS engine pronounces a phone number correctly.

  • Numbers in an address are read as numbers (for details, see the Numbers section).
  • Australian addresses are typically in the following format:
    RECIPIENT
    [FLOOR] [/] [APARTMENT] HOUSE_NUMBER STREET_NAME STREET_TYPE
    LOCALITY STATE_ABBREVIATION POSTAL_CODE
    AUSTRALIA

  • To ensure that the TTS engine pronounces the state abbreviation correctly, be sure to include a postcode. Also, do not include extra spaces after the city name.
  • You can use the SSML say-as element to ensure that the TTS engine pronounces an address correctly.

Pronunciation Rule | Text
A break occurs between a street address and a numeric street | St Lucia QLD
A break occurs between the city/state and the postcode | 4072, Australia
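
The say-as guidance above can be sketched as follows for a phone number. This is a minimal sketch, not a confirmed platform recipe: the interpret-as value "vxml:phone" comes from the VoiceXML 2.0 builtin types, which expect a plain digit string, and support for it with the hayley and james voices is an assumption. The street address is made up for illustration and simply follows the address format described above.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<vxml version="2.1"
  xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        <voice name="hayley">
        <!-- Assumption: the platform accepts the VoiceXML 2.0 builtin type vxml:phone -->
        Please call <say-as interpret-as="vxml:phone">0738479222</say-as>.
        <!-- Illustrative address: state abbreviation followed directly by the postcode -->
        Our office is at 12 Example Street, St Lucia QLD 4072, Australia.
        </voice>
      </prompt>
      <exit/>
    </block>
  </form>
</vxml>
```

Check the Speech Synthesis Markup Element Reference for the interpret-as values that the TTS engine actually supports before relying on this form.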

Four-digit numbers have some common pronunciation patterns, as listed below. You can also use the SSML say-as element to ensure that the TTS engine pronounces a number digit by digit.

Note. To express multiplication, you must write out the mathematical operation in words. For example, use "4 times 5" instead of "4*5" or "4X5".

Pattern | Pronunciation Rule | Example Text | Example Pronunciation
4-digit numbers without commas or decimal points | Read as pairs | 2348 | "twenty three forty eight"
4-digit numbers where the 2nd pair begins with zero | 2nd pair is read as individual digits | 2304 | "twenty-three oh four"
4-digit numbers that begin with zero | Read as pair | 0234 | "zero two three four"
4-digit numbers where the 2nd pair is 00 | Read in hundreds | 1200 | "twelve hundred"
4-digit numbers 2001 through 2009 | Read as a single number | 2008 | "two thousand and eight"
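
To force a digit-by-digit reading of a number that would otherwise be read in pairs, the number can be wrapped in a say-as element. The sketch below assumes that the platform accepts the VoiceXML 2.0 builtin type "vxml:digits" as an interpret-as value; the prompt wording is illustrative.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<vxml version="2.1"
  xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        <voice name="james">
        <!-- Assumption: vxml:digits is supported and reads each digit separately -->
        Your reference number is
        <say-as interpret-as="vxml:digits">2348</say-as>.
        </voice>
      </prompt>
      <exit/>
    </block>
  </form>
</vxml>
```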

Currency values are pronounced, in general, as <number> <currency value> AND <number> <currency value>. For example, $432.19 is pronounced as "four hundred and thirty two dollars, nineteen cents." You can use the SSML say-as element to ensure that the TTS engine pronounces a currency value correctly.

Pronunciation Rule | Text | Pronunciation
When the value before or after the decimal point is zero, only the non-zero value is read | $432.00 | "four hundred and thirty two dollars"
 | $0.19 | "nineteen cents"
Use m or b to indicate million or billion, respectively. Capitalization and spacing do not matter. | $432M | "four hundred and thirty two million dollars"
 | $432.19 m | "four hundred thirty two point one nine million dollars"
 | $432B | "four hundred and thirty two billion dollars"
 | $432.19 b | "four hundred and thirty two point one nine billion dollars"
Ranges are pronounced with the currency value last | $2 - $4 | "two dollars to four dollars"
Yen values are read with "yen" pronounced last (Yen is the only currency without a name for amounts less than whole numbers) | JPY 123.45 | "one hundred and twenty three point forty-five japanese yen"
Numbers with more than 2 digits after the decimal point have the decimal values read individually and with just the larger currency name | $12.3456 | "Twelve dollars, thirty-four cents five six"
Use currency abbreviations | GBP 12.34 | "Twelve pounds sterling, thirty four pence"

For the currency code abbreviations and the readout for each, see Currency Abbreviations.
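
A currency value can also be marked up explicitly with say-as. The sketch below uses the VoiceXML 2.0 builtin type "vxml:currency", whose value format is a three-letter ISO 4217 currency code followed by the amount (UUUmm.nn). Whether the platform supports this value, and how the Australian voices read it, are assumptions; the prompt wording is illustrative.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<vxml version="2.1"
  xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        <voice name="hayley">
        <!-- Assumption: vxml:currency is supported; AUD is the ISO 4217 code for Australian dollars -->
        Your balance is
        <say-as interpret-as="vxml:currency">AUD432.19</say-as>.
        </voice>
      </prompt>
      <exit/>
    </block>
  </form>
</vxml>
```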

This section covers how the TTS engine pronounces date and time text. You can use the SSML say-as element to ensure that the TTS engine pronounces a date or time value correctly.

Note. Roman numerals in dates are not supported.

Dates in Australia are formatted as dd/mm/yyyy.

Text | Pronunciation
25/11/1990 | "The twenty fifth of November, nineteen ninety"
26.01.90 | "The twenty sixth of January ninety"
1980s | "nineteen eighties"
Feb 14 2009 | "February the 14th two thousand and nine"

Time can be formatted in different ways, as the examples below show. In general, time is expressed in 12-hour format, with am and pm to indicate before or after midday. For official purposes, 24-hour time notation is used.

  • 12:14
  • 12:14:13
  • 12:14 pm

Pronunciation Rule | Text | Pronunciation
Seconds are optional | 12:14 | "twelve fourteen"
 | 12:14:13 | "twelve fourteen and thirteen seconds"
Morning and evening indicators are optional, can be capitalized or not, with or without periods | 12:14 pm | "twelve fourteen P M"
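
Dates and times can be marked up with say-as to avoid ambiguity between day-first and month-first readings. The sketch below uses the VoiceXML 2.0 builtin types "vxml:date" (value format yyyymmdd) and "vxml:time" (value format hhmmx, where x is a for AM, p for PM, h for 24-hour time, or ? for an ambiguous time). Support for these values on the platform is an assumption, and the prompt wording is illustrative.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<vxml version="2.1"
  xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        <voice name="james">
        <!-- Assumption: vxml:date is supported; 19901125 is 25 November 1990 -->
        Your appointment is on
        <say-as interpret-as="vxml:date">19901125</say-as>
        <!-- Assumption: vxml:time is supported; 1214p is 12:14 pm -->
        at <say-as interpret-as="vxml:time">1214p</say-as>.
        </voice>
      </prompt>
      <exit/>
    </block>
  </form>
</vxml>
```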

See Also
Speech Synthesis Markup Element Reference, Unicode Code Charts