Building Applications Frequently Asked Questions

This FAQ contains answers to questions about building VoiceXML applications using Tellme Studio.

  1. What type of language is VoiceXML?
  2. How do I tell the application what spoken input to expect?
  3. Does Tellme provide any "pre-built" grammars?
  4. What is grammar "tuning"?
  5. How do you handle speakers with foreign accents?
  6. Does Tellme support recognition of languages other than English?
  7. What are the limits on grammar size?

Q: What type of language is VoiceXML?
A: VoiceXML is a declarative, XML-based language composed of elements that describe the human-machine interaction provided by a voice response system. This includes:
  • Output of audio files and synthesized speech (text-to-speech).
  • Recognition of spoken and DTMF input.
  • Control of telephony features such as call transfer and disconnect.
  • Direction of the call flow based on user input.
A VoiceXML document may also contain meta-information, references to other VoiceXML files, and JavaScript code that implements client-side logic.
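
As an illustration of the capabilities listed above, here is a minimal sketch of a VoiceXML 2.1 document. It is not taken from Tellme documentation: the dialog names, the prompt wording, and the telephone number are placeholders, and the exact transfer behavior depends on the platform.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">

  <!-- Client-side logic: a variable initialized by a JavaScript expression -->
  <var name="greeting" expr="'Welcome to the example line.'"/>

  <!-- Output, plus recognition of spoken and DTMF input -->
  <menu id="main">
    <prompt>
      <value expr="greeting"/>
      For sales, say sales or press 1.
      To end the call, say goodbye or press 2.
    </prompt>
    <!-- Call flow: each choice directs the caller to another dialog -->
    <choice dtmf="1" next="#sales">sales</choice>
    <choice dtmf="2" next="#bye">goodbye</choice>
    <noinput>Sorry, I did not hear you. <reprompt/></noinput>
    <nomatch>Sorry, I did not understand. <reprompt/></nomatch>
  </menu>

  <!-- Telephony control: call transfer (placeholder number) -->
  <form id="sales">
    <transfer name="call" dest="tel:+15555550100" bridge="true">
      <prompt>Transferring you to sales.</prompt>
    </transfer>
  </form>

  <!-- Telephony control: disconnect -->
  <form id="bye">
    <block>
      <prompt>Goodbye.</prompt>
      <disconnect/>
    </block>
  </form>
</vxml>
```

The `<menu>` element is used here because it accepts both a spoken phrase and a DTMF key for each choice without a separately authored grammar.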

Q: How do I tell the application what spoken input to expect?
A: The voice application uses a grammar to define which utterances the caller may legally say at a particular point in the application. (An utterance is speech input before the speech recognizer has recognized it as a specific response.) A grammar describes the set of accepted inputs as a list of expressions, similar to regular expressions. For example, a grammar for the possible answers to the question "How do you travel from home to work?" might specify the utterances like this in SRGS/GRXML format:
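           <one-of>
             <!-- Optional "by", then a vehicle: "car", "by bus", ... -->
             <item>
               <item repeat="0-1">by</item>
               <one-of>
                 <item>car</item>
                 <item>auto</item>
                 <item>bus</item>
               </one-of>
             </item>
             <!-- Optional "in" and "the", then a rail service: "tube", "in the subway", ... -->
             <item>
               <item repeat="0-1">in</item>
               <item repeat="0-1">the</item>
               <one-of>
                 <item>subway</item>
                 <item>underground</item>
                 <item>tube</item>
               </one-of>
             </item>
           </one-of>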
VoiceXML's grammar scoping lets the developer specify which grammars are active at any point in the application. Grammars are included in a VoiceXML file either inline or by reference to external grammar files.

The Tellme Platform currently supports the SRGS/GRXML grammar format, with legacy support for the GSL grammar format.
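
The following sketch shows both ways of attaching a grammar, along with two different scopes. It assumes standard VoiceXML 2.1 and SRGS syntax; the file name `help.grxml`, the form and field names, and the rule name are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">

  <!-- External grammar, document scope: a link placed directly under <vxml>
       is active in every dialog of this document.
       "help.grxml" is a hypothetical file name. -->
  <link event="help">
    <grammar src="help.grxml" type="application/srgs+xml"/>
  </link>

  <form id="commute">
    <field name="travel_mode">
      <prompt>How do you travel from home to work?</prompt>

      <!-- Inline grammar, field scope: active only while this field
           is collecting input. -->
      <grammar mode="voice" version="1.0" root="mode">
        <rule id="mode">
          <one-of>
            <item>
              <item repeat="0-1">by</item>
              <one-of>
                <item>car</item>
                <item>auto</item>
                <item>bus</item>
              </one-of>
            </item>
            <item>
              <item repeat="0-1">in</item>
              <item repeat="0-1">the</item>
              <one-of>
                <item>subway</item>
                <item>underground</item>
                <item>tube</item>
              </one-of>
            </item>
          </one-of>
        </rule>
      </grammar>

      <filled>
        <prompt>Thank you.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Keeping a grammar in an external file makes it easier to reuse across documents, while an inline grammar keeps a short list of phrases next to the prompt that asks for them.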

Q: Does Tellme provide any "pre-built" grammars?
A: Yes. The Tellme Platform provides access to many pre-built grammars for inputs that are commonly used, difficult to create, or in need of constant maintenance. These include:
  • General: Yes/No
  • Credit Cards: Expiration date, Expiration month, Expiration year, Credit card number
  • Date/Time: Day of month, Day of year, Month, Year, Date, AM/PM, TimeDuration (in minutes or days), Hour, Time
  • Financial: US Dollars (no cents), US Money (dollars and cents)
  • Locations: City/State
  • Numbers: Digits, Natural numbers, Percentages, Social Security numbers
  • Telephone: US phone number, Phone extension, Area code, 7-digit phone, 10-digit phone
The Studio Grammar Library contains the full list of grammars and their descriptions.

Q: What is grammar "tuning"?
A: A voice application, like any user-centric application, is prone to problems that may be discovered only through formal usability testing or observation of the application in use. Poor speech recognition accuracy is a common problem in voice applications, and it is most often caused by poor grammar implementation. When users mispronounce words or say things the grammar designer did not anticipate, the recognizer cannot match their input against the grammar. Poorly designed grammars that contain many difficult-to-distinguish entries also produce many misrecognized inputs.

Grammar tuning is the process of improving recognition accuracy by modifying a grammar based on an analysis of its performance. Tuning is often performed as part of an iterative cycle of usability testing and application improvement. It may involve adding commonly spoken phrases to the grammar, removing highly confusable words, and adding alternative ways that callers may pronounce a word.
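
As a hypothetical illustration, suppose an analysis of caller recordings for the question "How do you travel from home to work?" showed that callers often begin with "I take" or "I go", that many say "metro", and that "auto" is rarely spoken and frequently misrecognized. None of these findings come from real data; they are assumed purely to show what a tuned grammar might look like as a standalone SRGS/GRXML file:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="travel_mode">
  <rule id="travel_mode" scope="public">
    <!-- Added: optional lead-in phrases (assumed to be common in caller data) -->
    <item repeat="0-1">
      <one-of>
        <item>I go</item>
        <item>I travel</item>
        <item>I take</item>
      </one-of>
    </item>
    <one-of>
      <item>
        <item repeat="0-1">by</item>
        <!-- Added: optional "the", so "I take the bus" is accepted -->
        <item repeat="0-1">the</item>
        <one-of>
          <item>car</item>
          <!-- Removed: "auto" (assumed to be rare and highly confusable) -->
          <item>bus</item>
        </one-of>
      </item>
      <item>
        <!-- Added: "on" and "by" as alternatives to "in" -->
        <item repeat="0-1">
          <one-of>
            <item>in</item>
            <item>on</item>
            <item>by</item>
          </one-of>
        </item>
        <item repeat="0-1">the</item>
        <one-of>
          <item>subway</item>
          <item>underground</item>
          <item>tube</item>
          <!-- Added: a synonym assumed to be heard in caller data -->
          <item>metro</item>
        </one-of>
      </item>
    </one-of>
  </rule>
</grammar>
```

Pronunciation variants are usually handled outside the rule structure. SRGS defines a `<lexicon>` element for referencing a pronunciation lexicon, though whether and how a given platform supports it should be checked against that platform's documentation.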

Q: How do you handle speakers with foreign accents?
A: Tellme's work on 1-800-555-TELL, which is designed to be usable by the widest possible range of speakers, has shown that the Microsoft speech recognition engine employed by the Tellme Platform performs very well for callers with strong foreign accents.

Q: Does Tellme support recognition of languages other than English?
A: Not currently. For now, Tellme is focused on maximizing the reliability of English recognition over the phone. Adding a new language is a non-trivial task. Even if a model exists that lets a recognizer understand a native speaker under ideal acoustic conditions, making that same model work reliably in the noisy audio environment of the phone is very difficult.

Q: What are the limits on grammar size?
A: Although the speech recognizer places no specific limit on grammar size, several practical considerations effectively limit grammars to tens of thousands of entries. First, as the grammar grows, the recognizer must test each utterance against more possibilities, which slows recognition considerably. Second, the larger the grammar, the more likely it is to contain easily confused words, which lowers recognition accuracy. With careful planning, however, it is possible to create large grammars that remain effective. Tellme has production grammars with up to thirty thousand entries.
