Technical Articles

VoiceXML is a programming language for building interactive voice applications. The VoiceXML language provides a clean and simple means for:

  • Playing audio
  • Recognizing speech and touch-tone (DTMF) input
  • Controlling a call flow
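
As a rough illustration of those three capabilities, the sketch below plays an audio file, collects a spoken or touch-tone choice, and branches on the result. The file name welcome.wav, the form ids, and the prompt wording are assumptions made up for this example; they are not taken from any of the articles listed here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">

  <form id="main">
    <!-- Playing audio: the inner text is spoken by TTS if the
         (hypothetical) file welcome.wav cannot be fetched -->
    <block>
      <prompt>
        <audio src="welcome.wav">Welcome.</audio>
      </prompt>
    </block>

    <!-- Recognizing speech and DTMF: each option accepts the spoken
         phrase or the touch-tone key given in its dtmf attribute -->
    <field name="choice">
      <prompt>Say sales or support, or press 1 or 2.</prompt>
      <option dtmf="1" value="sales">sales</option>
      <option dtmf="2" value="support">support</option>

      <!-- Controlling the call flow: branch on the recognized value -->
      <filled>
        <if cond="choice == 'sales'">
          <goto next="#sales"/>
        <else/>
          <goto next="#support"/>
        </if>
      </filled>
    </field>
  </form>

  <form id="sales">
    <block>
      <prompt>Connecting you to sales.</prompt>
    </block>
  </form>

  <form id="support">
    <block>
      <prompt>Connecting you to support.</prompt>
    </block>
  </form>

</vxml>
```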

VoiceXML is a derivative of the Extensible Markup Language (XML). XML is the standard format for defining structured documents and data on the Web. XML enables programmers to define an arbitrary vocabulary, formally known as a schema, using a standard, well-defined, easily parsed syntax. One XML schema might describe customer information, another might describe a mathematical equation, and yet another might describe a recipe for chocolate chip cookies.
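
For instance, a document written against a customer-information vocabulary might look something like the following. The element and attribute names here are invented for illustration and do not come from any published schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical vocabulary: every element and attribute name below
     is an assumption chosen for this example -->
<customer id="1001">
  <name>Jane Doe</name>
  <phone type="mobile">555-0100</phone>
  <email>jane.doe@example.com</email>
</customer>
```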

XML is easily transported across the World Wide Web (WWW) using existing Internet protocols such as HTTP. Special tools aren't required to author XML documents, and it is easy to create tools, or modify existing ones, that both emit and read XML. This makes XML an ideal language for passing data back and forth between applications.

This document is divided into two sections: articles on the basics of VoiceXML programming, and articles on more advanced VoiceXML concepts.

A complete VoiceXML tutorial written by Microsoft Tellme can be found on MSDN.

The following are some articles on the basics of VoiceXML programming that you may find helpful.

  • An XML primer: Before starting to build VoiceXML applications, it helps to have a basic understanding of the Extensible Markup Language (XML), the language from which VoiceXML is derived. This document explains how to author generic XML documents.
  • VoiceXML 2.x Essentials: This document introduces the basic principles of VoiceXML by example and then drills into a number of important features of the VoiceXML language.
  • Using Variables: This document explains how variables can be declared and used within a VoiceXML application.
  • Using JavaScript: This document explains how JavaScript can be used within a VoiceXML application.
  • Handling Events: This document focuses on the mechanics of handling events and the use of application-defined events in a VoiceXML application.
  • Building VoiceXML applications: This document describes the components of a voice application, including the purpose of the application root document.
  • Modularizing code with subdialogs: This document explains how to use subdialogs effectively in your voice applications.
  • TTS Engine Behavior: This document describes the behavior of the Text-to-Speech (TTS) engine.
  • Acoustic Models: This document describes the behavior of acoustic models.
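
To give a feel for how variables, JavaScript, and application-defined events fit together, here is a minimal sketch. The event name com.example.toomanyattempts, the function formatAttempts, and the retry limit of three are all assumptions made for this example. The boolean field type is a built-in grammar from the VoiceXML 2.0 specification, and support for it may vary by platform.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">

  <!-- A document-scoped variable that counts failed recognitions -->
  <var name="attempts" expr="0"/>

  <!-- A JavaScript helper (hypothetical name) callable from expr attributes -->
  <script>
    <![CDATA[
      function formatAttempts(n) {
        return n + (n == 1 ? " attempt" : " attempts");
      }
    ]]>
  </script>

  <!-- Handler for an application-defined event (hypothetical name) -->
  <catch event="com.example.toomanyattempts">
    <prompt>Sorry, I could not understand you. Goodbye.</prompt>
    <exit/>
  </catch>

  <form id="ask">
    <field name="answer" type="boolean">
      <prompt>Would you like to continue?</prompt>

      <!-- On each nomatch, bump the counter and give up after three tries -->
      <nomatch>
        <assign name="attempts" expr="attempts + 1"/>
        <if cond="attempts &gt;= 3">
          <throw event="com.example.toomanyattempts"/>
        </if>
        <prompt>
          That was <value expr="formatAttempts(attempts)"/>.
          Please say yes or no.
        </prompt>
      </nomatch>

      <filled>
        <prompt>Thank you.</prompt>
      </filled>
    </field>
  </form>

</vxml>
```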

The following are some articles on more advanced VoiceXML concepts.

  • Using the data element: This document shows you how to use the data element so that you can build powerful voice applications using VoiceXML, JavaScript, and a small amount of server-side code.
  • Creating and Manipulating Utterance Recordings: This document explains how to enable utterance recordings in your application and how to submit an utterance recording to an HTTP server.
  • Using Cookies: This article details how to use cookies and their limitations with respect to the Tellme VoiceXML interpreter.
  • Using Mixed Initiative: Mixed initiative is a powerful technique that allows the flow of a call to be directed by the user as well as by the application. This article explains how to build a set of mixed initiative dialogs using constructs provided in VoiceXML.
  • Post-hangup processing in VoiceXML 2.x: This article explains the recommended procedure for performing post-hangup processing on the Tellme Voice Application Network.
  • Using n-best lists: Rather than arbitrarily picking a recognition result, the voice recognizer can provide a list of likely possibilities. The list of likely possibilities is known as an n-best list, and this article explains how to take advantage of the Tellme VoiceXML interpreter's support for n-best processing.
  • Implementing 'Go Back': This document shows one way to implement "go back" functionality using a simple navigation history stack.
  • Understanding bargein: In an automated telephone system, experienced users are accustomed to interrupting the system to quickly navigate to the next prompt. In the world of telephony this is known as "bargein." Bargein is a great feature because it enables experienced users to move rapidly through the system to get to the information that they want. Circumstances will arise, however, when a voice application developer wants to disable bargein. This article explains how.
  • Effective Use of Caching to Boost VoiceXML Application Performance: This document provides a detailed overview of HTTP caching and how you can use it to dramatically improve the performance of your VoiceXML applications.
  • Using fetchaudio: This document first discusses probable causes of latencies in your voice application and how to eliminate them. The document then explains how to use the fetchaudio attribute to mask unavoidable latencies in your voice application.
  • Customizing the Hourglass: This document provides technical information on disabling the hourglass and on changing the audio played when the hourglass is enabled. The document also describes strategies for minimizing the duration of the hourglass using fetchaudio.
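
As a small taste of two of the topics above, the sketch below disables bargein on a single prompt and uses fetchaudio to cover the wait while a server request completes. The URL http://example.com/lookup and the file name hold.wav are placeholders invented for this example. The digits field type is a built-in grammar from the VoiceXML 2.0 specification, and its availability may depend on the platform.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">

  <form id="lookup">
    <block>
      <!-- bargein="false": the caller cannot interrupt this prompt -->
      <prompt bargein="false">
        Please listen carefully, as our menu has changed.
      </prompt>
    </block>

    <field name="account" type="digits">
      <!-- Bargein stays at its default here, so experienced callers
           can answer before the prompt finishes -->
      <prompt>Say or key in your account number.</prompt>

      <filled>
        <!-- fetchaudio: audio played while the request is in flight;
             the URL and file name are hypothetical -->
        <submit next="http://example.com/lookup"
                namelist="account"
                fetchaudio="hold.wav"/>
      </filled>
    </field>
  </form>

</vxml>
```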