Building Applications Frequently Asked Questions

This FAQ contains answers to questions about building VoiceXML applications using Tellme Studio.

  1. What options are there for implementing a voice application?
  2. What is the architecture of a voice application on the Tellme Platform?
  3. Why is the Tellme VoiceXML interpreter called a "client"? Doesn't Tellme provide a server-based solution?
  4. Where is the business logic of the application implemented?
  5. Does Tellme support static VoiceXML pages as well as dynamically generated pages?
  6. What technologies do customers use to generate the VoiceXML?
  7. Does Tellme host a customer's voice applications?
  8. How can I integrate backend data into my voice application?
  9. Can I personalize my phone application based on user characteristics?
  10. Does Tellme automatically convert my application into voice by screen scraping my Web site?
  11. Can I re-implement my DTMF-based IVR application using VoiceXML on Tellme?
  12. How does an application maintain state across VoiceXML pages?
  13. Does the voice-browser accept cookies?

Q: What options are there for implementing a voice application?
A: A business has a continuum of solutions for building and deploying a voice application. These solutions range from pure do-it-yourself solutions to fully outsourced solutions.

With a do-it-yourself solution, a business handcrafts each technology layer of the voice application. First, it negotiates a carrier contract, then acquires and deploys the telephony hardware and software in an appropriate facility. Next, it evaluates and purchases the various components of the voice platform: the speech recognition engine, text-to-speech engine, voice application interpreter and so on. Finally, it builds the application plus the interfaces that integrate it with databases and other applications.

A step up from the pure do-it-yourself solution is to outsource the telephony layer while building the voice platform and application integration layers. In this solution, the telephony-outsourcing vendor manages the telephony infrastructure, and provides an environment in which a business runs the other application layers. The business is still responsible for researching, evaluating and assembling its own voice platform in addition to building the application itself.

The next solution, and the one implemented by Tellme, is one step beyond the "hosted telephony" solution. The Tellme solution is to allow businesses to outsource both the telephony and voice platform layers, so that they need only build the application and integrate it with their back-end systems. This allows them to concentrate on building the logic of their applications rather than worrying about the complexities of deploying a robust, scalable voice platform and telephony infrastructure.

The final voice application solution is to outsource all technology layers. With this solution, a business allows the outsourcing provider to manage the telephony infrastructure, provide the voice platform and also specify the integration mechanisms. This solution lets businesses concentrate purely on building their application, but limits the back-end systems with which they can integrate to those specifically supported by the outsourcing vendor. Since many businesses have proprietary back-end systems and custom integration requirements, these solutions are appropriate only for the simplest of applications.

Q: What is the architecture of a voice application on the Tellme Platform?
A: The Tellme voice application architecture integrates powerful voice-recognition technology with the familiar application model of the Web, providing a mechanism to quickly and easily augment existing Web applications with voice-driven interfaces. It is easiest to explain the Tellme voice application architecture by comparing it with the architecture of a Web application.

A Web application is implemented as a series of HTML pages retrieved from a Web server by a browser. The browser's job is to retrieve pages from the Web server over HTTP, and to visually render these pages for the user. User input collected through HTML forms is passed to the server via HTTP requests for processing, and the server generates a response back to the browser. This processing typically involves back-end business logic, legacy system integration and database access.

Tellme has harnessed the simplicity of this well-understood architecture, and applied it to the development of voice applications. The architecture of a voice application is very similar to that of a Web application with the following differences:
  1. The user interface of the application is a voice-driven call-flow instead of a visual Web page.
  2. The interface is represented as a sequence of VoiceXML pages, instead of HTML pages.
  3. The Tellme VoiceXML interpreter plays the role of the Web browser, fetching VoiceXML pages from the Web server and rendering them over the phone as voice and DTMF-driven call flows.
In all other respects, the architectures are the same. They implement a stateless request/response application model. The client interacts with the user to collect input and sends HTTP GET or POST requests to pass the input to the server and present the next "page" of the user interface. Application logic and legacy system integration are implemented on the server through common Web back-end technologies such as CGI, NSAPI, ASP or JSP. In fact, voice and Web applications often use the same back-end infrastructure components. The only difference is that the Web application presents data visually using HTML whereas the voice application presents data audibly using VoiceXML. Finally, even the client-side logic capabilities are the same, with VoiceXML supporting embedded JavaScript, just like HTML.
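
The request/response model described above can be sketched in a few lines of server-side code. This is only an illustration, not Tellme's implementation: it uses Python's WSGI convention as a stand-in for CGI, ASP or JSP, the function names and the "city" parameter are hypothetical, the element names follow the W3C VoiceXML 2.0 specification, and the content type shown is an assumption.

```python
from urllib.parse import parse_qs
from xml.sax.saxutils import escape


def render_vxml(prompt_text):
    """Wrap a prompt in a minimal VoiceXML page (VoiceXML 2.0 element names)."""
    return (
        '<?xml version="1.0"?>\n'
        '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">\n'
        '  <form>\n'
        '    <block>\n'
        f'      <prompt>{escape(prompt_text)}</prompt>\n'
        '    </block>\n'
        '  </form>\n'
        '</vxml>\n'
    )


def app(environ, start_response):
    """Hypothetical WSGI handler: reads the caller's input from the query
    string and returns the next VoiceXML page, just as a Web application
    would return the next HTML page."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    city = params.get("city", ["an unknown city"])[0]
    body = render_vxml(f"You asked about {city}.").encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/xml; charset=utf-8"),  # assumed content type
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The handler keeps no state between requests; everything it needs arrives with the HTTP request, which is what makes the model stateless.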

Q: Why is the Tellme VoiceXML interpreter called a "client"? Doesn't Tellme provide a server-based solution?
A: Technically, yes. The Tellme Platform is composed of a set of servers that manage phone calls and run VoiceXML applications. However, architecturally speaking, the Tellme Platform plays the role of a client speaking to a Web server over the Internet.

One way of looking at this is to think of the Tellme Platform as a mechanism that converts a normal telephone into a type of Web browser that interacts with the user via speech rather than visually. Both the voice browser and Web browser are Web server clients.

Q: Where is the business logic of the application implemented?
A: Just as with a Web application, the business logic of a voice application may be implemented on the client-side (the Tellme Platform), the server-side, or both. Most applications split the application logic across both environments.

Client-side logic runs on the Tellme Platform as part of interpreting the VoiceXML page. It is implemented using a combination of VoiceXML directives and embedded JavaScript. Since it is executed while the call is in progress, it has the ability to directly alter the behavior of the user interface. This makes it ideal for performing UI-related tasks such as validating data, randomizing user prompts (to give the interface a more human feel), and modifying call parameters and behavior on the fly (such as timeout durations).

Server-side logic, on the other hand, runs at the customer site as part of the process of dynamically generating VoiceXML pages based on requests from the client portion of the application. It is implemented using the same technologies used with Web applications. While different applications often use very different server-side implementation technologies, it doesn't matter to the Tellme Platform as long as valid VoiceXML is generated as a result.

Since the server-side logic runs at the customer site, it is ideally situated to access customer databases and integrate with other computer systems. This also provides the opportunity to re-use the existing Web application infrastructure and logic. For example, code modules responsible for accessing databases, enforcing security policies and implementing business rules may often be shared across Web and voice applications. This sharing can dramatically simplify voice application development by focusing efforts on designing and building the voice interface instead of re-inventing the back-end integration mechanisms.

Q: Does Tellme support static VoiceXML pages as well as dynamically generated pages?
A: Yes. The platform makes a standard HTTP request to retrieve the VoiceXML pages to execute. It makes no difference to the platform whether the page is stored statically on the Web server, or whether it is dynamically generated. In fact, most applications employ a combination of the two.

Q: What technologies do customers use to generate the VoiceXML?
A: Any technology used to generate Web pages may be used to generate VoiceXML pages.

Q: Does Tellme host a customer's voice applications?
A: No. The VoiceXML files are completely managed by the customer. The Tellme VoiceXML interpreter retrieves them across the Internet from the customer's Web site.

Q: How can I integrate backend data into my voice application?
A: You can integrate backend data into your voice applications by generating VoiceXML using a server-side framework such as CGI, ASP, or JSP. Use whatever database API the framework supports (for example DBI, ODBC, or OLE DB) to access your backend database, generate VoiceXML on the fly containing that data, and return it to the Tellme VoiceXML interpreter.
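
As a rough sketch of that flow, the following Python example looks up a value in a database and returns it inside a VoiceXML page. Everything specific here is an assumption made for illustration: sqlite3 stands in for whatever database API your framework provides, and the "accounts" table, its columns, and the function names are hypothetical.

```python
import sqlite3
from xml.sax.saxutils import escape


def demo_connection():
    """Create a throwaway in-memory database with one hypothetical account."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance REAL)")
    conn.execute("INSERT INTO accounts VALUES (?, ?)", ("1001", 250.0))
    return conn


def balance_vxml(conn, account_id):
    """Query the backend, then generate a VoiceXML page containing the result."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    if row is None:
        message = "Sorry, I could not find that account."
    else:
        message = f"Your balance is {row[0]:.2f} dollars."
    return (
        '<?xml version="1.0"?>\n'
        '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">\n'
        '  <form>\n'
        '    <block>\n'
        f'      <prompt>{escape(message)}</prompt>\n'
        '    </block>\n'
        '  </form>\n'
        '</vxml>\n'
    )
```

The query uses a bound parameter rather than string concatenation, the same precaution you would take when generating HTML from caller-supplied input.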

Q: Can I personalize my phone application based on user characteristics?
A: Yes. By using the same techniques as in a Web application, a voice application may access user profile data stored in a database to generate personalized VoiceXML pages.

Q: Does Tellme automatically convert my application into voice by screen scraping my Web site?
A: No. Some companies have technology that can "read" a Web page and convert it on the fly into a voice application. Tellme doesn't do this because there are fundamental differences between well-designed Web and speech user interfaces. Programs automatically converting a Web interface to speech produce rudimentary applications which fail to exploit speech's unique strengths and which don't take into account particular customer scenarios.

For example, imagine two different Web pages to be converted to speech. One contains a list of step-by-step driving directions, the other a list of stocks and quotes for a portfolio. Callers asking for driving directions are likely to be calling from their car, and will want the directions read back to them one at a time, as they complete each step. The application should pause between each step and wait for the user's command to proceed. On the other hand, the list of stocks and their prices should be read back at a more rapid pace, automatically proceeding to the next quote after a short pause. Users would find it tedious and annoying if they were forced to indicate each time they are ready to hear the next quote. As you can see, list navigation can be very context sensitive. A program automatically converting a Web page to speech would have no ability to discern the difference between these applications and would produce applications poorly suited to their intended purpose.
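
To make the contrast concrete, here is a hedged sketch of how a hand-written generator might render the two lists differently. The function names, prompt wording, and pause length are all hypothetical; the markup follows the W3C VoiceXML 2.0 specification (a field with an option for the "next" command, and a break between quotes), which may differ from what a given platform version accepts.

```python
from xml.sax.saxutils import escape

VXML_OPEN = ('<?xml version="1.0"?>\n'
             '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">\n')
VXML_CLOSE = '</vxml>\n'


def directions_vxml(steps):
    """One <field> per step: the caller must say "next" before the
    interpreter moves on to the following step."""
    fields = []
    for i, step in enumerate(steps, start=1):
        fields.append(
            f'    <field name="step{i}">\n'
            f'      <prompt>{escape(step)} Say next to continue.</prompt>\n'
            '      <option>next</option>\n'
            '    </field>\n'
        )
    return VXML_OPEN + '  <form>\n' + ''.join(fields) + '  </form>\n' + VXML_CLOSE


def quotes_vxml(quotes, pause="500ms"):
    """One <prompt> for the whole list: quotes are read back to back,
    separated only by a short <break>."""
    spoken = f'<break time="{pause}"/>'.join(
        escape(f"{symbol}, {price:.2f}.") for symbol, price in quotes
    )
    return (VXML_OPEN
            + '  <form>\n    <block>\n'
            + f'      <prompt>{spoken}</prompt>\n'
            + '    </block>\n  </form>\n'
            + VXML_CLOSE)
```

The same list data produces two very different call flows, and the choice between them depends on knowing the caller's situation, which is exactly what an automatic converter lacks.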

Q: Can I re-implement my DTMF-based IVR application using VoiceXML on Tellme?
A: Yes. The Tellme Voice Application Network provides robust support for DTMF and voice applications as part of the VoiceXML standard. Well-designed voice applications can provide support for both simultaneously.
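
For illustration, a menu that accepts either a spoken word or a key press can be generated as follows. This is a sketch under assumptions: the menu and choice elements and the dtmf attribute come from the W3C VoiceXML 2.0 specification, while the function name and the target file names are hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr


def menu_vxml(prompt, choices):
    """Build a VoiceXML menu in which each choice can be spoken or keyed.

    choices is a list of (dtmf_key, spoken_phrase, next_url) tuples.
    """
    lines = [
        '<?xml version="1.0"?>',
        '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">',
        '  <menu>',
        f'    <prompt>{escape(prompt)}</prompt>',
    ]
    for key, phrase, url in choices:
        # quoteattr() escapes the value and adds the surrounding quotes.
        lines.append(
            f'    <choice dtmf={quoteattr(key)} next={quoteattr(url)}>'
            f'{escape(phrase)}</choice>'
        )
    lines += ['  </menu>', '</vxml>']
    return "\n".join(lines) + "\n"


page = menu_vxml(
    "For sales, say sales or press 1. For support, say support or press 2.",
    [("1", "sales", "sales.vxml"), ("2", "support", "support.vxml")],
)
```

An existing touch-tone menu maps onto the dtmf attributes directly, and the spoken phrases can be added alongside without changing the call flow.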

Q: How does an application maintain state across VoiceXML pages?
A: Either HTTP cookies or VoiceXML/JavaScript variables may be used to maintain application state across the execution of VoiceXML pages.

Just like a Web browser, the Tellme Platform allows a Web site to set and retrieve cookies associated with a user's session. A session begins when Tellme answers a call, and ends when the call is finished. Cookies on the Tellme Platform follow all of the same rules as Web cookies with regard to content, expiration periods and security. By default, all cookies are session cookies, and will disappear at the end of the user's call. Persistent cookies may also be created, but require that callers first be identified using their Tellme sign-in numbers and passwords. Please contact Tellme for further information on this capability.
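
On the server side, cookies are set and read with ordinary HTTP headers. The sketch below uses Python's standard http.cookies module; the helper names and the "step" cookie are hypothetical, and the example shows only generic Web cookie handling, not anything specific to the Tellme Platform.

```python
from http.cookies import SimpleCookie


def set_cookie_header(name, value, max_age=None):
    """Build the value of a Set-Cookie response header.

    With max_age=None the cookie carries no expiry, which makes it a
    session cookie; passing a number of seconds adds a Max-Age attribute.
    """
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    if max_age is not None:
        cookie[name]["max-age"] = max_age
    return cookie[name].OutputString()


def read_cookie(cookie_header, name):
    """Return the named value from a Cookie request header, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return cookie[name].value if name in cookie else None
```

A page generator would send the first value in a Set-Cookie header along with the VoiceXML it returns, then call read_cookie on the Cookie header of the next request to recover where the caller was in the dialog.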

VoiceXML application variables and JavaScript variables may also be used to maintain state across VoiceXML pages. Values stored in these variables may be accessed anywhere within the context of the same VoiceXML application.

Q: Does the voice-browser accept cookies?
A: Yes. See the above question on maintaining application state.

Q: What type of language is VoiceXML?
A: VoiceXML is a declarative, XML-based language composed of elements that describe the human-machine interaction provided by a voice response system. This includes:
  • Output of audio files and synthesized speech (text-to-speech).
  • Recognition of spoken and DTMF input.
  • Control of telephony features such as call transfer and disconnect.
  • Direction of the call flow based on user input.
VoiceXML may also embed meta-information, references to other VoiceXML files, and JavaScript code that implements client-side logic.
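
As an illustration of those four capabilities, the sample page below uses one element for each, and a small Python helper lists the element names it contains. The page is a sketch written against the W3C VoiceXML 2.0 specification, not taken from Tellme documentation; the audio file name, the phone number, and the helper name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: audio and synthesized output (<audio>, <prompt>),
# spoken and DTMF input (<option dtmf=...>), call-flow direction (<if>),
# and telephony control (<transfer>, <disconnect>).
SAMPLE_PAGE = """<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <field name="choice">
      <prompt>
        <audio src="welcome.wav">Welcome.</audio>
        Say operator or press 1. Say goodbye or press 2.
      </prompt>
      <option dtmf="1" value="operator">operator</option>
      <option dtmf="2" value="goodbye">goodbye</option>
      <filled>
        <if cond="choice == 'operator'">
          <goto next="#xfer"/>
        <else/>
          <disconnect/>
        </if>
      </filled>
    </field>
  </form>
  <form id="xfer">
    <transfer name="call" dest="tel:+15555550100"/>
    <block><disconnect/></block>
  </form>
</vxml>
"""


def element_names(vxml_text):
    """Parse a VoiceXML page and return its element names without namespaces."""
    root = ET.fromstring(vxml_text)
    return {el.tag.split("}", 1)[-1] for el in root.iter()}


names = element_names(SAMPLE_PAGE)
```

Because the page is plain XML, any standard XML tool can parse, validate, or generate it, which is part of what makes the language declarative.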

Q: How do you handle speakers with foreign accents?
A: Tellme's work on 1-800-555-TELL, which naturally is designed to be usable by the widest range of speakers, has shown that the Microsoft speech recognition engine employed by the Tellme Platform does a very good job with callers with strong foreign accents.
