Technical Information
From a technical point of view, OptimTalk Technology is a set of libraries
written in portable C++ that
- interpret VoiceXML 2.0/2.1, CCXML, SRGS, SISR and SSML markup
languages,
- control ASR and TTS engines, typically using the MRCPv2 protocol,
- communicate with telephony infrastructure using a telephony protocol,
typically SIP, or a hardware telephony board,
- control the media layer, such as media recording and playback, media
streaming and mixing, etc., typically implemented by a media server
(audio as well as video supported),
- provide an implementation of the media layer, and
- provide the infrastructure that the libraries listed above need in order
to run.
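The division of responsibilities in the list above can be pictured as a set of abstract interfaces behind which implementations are swapped. The following is a minimal sketch under stated assumptions: every name in it (`SpeechSynthesizer`, `MrcpSynthesizer`, `VendorSynthesizer`, `DialogInterpreter`) is hypothetical and chosen for illustration only, it is not the OptimTalk API, and the two synthesizers are stubs that return strings instead of talking to a real engine.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface for a TTS back end. Illustrative only; NOT the OptimTalk API.
class SpeechSynthesizer {
public:
    virtual ~SpeechSynthesizer() = default;
    // Returns a textual trace of the request instead of producing audio.
    virtual std::string speak(const std::string& prompt) = 0;
};

// Stub standing in for an engine driven over MRCPv2 (no real protocol traffic).
class MrcpSynthesizer : public SpeechSynthesizer {
public:
    std::string speak(const std::string& prompt) override {
        return "mrcp:SPEAK " + prompt;
    }
};

// Stub standing in for an engine driven through a vendor-specific API.
class VendorSynthesizer : public SpeechSynthesizer {
public:
    std::string speak(const std::string& prompt) override {
        return "vendor:say " + prompt;
    }
};

// Hypothetical interpreter that depends only on the abstract interface,
// so the concrete synthesizer can be replaced without touching dialog logic.
class DialogInterpreter {
public:
    explicit DialogInterpreter(std::unique_ptr<SpeechSynthesizer> tts)
        : tts_(std::move(tts)) {}

    void setSynthesizer(std::unique_ptr<SpeechSynthesizer> tts) {
        tts_ = std::move(tts);
    }

    std::string prompt(const std::string& text) { return tts_->speak(text); }

private:
    std::unique_ptr<SpeechSynthesizer> tts_;
};
```

In this sketch, constructing a `DialogInterpreter` with a `MrcpSynthesizer` and later calling `setSynthesizer` with a `VendorSynthesizer` changes which back end handles `prompt`, while the interpreter code stays the same.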
The following table presents the functionality in more detail.
| Functionality |
Description |
| Interpretation of VoiceXML 2.0 and VoiceXML 2.1 |
The VoiceXML 2.0/2.1 interpreter can be easily embedded into an
application. The interpreter includes all the components it needs to
run. However, if some of the components do not meet a customer's needs,
they can be replaced with a new implementation.
If the MRCPv2 protocol is not appropriate, the interpreter can be
integrated with any TTS or ASR engine through the engine's proprietary
API. It can also be integrated with the media processing layer by
providing custom implementations of the media player, media recorder,
media source and media sink components. If the OptimTalk CCXML
Interpreter is not appropriate, the VoiceXML interpreter can be
integrated with another telephony platform. |
| Analysis of grammars according to the Speech Recognition Grammar Specification (SRGS) |
The grammar analyzer
- can be embedded into an application and used for parsing utterances
in text form, or
- can be integrated with third-party ASR systems, since all information
about the grammar rules is accessible through a set of interfaces. (This
capability is demonstrated by the integration of SRGS with Microsoft
Speech API 5.1, which does not support SRGS natively.)
|
| Evaluating semantics in accordance with the Semantic Interpretation for Speech Recognition Specification (SISR) |
The semantic interpreter
- can cooperate with the SRGS implementation, or
- can work as an independent component embedded into custom
applications.
|
| Natural language generation for TTS systems – both with and without Speech Synthesis Markup Language (SSML) support |
The OptimTalk VoiceXML Interpreter can process SSML prompts in the
following ways:
- It can generate SSML documents (SSML itself has to be interpreted by
the TTS system).
- It can simplify the mapping of SSML-based prompts to other markup
languages or APIs, for TTS systems that do not support SSML.
|
| Interpretation of Call Control Markup Language (CCXML) |
The CCXML interpreter can be easily embedded into an application.
It can
- provide telephony call control support for the VoiceXML interpreter,
- provide telephony call control support for a proprietary dialog
control language, or
- be used on its own, without VoiceXML, to build flexible telephony
applications.
If the available telephony layer integrations (e.g. the SIP protocol or
Dialogic cards) are not appropriate for communicating with the telephony
infrastructure, the interpreter can be integrated with any telephony
protocol or telephony board through the corresponding proprietary API.
|
| Reusable general-purpose components |
All OptimTalk Technology parts use a set of reusable components for
performing various common operations. The reusable components help to
promote consistency, cohesion and compactness of the system. They
include:
- an ECMAScript engine and tools for working with ECMAScript scopes,
scripts and values,
- streams for reading data from memory, from files and from web servers
over HTTP(S),
- a component for synchronous and asynchronous resource fetching,
- a component for storing and manipulating structured values of
arbitrary complexity,
- an XML parser and tools for XML manipulation,
- a URI parser and tools for URI manipulation,
- loggers,
- and more.
|
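The SSML row in the table above mentions mapping SSML-based prompts to other markup for TTS systems without SSML support. The following is a rough sketch of what such a mapping could look like for a very small SSML subset. It rests on several assumptions: the target syntax (`[pause N]`, `[emph on]`, `[emph off]`) is invented for illustration and does not belong to any real engine, the function name `mapSsmlToControlTags` is hypothetical, and regular expressions are used only for brevity, whereas a real implementation would work on a parsed XML tree.

```cpp
#include <regex>
#include <string>

// Maps a tiny SSML subset to an invented, engine-specific control syntax.
// Illustrative only: handles <break time="...ms"/> and <emphasis>, and
// drops every other tag (including the <speak> wrapper).
std::string mapSsmlToControlTags(const std::string& ssml) {
    std::string out = ssml;

    // <break time="500ms"/>  ->  [pause 500]
    out = std::regex_replace(
        out, std::regex(R"re(<break\s+time="(\d+)ms"\s*/>)re"), "[pause $1]");

    // <emphasis ...>text</emphasis>  ->  [emph on]text[emph off]
    out = std::regex_replace(out, std::regex(R"re(<emphasis[^>]*>)re"), "[emph on]");
    out = std::regex_replace(out, std::regex(R"re(</emphasis>)re"), "[emph off]");

    // Remove any remaining markup, keeping only text and control tags.
    out = std::regex_replace(out, std::regex(R"re(<[^>]+>)re"), "");
    return out;
}
```

Passing `<speak>Hello <emphasis>world</emphasis>.<break time="500ms"/> Bye.</speak>` to this function yields `Hello [emph on]world[emph off].[pause 500] Bye.`. Break durations given in other units, XML entities and nested prosody settings are deliberately left out of the sketch.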