Voice-based service technologies have been evolving for decades, ever since the primitive voice response systems of the 1980s. Underpinning this evolution has been a steady move away from robotic, impersonal-sounding dialogs and towards more human engagement and a conversational experience.

As a result, the bar for user expectations continues to rise. Soleo’s Voice Services Platform (VSP) employs current state-of-the-art technologies in a framework that allows network and speech technology improvements to be integrated as they become available. Soleo’s long-standing expertise in in-network, carrier-class IVR systems provides a solid foundation upon which to evolve and extend these technologies.

The role of the VSP is to deliver conversational, “concierge-like” service interactions to a voice-based user, enabled by Soleo’s AI-driven Discovery Engine. The overall success of the platform is measured by the degree to which it can mimic the conversational flow and dynamics that an intelligent human agent would provide. Just as with human conversation, there are two separate dialogue elements that need to be addressed concurrently for this to succeed: understanding and speaking. Working with the Language Models in the Discovery Engine, the VSP can both understand conversational speech and communicate text-based replies in a natural, conversational manner. For this reason, the VSP must be able to:

  • Interpret – Speech recognition first appeared in the telephone network in the mid-1990s, when Carriers used it to (partially) automate directory assistance calls. Since then the industry has experienced a technology revolution that was almost unimaginable at the time. The VSP incorporates these evolved technologies so that it can probabilistically interpret individual words, producing possible alternative interpretations along with the relative confidence in each. In doing this, the VSP will often attempt to resolve ambiguities with the caller, using replies such as “I think you meant” to confirm an interpretation. The interpretations, along with their relative confidence scores, can then be passed to the Discovery Engine, which selects the correct meaning from the scored interpretations by supplying contextual understanding and deriving the caller’s actual intent.
  • Speak – Natural Language Generation (NLG) has experienced similar advances in technology. Callers now expect to hear “human-sounding” responses, and the VSP is the final stage of this process before responses are returned to the caller. Apart from the manipulation of the text itself, which is done by the Discovery Engine, the VSP supplies the bulk of the humanization, adding voice characteristics such as inflection to improve intelligibility.
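
The Interpret step above can be pictured with a small sketch. It is illustrative only and is not Soleo's implementation: the Hypothesis type, the threshold values, and the function names are all assumptions introduced here. It shows one plausible way a list of scored interpretations could trigger an “I think you meant” confirmation when the best interpretation is weak or too close to the runner-up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Hypothesis:
    """One candidate interpretation of what the caller said (hypothetical type)."""
    text: str
    confidence: float  # assumed to be a score in the range 0.0 to 1.0


# Hypothetical tuning values; a real platform would calibrate these.
CONFIRM_THRESHOLD = 0.80  # below this, always confirm with the caller
MIN_MARGIN = 0.15         # confirm if the runner-up is this close to the top score


def needs_confirmation(hypotheses: List[Hypothesis]) -> bool:
    """Return True when the best hypothesis is weak or nearly tied with the next one."""
    ranked = sorted(hypotheses, key=lambda h: h.confidence, reverse=True)
    top = ranked[0]
    if top.confidence < CONFIRM_THRESHOLD:
        return True
    if len(ranked) > 1 and top.confidence - ranked[1].confidence < MIN_MARGIN:
        return True
    return False


def confirmation_prompt(hypotheses: List[Hypothesis]) -> Optional[str]:
    """Return a prompt to speak to the caller, or None if no confirmation is needed."""
    if not hypotheses:
        return "Sorry, I didn't catch that. Could you say it again?"
    if needs_confirmation(hypotheses):
        top = max(hypotheses, key=lambda h: h.confidence)
        return f"I think you meant {top.text}. Is that right?"
    return None


if __name__ == "__main__":
    ambiguous = [Hypothesis("Austin", 0.62), Hypothesis("Boston", 0.58)]
    print(confirmation_prompt(ambiguous))
```

In this sketch the full scored list is kept intact, so whether or not the caller is asked to confirm, the same list could still be handed on for contextual selection of the intended meaning.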

The introduction of Speech Synthesis Markup Language (SSML), which became a W3C Recommendation in 2004 and was updated to version 1.1 in 2010, provided for much richer and more intelligent manipulation of speech by entities such as the Discovery Engine. For example, the Discovery Engine may now flag text with “Say as Address”, telling the synthesizer to speak “St.” as “Street” and not “Saint”, and to speak numbers in a manner consistent with address constructs.
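
The “Say as Address” flag described above corresponds to SSML's say-as element. The following is a minimal sketch, not the Discovery Engine's actual output: the function name is hypothetical, and the interpret-as value "address" is an assumption, since the SSML Recommendation does not fix the set of interpret-as values and support varies by synthesizer.

```python
from xml.sax.saxutils import escape


def say_as_address(address: str, lang: str = "en-US") -> str:
    """Wrap an address in SSML so a synthesizer can read it as an address.

    The "address" value for interpret-as is an assumption; check which
    values the target synthesizer actually supports.
    """
    return (
        '<speak version="1.1" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">'
        # escape() protects characters such as "&" that are special in XML
        f'<say-as interpret-as="address">{escape(address)}</say-as>'
        "</speak>"
    )


if __name__ == "__main__":
    print(say_as_address("123 Main St."))
```

How “St.” is actually expanded remains up to the synthesizer; the markup only tells it which kind of content it is reading.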

Going forward, Soleo’s VSP and Discovery Engine together will have the capacity to offer advanced Natural Language capabilities that allow real-time, context-aware adjustment of dialogues. The growth in these capabilities will move us further towards the ultimate goal of providing interactions that are indistinguishable from human ones.


Session Initiation Protocol

The Voice Services Platform functions as a communications gateway for all voice traffic, whether as part of a Soleo Hosted site or elsewhere. As such, it needs to be capable of communicating with a variety of different entities that connect through both packet and switched networks. Some of these entities are “user facing” in the sense that they utilize the VSP primarily as a query/response gateway. In the case of Intercept (Number Not Found) traffic, or inbound traffic from Directory Assistance, the platform may be taking traffic directly from network switches.

Conversely, the VSP also connects dynamically to “cloud” resources, including the Discovery Engine, to process language recognition and generation, and to external content repositories such as listing directories and ad servers.

When a client voice caller initiates a dialogue, the Voice Services Platform orchestrates the connectivity, resources, and facilities necessary to ensure the most relevant, natural, and conversational dialogue that the technology can provide, in the most cost-effective way possible.

Positioned for Evolution

Soleo’s ancestry in large-scale telephony, providing network-based automated Directory Assistance and hosted call transfer solutions to carriers, has resulted in a deep understanding, acquired over decades, of the nuances and idiosyncrasies of providing effective voice services in demanding environments.