Speech Recognition Goes Mainstream
Speech recognition has finally shaken its reputation as a "handicapped" technology plagued by flagging accuracy rates and a general perception that the products don't work very well.
Long used successfully as assistive technology for people with disabilities, speech recognition is now going mainstream, thanks in large part to brand-name vendors such as IBM and Nuance Communications, both of which recently unveiled enterprise editions of flagship dictation products.
Regulatory pressures, too, are boosting speech-rec solutions as companies in the health-care, legal and law-enforcement arenas use them to comply with legislative requirements.
Indeed, VARs are ideally positioned to deliver on speech recognition's promise to eliminate onerous data-entry tasks and streamline corporate workflow.
The technology "presents a tremendous opportunity" for the channel, says Steve Cramoysan, a research director at Stamford, Conn.-based research firm Gartner.
"There's a continuously growing trend for speech recognition to transition from an enabling technology to a productivity and cost-savings tool," says Michael Moser, CEO of EXAQ Micro Services, a Citrus Heights, Calif.-based reseller of voice-recognition systems. "This seems to be most pronounced in the medical field."
Talk To Me
The speech-recognition market consists of three segments: embedded applications, telephony and dictation.
The first, embedded applications--voice commands that start cars or control household appliances, for example--is garnering a lot of attention but offers the fewest opportunities for VARs at this time.
Telephony, though more promising, presents hurdles to be cleared. But Gartner's Cramoysan says organizations are rolling out customized call-center apps that are dependent on voice recognition.
It's the third, dictation, that is a downright hotbed of activity for VARs. No longer are opportunities limited to one-off purchases of a couple of microphones and some shrink-wrapped software.
"Growth in the sales of our products has just been off the charts," says Matt Revis, director of product management for dictation solutions at Burlington, Mass.-based Nuance, which produces Dragon NaturallySpeaking (DNS) software. "Our business is up almost 60 percent. In health care, figures are up 150 percent."
Now in its eighth version, DNS boasts a 99 percent accuracy rate, meaning that the software recognizes almost every word a user speaks into a PC or handheld device to create documents and manipulate Microsoft Office applications, including Word, Excel, Outlook and PowerPoint.
Armonk, N.Y.-based IBM Software Group now offers Embedded ViaVoice Enterprise edition, which seeks to drive up accuracy rates through features such as N-Best, which serves up a list of similar phrases or words when spoken input is unclear.
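To make the N-Best idea concrete, here is a minimal sketch in Python of how an application might handle a ranked list of candidate phrases. It is not IBM's API: the `Hypothesis` class, the confidence scores and the 0.85 threshold are all hypothetical, chosen only to illustrate the general technique of falling back to alternatives when the top guess is weak.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    confidence: float  # hypothetical score between 0.0 and 1.0


def n_best(hypotheses, n=3):
    """Return the n highest-confidence hypotheses, best first."""
    return sorted(hypotheses, key=lambda h: h.confidence, reverse=True)[:n]


def resolve(hypotheses, threshold=0.85, n=3):
    """Accept the top hypothesis if it clears the threshold.

    Otherwise return no accepted text plus up to n alternatives
    that the user could choose from.
    """
    ranked = n_best(hypotheses, n)
    if ranked and ranked[0].confidence >= threshold:
        return ranked[0].text, []
    return None, [h.text for h in ranked]


if __name__ == "__main__":
    unclear = [
        Hypothesis("recognize speech", 0.62),
        Hypothesis("wreck a nice beach", 0.58),
        Hypothesis("recognized speech", 0.31),
        Hypothesis("wreck on ice beach", 0.10),
    ]
    print(resolve(unclear))
```

In this sketch, clear input is accepted outright, while unclear input yields a short list of alternatives for the user to pick from, which is the behavior the N-Best feature is described as providing.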
Microsoft, too, is dipping its toes into the speech-recognition market. The Redmond, Wash.-based software giant is offering Microsoft Speech Server, or MSS, tools designed to help resellers and developers deliver both telephony- and dictation-related speech applications to eager enterprises.
Despite product improvements, however, IBM and others are leaning heavily on resellers to execute sales.
"VARs contribute a great deal of value as they add the application layer and integration services on top of the speech middleware," says Brian Garr, director of IBM's Enterprise Speech Solutions group.
The solution-provider contribution is especially vital because speech-recognition tools require a great deal of customization.
"There's a large opportunity for individuals interested in adding value to the basic speech-recognition technology that's currently available," says Bill Wade, vice president of St. Louis-based Kaberline Healthcare Informatics, a physician-owned speech reseller that targets medical, veterinary, legal and corporate markets.
Along with harnessing speech technology to aid people struggling with spinal-cord injuries and other mobility impairments, Kaberline has customized speech-recognition products to help medical facilities jump-start the digitization of medical records. "There's less need for box-pushers than there is for people with an eye toward the development of innovative ways to incorporate speech recognition into the expanding workflow of today's business," Wade says.
Blending speech-related access with enterprise applications is another area ripe for VAR participation, says Bob Bova, CEO and president of Vanguard Voice Systems, a Rancho Santa Margarita, Calif.-based reseller of products from IBM and other vendors. To help customers integrate voice capabilities with enterprise applications, Vanguard developed AccuSpeech middleware that works with a variety of voice-engine products.
"The most challenging aspect of deploying speech as an interface technology has been the integration of the speech engine with the application, which requires complete redevelopment if the application or database is changed," Bova says. "VARs can integrate these applications with various hardware solutions specific to their expertise and markets."
Also helping more VARs get into speech-related application work is a series of emerging voice standards, such as VoiceXML and Speech Synthesis Markup Language, or SSML.
"In the past, developing speech-technology applications required special skills and often relied heavily on familiarity with several popular but proprietary development technologies," says Raj Tumuluri, president and CEO of Openstream, a reseller in Somerset, N.J.
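For a sense of what a standards-based approach looks like in practice, the sketch below uses Python's standard library to build a short SSML prompt. The `speak`, `break` and `emphasis` elements and the namespace come from the W3C SSML 1.0 specification; the `build_prompt` function, its wording and the half-second pause are illustrative assumptions, not taken from any vendor's product.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML 1.0 specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_prompt(greeting, question):
    """Build a small SSML document: a greeting, a pause, then an emphasized question.

    Hypothetical helper for illustration only.
    """
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": "en-US"},
    )
    speak.text = greeting
    # <break> inserts a pause; the 500ms value is an arbitrary choice.
    pause = ET.SubElement(speak, "break", {"time": "500ms"})
    pause.tail = " "
    emphasis = ET.SubElement(speak, "emphasis")
    emphasis.text = question
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_prompt("Welcome.", "Which department would you like?"))
```

Because the markup is a published standard rather than a proprietary format, the same prompt could in principle be handed to any speech-synthesis engine that supports SSML.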
But, ultimately, VARs are likely to benefit most from the fact that speech recognition requires a great deal of training and customer hand-holding.
Without solid training, in fact, about half of all users outfitted with speech-recognition tools will fail to use the technology properly, says James Cox, CEO of Varna, Ill.-based VAR Crown International Distributing and author of the book How To Master IBM ViaVoice.
"The training is imperative. Most customers are going to need help--someone standing over their shoulder as they learn these products," Nuance's Revis adds. He says the company relies on more than 300 solution providers to distribute its products. "For a VAR, this is a great business case. They can provide our software, along with their own services. It's a lucrative proposition."
A look at what's driving the speech-recognition market:
- The rise of voice-related standards that make application customization and development easier. Those standards include VoiceXML; Media Resource Control Protocol, or MRCP; Speech Recognition Grammar Specification, or SRGS; and Speech Synthesis Markup Language, or SSML.
- The need for busy, non-IT professionals to create more electronic documents, especially to comply with regulations such as the Health Insurance Portability and Accountability Act, or HIPAA.
- Statutes guaranteeing access to technology for individuals with disabilities. Section 508 of the U.S. Rehabilitation Act, for example, requires federal agencies to make their electronic and information technology accessible to people with disabilities, including employees. Regulations also require public schools to provide some children in special-education classes with access to technology. Other relevant statutes include the Americans with Disabilities Act and workers' compensation laws.