Linux Journal: Voice Recognition Ready for Consumer Devices
Jul 08, 2000, 17:06
By Linley Gwennap
"Until recently, voice recognition required each user to train the system to recognize his or her particular speech patterns. Like most other software, however, voice recognition improves, given faster processors and more memory. Recent products reduce training time dramatically. Speaker-independent software eliminates training entirely. To achieve highly accurate speaker-independent recognition with moderate processing requirements, designers must limit the context and vocabulary of the application. For example, a car needs to recognize only a few dozen words, including 'temperature', 'radio', and the numbers needed to select a station."
"Lernout & Hauspie (http://www.lhsl.com/), a leading supplier of voice software, supplies speech engines for applications as simple as these, as well as far more complex ones. According to Klaus Schleicher, a director of product management at L&H, the simplest speech engine provides speaker-independent recognition of up to 100 words, but requires less than 200K of memory. L&H offers a more-powerful speech engine that can recognize up to 1,000 words, again without training. This engine requires 2MB of memory and can run on a 200MHz processor. This hardware costs a bit more, but is still easily obtainable for $30 today, and that price will drop over time. The larger vocabulary is suitable for applications such as a TV set-top box that can be programmed by speaking the name of a show or a hand-held PDA that can manage calendars and address books via voice."
"Composing arbitrary text, such as an e-mail message, requires a much larger vocabulary. For this purpose, L&H has a speech engine with a 20,000-word vocabulary--twice as large as the average adult's. This engine requires some training, but only about five minutes per user. Even this large vocabulary doesn't require a full-blown PC or server; the company has demonstrated it using a 200MHz StrongARM processor and 32MB of memory. This speech engine could be incorporated into a webpad, allowing users to compose e-mail and other documents without using a keyboard."