Verbal Compass
Better speech-based error correction for dictation tools
From: Technology Review - March 2005 - pages 80-81

Context: Extreme multitasking is the modern fad, but no person has enough
hands to manage a cell phone, a digital organizer, a steering wheel, and
coffee all at the same time. Accordingly, people want a hands-free way to
interact with computers. Although speech recognition systems are more
accurate than ever, typical users still spend more time correcting errors
than dictating text; half of their correction time is spent just moving a
cursor to errors identified in, say, a dictated e-mail. Confidence scores, the
software's estimates of how likely it is to have captured the right word,
can be used to identify possible errors. Now Jinjuan Feng and Andrew Sears at
the University of Maryland, Baltimore County, have shown that confidence
scores can also be used to accelerate the correction process.  

Methods and Results: Twelve participants dictated 400-word documents using a
speech recognition system. It interpreted 17 percent of the words
incorrectly, a typical rate; it was the correction process that was atypical.
The software used confidence scores to tag words throughout the text as
navigation anchors. Users could quickly jump to each anchor with short voice
commands and then move a cursor word by word to the error. The researchers
measured the number of navigation commands the participants used, the failure
rates of the navigation commands, and the time spent dictating and
navigating. Average failure rates reported for other techniques are about 5
percent for direction-based navigation (move right) and 10 to 20 percent for
word-based navigation (select December). In a test of Feng and Searss
technique, the failure rate was only 3.2 percent. Even better, the time users
spent navigating to errors was cut by nearly a fifth. That is a significant
gain over other error-correction techniques, and a promising one, because the
work suggests means for further improvement.
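The anchor mechanism described above can be sketched in a few lines of code.
This is a minimal illustration, not the authors' implementation: the
confidence threshold, the word list, and the command names (`next_anchor`,
`move`) are assumptions made for the example.

```python
# Hypothetical sketch of confidence-based anchor navigation.
# Words whose recognition confidence falls below a threshold are tagged as
# navigation anchors; a voice command jumps the cursor to the next anchor,
# and another steps the cursor word by word from there.

CONFIDENCE_THRESHOLD = 0.5  # illustrative value, not from the paper


class Navigator:
    def __init__(self, words, threshold=CONFIDENCE_THRESHOLD):
        # words is a list of (word, confidence) pairs from the recognizer
        self.words = words
        self.anchors = [i for i, (_, conf) in enumerate(words)
                        if conf < threshold]
        self.cursor = -1  # before the first word

    def next_anchor(self):
        """Voice command: jump to the next low-confidence anchor."""
        for i in self.anchors:
            if i > self.cursor:
                self.cursor = i
                return self.words[i][0]
        return None  # no more anchors

    def move(self, steps):
        """Voice command: step word by word (negative steps move left)."""
        self.cursor = max(0, min(len(self.words) - 1, self.cursor + steps))
        return self.words[self.cursor][0]


# A dictated phrase with per-word confidence scores; "inn" and "lobby"
# fall below the threshold and so become anchors.
dictated = [("meet", 0.92), ("me", 0.88), ("inn", 0.35),
            ("the", 0.90), ("lobby", 0.41), ("tomorrow", 0.95)]
nav = Navigator(dictated)
print(nav.next_anchor())  # inn - one jump instead of several word steps
print(nav.next_anchor())  # lobby
```

Jumping straight to a flagged word and then fine-tuning with single-word
steps is what lets this scheme beat pure word-by-word or direction-based
navigation on both speed and failure rate.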

Why it Matters: The Lilliputian buttons on PDAs and other pocket-sized
wonders are quickly shrinking under a constant-sized thumb. Multitasking is
on the rise, and more people with physical disabilities are entering the
workforce. Both trends will steer users away from computer systems with
manual interfaces. Speech recognition, but for its high error rate and long
correction times, is an obvious alternative. 

This work clearly shows that using confidence scores for navigation can
shrink users' correction times. With further improvements, the technique
promises to boost the usability of hands-free error correction and so
engender a surge of new gadgets and applications. 

Source: Feng, J., and A. Sears. 2004. Using confidence scores to improve
hands-free speech based navigation in continuous dictation systems. ACM
Transactions on Computer-Human Interaction 11:329-356. 

From:
http://www2.technologyreview.com/articles/05/03/issue/synopsis_info.asp

Links:
Andrew L. Sears
http://userpages.umbc.edu/~asears/

