Smartphone speech recognition software can write text messages three times faster than humans can type them, say researchers, adding that the finding could spur the development of innovative speech recognition apps.
Computer scientists from Stanford University, Chinese web services company Baidu and the University of Washington devised an experiment that pitted Baidu’s Deep Speech 2 cloud-based speech recognition software against 32 texters, aged 19 to 32, working the built-in keyboard on an Apple iPhone.
“They grew up texting, so we’re putting speech recognition up against people who are really good at this task,” said James Landay, professor of computer science at Stanford.
The subjects took turns typing or speaking about 100 phrases sourced from a standard library of everyday phrases such as “physics and chemistry are hard,” “have a good weekend” and “go out for some pizza and beer”.
The testing app recorded their times and accuracy rates.
Half the subjects performed the task in English using the QWERTY keyboard; the other half conducted the test in their native Mandarin Chinese, using iOS’ Pinyin keyboard.
For English, speech recognition was three times faster than typing and the error rate was 20.4 per cent lower.
In Mandarin Chinese, speech was 2.8 times faster with an error rate 63.4 per cent lower than typing.
“We knew speech recognition is pretty good, so we expected it to be faster, but we were actually quite surprised to find that it was almost three times faster than typing on a keyboard,” said co-author Sherry Ruan, a computer science PhD student at Stanford.
Although the researchers used Baidu’s speech recognition software, they suspect that other high-accuracy speech engines perform at a similar level.
“We should put speech in more applications than just typing an email or text message,” Landay noted.
“You could imagine an interface where you use speech to start and then it switches to a graphical interface that you can touch and control with your finger,” he added.
The study was published online at arxiv.org.