Scientists have created a program that continuously recognizes live human speech with high accuracy, and does so faster than people can. The new algorithm's error rate is lower than that of humans.
Speech recognition has long been an Achilles’ heel of artificial intelligence. The new program could change that: it makes fewer recognition errors than people do.
Following human speech and deciphering it quickly is one of the most difficult tasks for artificial intelligence. During a conversation, people interrupt one another, correct themselves, and fill the pauses between words and phrases with various sounds. All this makes it hard not only for programs but also for people to understand what is being said.
Now, scientists at the Karlsruhe Institute of Technology have created a program that can accurately recognize most of the phrases spoken by humans. The program has already been tested in practice, where it translates university lectures from German or English into the languages spoken by international students.
According to the scientists, a person listening to a live interlocutor has an error rate of about 5.5% per conversation. For the algorithm developed by the researchers, the figure is about 5.0%. Earlier versions of the program suffered from a considerable delay in sound processing, but in the new version the scientists reduced that delay to just one second. This is currently the lowest delay among speech recognition programs.
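The article does not say how these error percentages are measured. Speech recognition results are commonly reported as word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the recognized text into the reference transcript, divided by the number of words in the reference. The sketch below assumes that metric. The function name and the sample sentences are illustrative only and do not come from the researchers' work.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: the percentages in the article are word error rates;
    the article itself does not name the metric.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one wrong word out of five.
wer = word_error_rate("the lecture starts at nine", "the lecture start at nine")
print(f"{wer:.1%}")  # prints 20.0%
```

Under this metric, a rate of 5.0% would correspond to roughly one word error for every twenty words of the reference transcript.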
“Fast and accurate recognition of human speech is an important step for computer processing of live language. It will allow us to improve communication between people and artificial intelligence, make voice translation more accurate and ensure better interaction between people and machines,” says Alex Waibel, one of the authors of the work and a professor of computer science at the Karlsruhe Institute of Technology.