Microsoft’s Speech Recognition System Achieves New Accuracy Milestone

Microsoft's conversational speech recognition system has reached a 5.1 percent error rate, its lowest so far.

by Rohan Mehta Aug 21, 2017

Microsoft’s conversational speech recognition system, designed to recognise the words in a conversation as accurately as humans do, has reached a 5.1 percent error rate, its lowest so far.

This milestone means that, for the first time, a computer can recognise the words in a conversation as well as a person would.

“Our research team reached that 5.1 percent error rate with our speech recognition system, a new industry milestone, substantially surpassing the accuracy we achieved last year,” Microsoft said in a blog post late on Sunday.

In October last year, the team from Microsoft Artificial Intelligence and Research reported a speech recognition system that makes the same number of errors as professional transcriptionists, or fewer.

At the time, the researchers reported a word error rate (WER) of 5.9 percent.
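For readers unfamiliar with the metric, WER is conventionally the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition only. The function name `word_error_rate` and the sample sentences are made up for illustration, and the code does not reproduce Microsoft's scoring pipeline or the text normalisation that benchmark scoring normally applies.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Illustrative WER: word-level edit distance / reference length.

    Assumes plain whitespace tokenisation and exact word matching;
    real benchmark scoring applies extra normalisation not shown here.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

On this scale, a 5.1 percent WER corresponds to roughly five word errors for every 100 words in the reference transcript.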

“Last year, Microsoft’s speech and dialogue research group announced a milestone in reaching human parity on the ‘Switchboard’ conversational speech recognition task, meaning we had created technology that recognised words in a conversation as well as professional human transcribers,” said Xuedong Huang, Technical Fellow, Microsoft.

‘Switchboard’ is a corpus of recorded telephone conversations that the speech research community has used for more than 20 years to benchmark speech recognition systems.

The task involves transcribing conversations between strangers discussing topics such as sports and politics.

The team used “Microsoft Cognitive Toolkit 2.1” (CNTK), which Microsoft describes as the most scalable deep learning software available, for exploring model architectures.

Additionally, Microsoft’s investment in cloud compute infrastructure, specifically Azure GPUs, helped improve the effectiveness and speed of the work.

Reaching human parity, meaning accuracy on par with that of humans, has been a research goal for the last 25 years.

“Microsoft’s willingness to invest in long-term research is now paying dividends for our customers in products and services such as Cortana, Presentation Translator, and Microsoft Cognitive Services,” the post read.

“Moving from recognising to understanding speech is the next major frontier for speech technology,” the post added.
