Baidu shows AI-based transcribing tool

Posted on Tuesday, Mar 14 2017 @ 15:00 CET by Thomas De Maesschalck
While Baidu isn't well known in the West, there's no doubt this Chinese search engine company employs some very bright minds, including famed AI researcher Andrew Ng. The company has made great advances in AI-based speech recognition, and now Baidu presents SwiftScribe, a tool for quickly and easily transcribing voice recordings.

Baidu claims that by using SwiftScribe, a professional transcriptionist can cut down the average time spent on a project by 40 percent. The Chinese firm explains one of the unique aspects of SwiftScribe is that the system can learn and improve as users make edits on-the-fly.

It's not open to the public yet, but Baidu is inviting 30 to 50 transcriptionists to test-drive the beta version.

Today we are proud to announce the beta launch of Baidu’s first AI-powered transcription software, SwiftScribe. We set out to develop SwiftScribe to fix a pain point – the time-consuming process of manually transcribing word-by-word. Now, through the integration of Baidu’s state of the art speech recognition technology and easy editing tools, SwiftScribe will allow people to quickly and easily transcribe voice recordings, increasing productivity and streamlining workflow.

The core technology powering SwiftScribe is Baidu’s speech recognition engine, Deep Speech 2. Its neural network, which is trained on thousands of hours of labeled audio data, learns to associate sounds with certain words and phrases. In addition to advanced ASR technology, we designed intuitive shortcut keys and innovative human-computer interaction to solve the problem of discontinuity, one of the biggest obstacles users face when transcribing.

Baidu SVAIL has developed every component of SwiftScribe, from the speech recognition system to the user interface. One big advantage of this approach is as users transcribe and make edits, the system can learn and improve along the way. It is the use of this sophisticated end-to-end approach that sets SwiftScribe apart from other competitors on the market.

For professional transcriptionists, SwiftScribe will allow for both higher productivity and returns on projects. It typically takes between four and six hours to transcribe one hour of audio data, and the going rate for transcriptions is somewhere around one dollar per audio minute. Using SwiftScribe, the time a transcriptionist spends on a project is on average cut down by 40 percent.
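
To put those figures in perspective, here is a rough Python sketch of what a 40 percent time saving could mean for earnings per hour worked. The function name is made up for illustration, and the assumption that the per-minute rate stays the same when the work goes faster is ours, not Baidu's.

```python
def effective_hourly_rate(hours_per_audio_hour, rate_per_audio_minute=1.0, time_saved=0.0):
    """Earnings per hour of work when transcribing one hour of audio.

    hours_per_audio_hour: work hours needed per hour of audio (4 to 6 in the quote)
    rate_per_audio_minute: pay per audio minute in dollars (about 1 in the quote)
    time_saved: fraction of work time saved (0.40 is the claimed average)
    """
    pay = rate_per_audio_minute * 60  # pay for one hour of audio
    hours_worked = hours_per_audio_hour * (1.0 - time_saved)
    return pay / hours_worked


# Assumption: the per-minute rate does not change when transcription gets faster.
for hours in (4, 6):
    before = effective_hourly_rate(hours)
    after = effective_hourly_rate(hours, time_saved=0.40)
    print(f"{hours} h per audio hour: ${before:.2f}/h -> ${after:.2f}/h")
```

Under these assumptions, a transcriptionist earning roughly $10 to $15 per hour of work would end up at roughly $16.67 to $25 per hour.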

SwiftScribe was designed for anyone who does transcription regularly – freelancers, transcriptionists working for transcription service companies, and data entry specialists. Because of its wide user base, SwiftScribe has the potential to positively impact a range of industries that benefit from transcription, including medical and healthcare, legal and law enforcement, business, media, and others.

To begin, we will invite between 30-50 transcriptionists to test the beta version. For more information about SwiftScribe or to request an invitation, please visit or send an email to:

About the Author

Thomas De Maesschalck

Thomas has been messing with computers since early childhood and firmly believes the Internet is the best thing since sliced bread. He enjoys playing with new tech, is fascinated by science, and is passionate about financial markets. When not behind a computer, he can be found with running shoes on or lifting heavy weights in the weight room.
