Using Speech To Text Processing on Social Media Voice Notes

WhatsApp voice notes are great. When you don’t feel like typing, or just want your friends to hear your voice without having to bother them with a phone call, they’re the go-to.

Unfortunately, voice notes have attracted quite a bit of opposition.

At least within my social network.

The problem with voice notes is that while they’re great for the sender (disclaimer: I love them), they can be downright inconvenient for the recipient.

Ever log onto WhatsApp in the middle of the workday to find a slew of three-minute voice notes waiting for you to listen to them one by one? What about when you’re about to head into a meeting? In these instances, and more, what’s convenient for your friend can be downright annoying for you.

Wouldn’t it be great if WhatsApp would integrate an automatic transcription feature into the service?

It would. But until then you still have options.

Desktop (Cross Platform)

Here’s a solution that works on the desktop. Find the speech-to-text engine that works for you. There are speech recognition tools that run locally on the desktop, but I think it makes more sense to use a cloud app.

1. Download Voice Note

Firstly, you’ll need to download the voice note.

If you’re accessing WhatsApp from the web UI, then all you have to do is click on the downward-facing caret next to the message and then select download:

Downloading a voice note

You’ll receive the download as an .ogg file. WhatsApp voice notes are Opus audio (a lossy codec) in an Ogg container, rather than Ogg Vorbis.
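If a transcription service rejects the file, it can help to know which codec the .ogg actually contains. The following is a minimal sketch, not a full Ogg parser: it only inspects the first page of the file, where Opus streams begin with `OpusHead` and Vorbis streams begin with `\x01vorbis`. The function names and the file name in the usage note are my own, chosen for illustration.

```python
def ogg_codec(first_bytes: bytes) -> str:
    """Guess the codec from the first Ogg page of a file."""
    # An Ogg page header is 27 bytes and starts with the capture pattern "OggS".
    if len(first_bytes) < 27 or first_bytes[:4] != b"OggS":
        return "not-ogg"
    # Byte 26 holds the number of entries in the segment table,
    # which sits between the header and the packet data.
    segment_count = first_bytes[26]
    payload_start = 27 + segment_count
    payload = first_bytes[payload_start:payload_start + 8]
    if payload.startswith(b"OpusHead"):
        return "opus"
    if payload.startswith(b"\x01vorbis"):
        return "vorbis"
    return "unknown"


def codec_of_file(path: str) -> str:
    """Read enough of the file to cover the first page header and packet start."""
    with open(path, "rb") as f:
        return ogg_codec(f.read(512))
```

Calling `codec_of_file("voice-note.ogg")` on a downloaded note (the file name is hypothetical) returns one of `"opus"`, `"vorbis"`, `"unknown"`, or `"not-ogg"`.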

2. Set Up Shop At HappyScribe.com

Happy Scribe offers automatic transcription. It’s a paid service, but there’s a generous free tier to give you an opportunity to see how it works.

I’ve used it to transcribe quite a lot of voice notes and still haven’t reached the end of that tier yet.

Their current pricing is here.

3. Upload and Transcribe Voice Notes

Upload the .ogg file you downloaded and start the transcription. After about a minute, your transcribed voice note is ready to read.

As you can see, the results are imperfect. But they’re more than good enough to get the gist of what’s being said.

Android

First up, I tested out Transcriber for WhatsApp — which I’m currently enjoying early access to.

If you’re transcribing voice notes using an app like this, first press and hold the voice message, then hit share. Once the options populate, choose the transcription app that you’ve installed (search the Play Store for automatic voice transcription; there are quite a few options).

Unfortunately Transcriber for WhatsApp, which was my first choice, deemed my friend’s voice note to have “incomprehensible audio.”

After trying a couple more apps, I next tried out Otter.ai — a popular AI-powered transcription service which is ideal for automatically transcribing meetings and recorded conversations.

Best of all, it can be accessed from a web UI — so conceivably this could be the tool to handle all your WhatsApp voice notes.

I ran my friend’s voice note through Otter — and this time I had success.

Again, the automated transcript differed slightly from what was actually said. But it was more than good enough to use. Otter also plucked out the keywords from the message for me.

Otter’s pricing is here.

Compare and contrast the transcripts:

Otter.Ai:

Happy Scribe:


Verdict

Both Otter and Happy Scribe do a decent job of automatically transcribing WhatsApp voice notes. While both speech-to-text engines currently transcribe imperfectly, both are more than good enough to use, in my opinion.

Because Otter has both a web UI and apps for Android and iOS, it has the edge for me. It’s always easier to use one platform across your devices. And, using just one service, you can keep all your voice message transcriptions in one place.

At the time of writing, Otter is also offering 600 free minutes of automatic transcription per month (max 40 minutes per recording).

So, unless you’re receiving a vast number of voice notes, you could conceivably do all your transcription using it alone.
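To put that allowance in perspective, here is a rough sketch of the arithmetic. It assumes the limits quoted above (600 minutes per month, 40 minutes per recording), which may have changed since the time of writing; the function and constant names are mine and not part of any Otter API.

```python
MONTHLY_FREE_MINUTES = 600        # figure quoted in this post; check current pricing
MAX_MINUTES_PER_RECORDING = 40    # per-recording cap quoted in this post


def fits_free_tier(note_minutes: list[float]) -> tuple[bool, float]:
    """Return (fits, minutes_left) for one month's worth of voice notes."""
    # Any single note over the per-recording cap rules out the free tier.
    too_long = [m for m in note_minutes if m > MAX_MINUTES_PER_RECORDING]
    total = sum(note_minutes)
    fits = not too_long and total <= MONTHLY_FREE_MINUTES
    return fits, MONTHLY_FREE_MINUTES - total


# 150 three-minute voice notes in a month use 450 of the 600 minutes.
print(fits_free_tier([3.0] * 150))
```

At three minutes apiece, the quoted allowance covers 200 voice notes a month.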