Sunday, February 19, 2012

Voice Recognition for FM Repeaters

Last year Google pushed version 11 of their Chrome browser, and along with it one really interesting new feature: support for the HTML5 speech input API.

This means that you'll be able to talk to your computer, and Chrome will be able to interpret what you say. This feature has been available for a while on Android devices, so many of you will already be used to it and will welcome it on the desktop.

If you dig around in the source code, you can see how the speech recognition is implemented:

http://src.chromium.org/viewvc/chrome/trunk/src/content/browser/speech/

Audio is collected from the mic, encoded in FLAC format, and then passed via an HTTPS POST to a Google web service, which responds with a JSON object containing the results.

Some Asterisk Telephony enthusiasts have been monkeying with this Google Speech API. This is how I first learned of it.

http://zaf.github.com/asterisk-speech-recog/

Until now, interacting with a repeater has been limited to DTMF commands, for example to query the time.

This API opens up a whole new world: you can craft your own Siri-like repeater system. Just set up a series of IF statements to grep/match the text returned.

You ask "What time is it?" It sees "time", does a time lookup, and speaks the result back.
You ask "Where is KB8ZXE?" It sees "where" and "KB8ZXE", passes a query to APRS.fi, and reports back that he was last 2.1 miles northeast of Green Bay, etc.
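
The keyword matching described above could be sketched roughly like this. The function names classify_request and extract_callsign are hypothetical, not part of the script running on the repeater, and the callsign pattern is only a loose approximation of US callsigns. It also assumes the recognizer returns the callsign as a single token, which may not hold in practice.

```shell
#!/bin/sh
# Hypothetical helper: map the recognized text to an action name.
classify_request() {
    # Lower-case the text so matching is case-insensitive.
    text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$text" in
        *time*)  echo "time" ;;
        *where*) echo "locate" ;;
        *)       echo "unknown" ;;
    esac
}

# Hypothetical helper: print the first token that looks like a US callsign.
# Assumes a grep that supports -o and -E (GNU and BSD grep both do).
extract_callsign() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]' \
        | grep -oE '[AKNW][A-Z]?[0-9][A-Z]{1,3}' | head -n 1
}
```

With these in place, classify_request "$(cat message.txt)" would pick the action, and for a "locate" action extract_callsign would supply the station to look up on APRS.fi.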

I've been experimenting with this on the IRLP node/repeater (147.075 MHz) here in Green Bay. It's really quite trivial to implement. I bet we are the first ham radio repeater to implement voice recognition.



Here is all you really need to get started:
#!/bin/sh
# Step 1: convert the recorded WAV to FLAC at a 16 kHz sample rate using SoX.
echo "1 SoX Sound Exchange - Convert WAV to FLAC at 16000 Hz"
sox message.wav message.flac rate 16k
# Step 2: POST the FLAC audio to the speech service and save the JSON reply in message.ret.
echo "2 Submit to Google Voice Recognition"
wget -q -U "Mozilla/5.0" --post-file message.flac --header="Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-US&client=chromium" > message.ret
# Step 3: strip everything before and after the "utterance" value.
echo "3 SED Extract recognized text"
sed -e 's/.*utterance":"//' -e 's/","confidence.*//' message.ret > message.txt
echo "4 Remove Temporary Files"
rm message.flac
#rm message.ret
echo "5 Show Text"
cat message.txt
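
One caveat with the sed step in the script: if the reply contains no "utterance" field (for example, when nothing was recognized), the substitutions do not match and the raw reply ends up in message.txt. A minimal sketch of a stricter extraction follows. The function name extract_utterance is hypothetical, and the reply layout is assumed from the patterns the script already matches (an "utterance" string followed by a "confidence" value).

```shell
#!/bin/sh
# Hypothetical helper: read a JSON reply on stdin and print the utterance.
# Prints nothing when no "utterance" field is present.
extract_utterance() {
    # -n suppresses default output; the trailing p prints only lines
    # where the substitution matched. If the reply holds several
    # utterances on one line, the greedy .* keeps the last one.
    sed -n 's/.*"utterance":"\([^"]*\)".*/\1/p'
}
```

Used as extract_utterance < message.ret > message.txt, an empty message.txt then signals that nothing was recognized, which the repeater logic can test with [ -s message.txt ].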

1 comment:

Bill Chellis said...

Steve,

This is pretty cool.
I might just have to setup my own repeater now just to play with this.
God knows no one else will around here.

Once again, please know you are not the only ham out there who realizes their ham ticket is a bona fide license to TINKER.

Bill, KB1ROP