Sunday, February 19, 2012

Voice Recognition for FM Repeaters

Last year Google pushed version 11 of their Chrome browser, and along with it came one really interesting new feature: support for the HTML5 speech input API.

This means that you can talk to your computer, and Chrome will be able to interpret it. This feature has been available for a while on Android devices, so many of you will already be used to it and will welcome the new feature.

If you dig around in the source code, you learn how the speech recognition is implemented:

http://src.chromium.org/viewvc/chrome/trunk/src/content/browser/speech/

Audio is collected from the mic, encoded in FLAC format, and then passed via an HTTPS POST to a Google web service, which responds with a JSON object containing the results.

Some Asterisk Telephony enthusiasts have been monkeying with this Google Speech API. This is how I first learned of it.

http://zaf.github.com/asterisk-speech-recog/

Interacting with a repeater thus far has been limited to DTMF to query the time, etc.

This API opens up a whole new world: you can craft your own Siri-like repeater system. Just set up a series of IF statements to grep/match the text returned.

You ask "What time is it?" It sees "time", does a time lookup, and speaks it back.
You ask "Where is KB8ZXE?" It sees "where" and "KB8ZXE", passes a query to APRS.fi, and reports back that he was last "2.1 miles northeast of Green Bay", etc.
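
The keyword matching described above can be sketched with a shell case statement. This is only an illustration: the function name match_command and the command names it prints ("time", "locate", "unknown") are made up for this example and are not part of IRLP or any other package.

```shell
#!/bin/sh
# Hypothetical helper: map recognized text to a command name.
# The keywords and the command names are illustrative assumptions.
match_command() {
  # Lowercase the text so "Time" and "time" hit the same pattern.
  text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$text" in
    *time*)  echo "time" ;;
    *where*) echo "locate" ;;
    *)       echo "unknown" ;;
  esac
}

match_command "What time is it"   # prints: time
```

The first matching pattern wins, so put the more specific keywords first if two of them can show up in the same sentence.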

I've been experimenting with this on an IRLP node/repeater (147.075 MHz) here in Green Bay. It's really quite trivial to implement. I bet we are the first ham radio repeater to implement voice recognition.



Here is all you really need to get started:
 #!/bin/sh  
 echo "1 SoX Sound Exchange - Convert WAV to FLAC with 16000"   
 sox message.wav message.flac rate 16k  
 echo "2 Submit to Google Voice Recognition"  
 wget -q -U "Mozilla/5.0" --post-file message.flac --header="Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-US&client=chromium" > message.ret   
 echo "3 SED Extract recognized text"   
 cat message.ret | sed 's/.*utterance":"//' | sed 's/","confidence.*//' > message.txt  
 echo "4 Remove Temporary Files"  
 rm message.flac  
 #rm message.ret  
 echo "5 Show Text "  
 cat message.txt  
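
To see what step 3 is doing, here is the same pair of sed expressions run against a sample line. The sample is a made-up fragment shaped only by what the sed patterns expect (an "utterance" field followed by a "confidence" field); it is not a captured response from Google, and the function name extract_utterance is just for this sketch.

```shell
#!/bin/sh
# Same two sed expressions as step 3, wrapped in a function for illustration.
extract_utterance() {
  # 1st sed: drop everything up to and including  utterance":"
  # 2nd sed: drop everything from  ","confidence  to the end of the line
  sed 's/.*utterance":"//' | sed 's/","confidence.*//'
}

# Made-up sample line, NOT a real API response.
sample='{"utterance":"what time is it","confidence":0.9}'
printf '%s\n' "$sample" | extract_utterance   # prints: what time is it
```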

Steve Ford, WB8IMY, picked up on this blog post and published it in the July 2012 issue of QST magazine.


{edit 2014}

This blog entry is over a year old and is meant as a starting place for someone who has some Linux experience.  Since that time the Google speech API has changed a bit.  Google now blocks queries without a server key.

Step 0. Using an existing Google/Gmail account, join the Chrome-Dev Group.
https://console.developers.google.com/project
Step 1. Create a new Project here (e.g. Speech Recognition)
Step 2. Click on your newly created project and choose APIs & auth.
Step 3. Turn ON Speech API by clicking on its Status button.
Step 4. Click on Credentials in APIs & auth and choose Create New Key -> Server key. Leave the IP address restriction blank.
Step 5. Write down your new API key or copy it to the clipboard.





Now for version 2 of the API you submit like so (replace with your API key):


 echo "1 SoX Sound Exchange - Convert WAV to FLAC with 16000"  
 sox message.wav message.flac rate 16k  
 echo "2 Submit to Google Voice Recognition"  
 wget -q -U "Mozilla/5.0" --post-file message.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v2/recognize?lang=en-us&client=chromium&key=AxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxY" > message.ret   
 echo "3 SED Extract recognized text"  
 cat message.ret | sed 's/.*transcript":"//' | awk -F '"}' '{print $1}' | tail -1 > message.txt  
 echo "4 Remove Temporary Files"  
 rm message.flac  
 rm message.ret  
 echo "5 Show Text "  
 cat message.txt  
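
If you would rather not paste the key into the script itself, one option is to build the request URL from variables. This is a sketch only: the function name build_url and the environment variable SPEECH_API_KEY are names invented for this example; the URL is the same v2 one used in step 2.

```shell
#!/bin/sh
# Hypothetical helper: assemble the v2 recognize URL from a language code
# and an API key, so the key can live outside the script.
build_url() {
  lang=$1
  key=$2
  printf 'http://www.google.com/speech-api/v2/recognize?lang=%s&client=chromium&key=%s\n' \
    "$lang" "$key"
}

# SPEECH_API_KEY is an assumed variable name; export it before running.
URL=$(build_url en-us "${SPEECH_API_KEY:-YOUR_KEY_HERE}")
echo "$URL"
```

You would then pass "$URL" to wget in step 2 in place of the hard-coded address.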


I have easily added this code to existing IRLP and Allstar Linux computers.  Both IRLP and Allstar have hooks that catch DTMF strings, which can invoke this application to record your spoken commands and submit them for recognition.  From there you can code keyword triggers in a number of ways.  An easy example is to use grep.

 if grep --quiet time /tmp/message.txt; then  
  /home/irlp/bin/key 
  TIME=`date "+%l:%M %p"`  
  echo "the time is $TIME" | festival --tts   
  /home/irlp/bin/unkey
 fi  
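
For a "where is KB8ZXE" style command you also need to pull the callsign out of the text. Here is a rough sketch, assuming the recognizer returns the callsign as one unbroken word (it may well split it up or spell it out, which this does not handle). The function name extract_callsign and the pattern (one or two letters, a digit, one to three letters) are my own assumptions for illustration, and grep -o is a common extension rather than strict POSIX.

```shell
#!/bin/sh
# Hypothetical helper: print the first word that looks like a callsign
# (1-2 letters, one digit, 1-3 letters). Prints nothing if none is found.
extract_callsign() {
  printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]' |
    grep -owE '[A-Z]{1,2}[0-9][A-Z]{1,3}' | head -n 1
}

CALL=$(extract_callsign "where is kb8zxe")
echo "$CALL"   # prints: KB8ZXE
```

From there the callsign can be handed to whatever lookup you like, such as an APRS.fi query.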


Freely Available STTs:
Google STT
IBM STT
Wit.ai STT
AT&T STT


I highly recommend the "Building a Virtual Assistant for Raspberry Pi" book by Tanay Pant

{Edit 10/2018}
Chris Lam, KM6VGZ  - “Make amateur radio cool again”, said Mr Artificial Intelligence. - A project on building a speech recognition system for amateur radio communication.

Wednesday, February 1, 2012

I have some sad news folks.

It's been a while since I ranted about the ARRL. I was reluctant to renew my ARRL membership, so I let it lapse for a while so I could have some time to ponder if it's worth it.

The decision I came to was, if I renewed, I didn't want the paper QST every month. I usually buy the CD ROM at the end of the year. So I was leaning towards the blind rate, which excludes QST. But that was only like $8.00, so why bother at all?

I ended up renewing as a guy who went blind from reading QST, but I added the QEX subscription. I'll have someone read it to me I guess. I am so sorry, folks!

About the only thing that interested me in QST was Steve Ford's Eclectic Technology column. I have never had an interest in contesting, and a near zero interest in HF.

So what does QST have to offer me? Next to nothing that I can't wait till the end of the year to browse on the CD.

Here is something that always catches my eye on the ARRL website:
Khrystyne Keane's column titled: "ARRL in Action: What have we been up to?"

I always have this hope in the back of my mind that today is going to be the day the good old ARRL gets off its butt and does something out of the ordinary.

It's never really the case. The report by Khrystyne (who I know loves me so much) is really just a rehashing of the mundane crap they did, in case you were asleep.

So I got to thinking maybe there is already a report from the Microwave Band-planning Committee that I missed. So I jump to the meat:

http://www.arrl.org/committee-reports

Nothing yet, but some other things catch my eye from various committee reports:

By the early 1990s, the number of FM repeaters peaked at more than 23,000 according to ARRL Repeater Directory statistics.

The FM expansion came to a sudden halt in the mid-1990s with the proliferation of inexpensive cellular telephone service. FM operators were suddenly handed a communication technology that was not only superior in terms of performance; it was private and came with no restrictions on content. As a result, the amateur FM user base effectively collapsed.

Today, with cellular telephone service dominating the personal communications arena, the vast majority of amateur FM repeater systems see little or no use at most times of the day. Some repeaters have boosted activity somewhat by using EchoLink or IRLP to provide transcontinental or even global linking, but according to reports from repeater coordinators, activity overall remains very low.


These committees are trying to develop arguments and recommend "strategies to defend amateur frequency allocations to the bands between 222-3500 MHz, in light of the skyrocketing demand for mobile wireless broadband spectrum."

It is hard to regain the "cutting edge" part of ham radio that we once had if Part 15 and commercial carriers push the envelope without needing a license.

In my opinion, we don't need more repeaters. That seems to have been observed above. We need more flexibility to use the bands above 2 meters for other things, like building data networks that aren't a joke. There seems to be plenty of under-used space on the 70cm band.

If the ARRL doesn't seek to rewrite the rules or redo the microwave band plans, then please stop asking for our input or trying to appear to be listening.

Keep in mind that sooner or later, if you jokers can't regain that cutting edge, then how do you expect inactive hams to be concerned about band threats by large broadband cellular telcos? These guys will ultimately turn underused spectrum into something useful and cutting edge.

But don't worry about that, concern yourself with that 4 MHz of HF spectrum as you have been. And don't forget to take offense, though what I type is likely the sad truth.

Lastly, to the readers: have you shared your views with the ARRL recently? They cannot operate to the members' liking without feedback and regular communication!

Here is an interesting observation from W9GB on QRZ:
If the 700,000 licensed US amateurs really cared about US spectrum allocations (especially above 30 MHz), then they should support a stronger lobbying voice. The ARRL only has ~200,000 US members -- not even 50% membership of licensed operators.


This is why I did renew. I do care, and realize the current spectrum pressures are enormous. But at the same time, what does this observation say about the ARRL's relevance to a large portion of the hams?