
Building a dual-mode (online/offline) voice assistant: Part 1

I wanted to help a friend who has ALS, so I decided to figure out what can be done with inexpensive parts (and mostly open-source software) to enable a person with limited mobility to "listen", "understand", and "reply" to other people. Everyone's familiar with Stephen Hawking's computer, but it was a group effort and required a lot of funding. We wanted to build something that's simple and inexpensive, and that does not require collaboration with large companies like Intel. I built the audio/NLP computer, and my friend built a special helmet with gyro navigation to control everything.


This work isn't just about limited mobility; it can also be used as a foundation for building your own voice-based assistant, as an alternative to Amazon Alexa or Google Home.

List of components:

  1. Raspberry Pi 3 B+ Motherboard - $38
  2. RASPIAUDIO.COM Audio DAC HAT Sound Card - $30
  3. MakerFocus GPIO Expansion Extension Board - $9

The GPIO expansion board is optional, but it enables you to easily connect multiple devices to your Pi while you're experimenting.

Part 1: enabling audio

We are going to use the RASPIAUDIO.COM Audio DAC HAT Sound Card for capturing and playing audio. It's a great little device, and you can read more about it here. It has a high-sensitivity microphone (good for noisy environments) and a pair of 5 W speakers on board. I found these sufficiently loud (and clear) for both music playback and speech synthesis.


Once the device is connected to your Raspberry Pi, you can install the required libraries by downloading and running the following bash script from the manufacturer's website (it's always recommended to review a script before running it, to understand what it does):

sudo wget -O mic mic.raspiaudio.com
sudo bash mic
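
If you want to follow the advice about reviewing the script first, one way is to split the download from the run. The sketch below is only an illustration: the helper names fetch_script and show_script are made up for this post, and it assumes wget and cat are available, as they are on a stock Raspberry Pi image.

```shell
# Download an installer to a local file without running it.
# The download itself does not need sudo.
fetch_script() {
    url="$1"
    out="$2"
    wget -q -O "$out" "$url"
}

# Print a script with line numbers so it can be read before it is run.
show_script() {
    cat -n "$1"
}

# Usage, once you are happy with what the script does:
#   fetch_script mic.raspiaudio.com mic
#   show_script mic | less
#   sudo bash mic
```

Only the last step needs root, so everything before it can be done as a normal user.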

Reboot. After rebooting, download and run the following bash script to finish the installation:

sudo wget -O test test.raspiaudio.com
sudo bash test

Once the test is running, you can press the yellow onboard button, and you will hear "Front Left, Front Right", followed by the recorded sequence from the microphone.
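
If you would rather check playback and recording by hand, the standard ALSA command-line tools can do something similar. This is a rough sketch, not what the manufacturer's test script does internally (I have not verified that). It assumes the alsa-utils package is installed and that the HAT ended up as the default ALSA device; the function names and the /tmp/mic_check.wav path are just placeholders.

```shell
# List the playback and capture devices that ALSA can see.
list_audio_devices() {
    aplay -l
    arecord -l
}

# Play the spoken channel names once on two channels.
speaker_check() {
    speaker-test -t wav -c 2 -l 1
}

# Record a few seconds from the default capture device, then play it back.
# Arguments: duration in seconds (default 5), output file (placeholder path).
mic_check() {
    seconds="${1:-5}"
    out="${2:-/tmp/mic_check.wav}"
    arecord -d "$seconds" -f cd "$out" && aplay "$out"
}

# Usage:
#   list_audio_devices
#   speaker_check
#   mic_check 5
```

If the HAT is not the default device, both arecord and aplay accept -D to select a device from the list printed by list_audio_devices.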

See what that looks like here: https://youtu.be/uVuwMC7wbVg

You can adjust the volume by running alsamixer from a terminal.
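
alsamixer is interactive, so for scripts it can be handy to use amixer, its non-interactive sibling from the same alsa-utils package. The sketch below is an assumption-laden example: the function names are made up, and the control name you pass in (for example "Master") depends on your sound card, so list the controls first.

```shell
# Show the simple mixer controls the default card exposes.
# Control names vary from card to card.
list_controls() {
    amixer scontrols
}

# Set one control to a percentage between 0 and 100.
# Example: set_volume Master 80   ("Master" is a placeholder name)
set_volume() {
    control="$1"
    percent="$2"
    # Reject anything that is not a whole number before calling amixer.
    case "$percent" in
        ''|*[!0-9]*) echo "percent must be a whole number" >&2; return 1 ;;
    esac
    if [ "$percent" -gt 100 ]; then
        echo "percent must be between 0 and 100" >&2
        return 1
    fi
    amixer sset "$control" "${percent}%"
}
```

To keep the levels across reboots, running sudo alsactl store afterwards should save the current mixer state.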



Now that your audio is working, stay tuned for Part 2, where we will start capturing and processing speech programmatically using Python!