Now that we are able to easily create and customise voice commands on the Pi, let's do the reverse and create voice responses. As mentioned in my previous post, there are a lot of voice tools available, but I would like to have an offline alternative capable of working without an internet connection. What's a home automation system if it's crippled because of no internet?
That's why in this post, I will work with both an offline and an online text-to-speech tool, and provide a mechanism to switch between the two should the internet connection be down. I'm using both because, from what I've experienced, the online alternatives just sound better than the offline ones.
Searching for an offline and easy to use text to speech tool, I came across flite. "Flite" is a lightweight version of another text to speech tool called Festival ("flite" = "festival-lite"). It is designed specifically for embedded systems and has specific commands to make it easier to use from the command line.
Flite is available in the repository and will use a mere 384 kB of disk space. I suppose that indeed qualifies as lightweight.
pi@piclock:~ $ sudo apt-get install flite
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  flite
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 234 kB of archives.
After this operation, 384 kB of additional disk space will be used.
Get:1 http://mirrordirector.raspbian.org/raspbian/ jessie/main flite armhf 1.4-release-12 [234 kB]
Fetched 234 kB in 0s (395 kB/s)
Selecting previously unselected package flite.
(Reading database ... 119163 files and directories currently installed.)
Preparing to unpack .../flite_1.4-release-12_armhf.deb ...
Unpacking flite (1.4-release-12) ...
Processing triggers for man-db (188.8.131.52-5) ...
Processing triggers for install-info (5.2.0.dfsg.1-6) ...
Setting up flite (1.4-release-12) ...
Different voices are installed by default. You can list them as follows:
pi@piclock:~ $ flite -lv
Voices available: kal awb_time kal16 awb rms slt
To use a certain voice, use the "-voice" option when launching flite. For example:
pi@piclock:~ $ flite -voice slt -t "Hello, is it me you're looking for?"
If you can't find a voice you like, additional voices are available for download on the flite website: Flite English Synthesis Demo
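If you would rather call flite from Python than from the shell, the "-voice" and "-t" options shown above can be wrapped in a small helper. This is only a minimal sketch: the function name `flite_command` and the `VOICES` list are my own illustration (the list simply mirrors the `flite -lv` output above), not part of flite itself.

```python
import shlex

# Voice names as listed by `flite -lv` earlier in this post.
VOICES = ["kal", "awb_time", "kal16", "awb", "rms", "slt"]


def flite_command(text, voice="slt"):
    """Return the argument list for speaking `text` with flite (hypothetical helper)."""
    if voice not in VOICES:
        raise ValueError("unknown flite voice: " + voice)
    # Passing a list avoids any shell quoting problems with the message.
    return ["flite", "-voice", voice, "-t", text]


if __name__ == "__main__":
    cmd = flite_command("Hello, is it me you're looking for?")
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(part) for part in cmd))
    # To actually speak the message, pass the list to subprocess.call(cmd).
```

Building the command as a list rather than a single string means apostrophes and quotes in the message do not need to be escaped by hand.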
Nothing needs to be installed for the speech synthesis itself, as it is processed online, but a tool is required to play the received audio file.
With the preinstalled "omxplayer", the audio seemed to be cut off and the program did not stop after playing the file. So instead, I installed "mplayer".
pi@piclock:~ $ sudo apt-get install mplayer
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'mplayer2' instead of 'mplayer'
The following extra packages will be installed:
  liba52-0.7.4 libbs2b0 liblircclient0 liblua5.2-0 libpostproc52 libquvi-scripts libquvi7
Suggested packages:
  lirc
The following NEW packages will be installed:
  liba52-0.7.4 libbs2b0 liblircclient0 liblua5.2-0 libpostproc52 libquvi-scripts libquvi7 mplayer2
0 upgraded, 8 newly installed, 0 to remove and 0 not upgraded.
Need to get 1,042 kB of archives.
After this operation, 2,711 kB of additional disk space will be used.
Do you want to continue? [Y/n]
For the text to speech side of things, I'm making use of the Google Translate TTS API. It's possible to pass a string to the API in the form of a URL, which will return an mp3 file containing the spoken version.
Clicking the link below should play out some audio:
By integrating this URL in a script and making the query variable, custom responses can be generated on the fly.
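To illustrate what a variable query looks like, here is a minimal sketch that builds the URL from a message. The endpoint and the `ie`, `tl`, `client` and `q` parameters are the ones used in my script further down; the function name `tts_url` and its `language` argument are just for illustration. Keep in mind that this is not an officially documented API, so it may change or stop working.

```python
from urllib.parse import urlencode

# Endpoint as used in the script later in this post.
TTS_ENDPOINT = "http://translate.google.com/translate_tts"


def tts_url(text, language="en"):
    """Build the TTS request URL for `text` (hypothetical helper)."""
    params = [
        ("ie", "UTF-8"),     # input encoding
        ("tl", language),    # target language
        ("client", "tw-ob"),
        ("q", text),         # the message to speak
    ]
    # urlencode escapes spaces and characters such as "&" in the message.
    return TTS_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(tts_url("Hello world"))
```

Encoding the query this way keeps messages containing spaces, ampersands or question marks from breaking the URL.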
In order to be able to use voice control at all times, even without an active internet connection, both solutions can be implemented and combined in order to have the code switch between them.
I wrote a script that takes the desired message as an argument. The script first checks connectivity by pinging Google. If the ping is successful, it uses the Google Translate TTS; otherwise, it falls back to "flite".
#!/usr/bin/env python
import subprocess
from sys import argv

try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2

# The message to speak: all command-line arguments joined into one string.
response = " ".join(argv[1:])

def check_internet():
    # Send a single ping with a 1 second timeout; exit code 0 means reachable.
    host = "google.com"
    return subprocess.call(["ping", "-W", "1", "-c", "1", host])

def offline_response():
    # Speak the message locally with flite.
    subprocess.call(["flite", "-voice", "slt", "-t", response])

def online_response():
    # Download the spoken message as an mp3 and play it with mplayer.
    url = "http://translate.google.com/translate_tts?ie=UTF-8&tl=en&client=tw-ob&q=" + quote(response)
    agent = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:46.0) Gecko/20100101 Firefox/46.0"
    recording = "/tmp/recording.mp3"
    if subprocess.call(["wget", "-U", agent, "-O", recording, url]) == 0:
        subprocess.call(["mplayer", recording])

def main():
    if check_internet() == 0:
        online_response()
    else:
        offline_response()

main()
OK, for this post's demo, I'm calling the response script defined in the previous paragraph and having it repeat the incoming speech registered via PocketSphinx, as installed in my previous post.
The first part of the video demonstrates the offline TTS by temporarily setting the ping host to a dummy value ("google.coma"), simulating a lost internet connection. In the second part, the ping host is valid, and the script uses the online TTS. You can see the audio file being downloaded on the fly.
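For the demo I edited the ping host by hand, but the same test could be done without touching the script if the host were a parameter. The following is a rough sketch of that idea, not the code used in the video: the names `is_online` and `pick_engine`, and the injectable `run` argument (there only so the check can be tried without a network), are my own assumptions.

```python
import subprocess


def is_online(host="google.com", run=subprocess.call):
    """Return True if a single ping to `host` succeeds within one second."""
    try:
        exit_code = run(["ping", "-W", "1", "-c", "1", host],
                        stdout=subprocess.DEVNULL,
                        stderr=subprocess.DEVNULL)
    except OSError:
        # ping itself could not be started; treat this as being offline.
        return False
    return exit_code == 0


def pick_engine(host="google.com", run=subprocess.call):
    """Return "online" or "offline" depending on whether `host` answers."""
    return "online" if is_online(host, run) else "offline"


if __name__ == "__main__":
    # A dummy host such as "google.coma" is expected to select the offline engine.
    print(pick_engine("google.coma"))
```

With something like this, simulating a lost connection becomes a matter of passing a different host instead of editing the script.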
I hope you've enjoyed this post!