rikatherobot-blog
R.I.K.A.
5 posts
A modular A.I. assistant/companion written in C/C++, powered by the Raspberry Pi 3B. This is a project by Afeeq Amiruddin, published under the GNU General Public License v3.0.
rikatherobot-blog · 8 years ago
Telegram Bot and Textpack
Being able to interface with Rika from a mobile phone lets the user remotely control her and obtain information from her. We will be using Telegram, the open-source messaging app, as it has a comprehensive bot API. Her text responses will also use random number generation so that, once again, she seems more human.
Installing tgbot-cpp
'tgbot-cpp' is the C++ library for the Telegram Bot API, provided here. Follow the instructions to compile the library - make sure the files appear in "/usr/local/lib" and "/usr/local/include". To link and use the library in Code::Blocks, we need to:
Go to 'Settings -> Compiler -> Compiler Settings -> Compiler Flags' and check "Have g++ follow the C++11 ISO C++ language standard". This is required for the library (and its dependencies) to work.
Next, go to 'Linker Settings -> Other Linker Options' and add the following flags, each on its own line:
-lTgBot -lpthread -lssl -lcrypto -lboost_system
Then, go to 'Search Directories -> Compiler' and add the directory "/usr/local/include", followed by going to 'Linker' and adding "/usr/local/lib". Whenever you open any of the .cbp files provided, go to 'Project -> Build Options -> Linker Settings -> Link Libraries' and make sure that the library "libTgBot.a" is linked; this is the actual library that is linked on a per-project basis.
The full documentation of tgbot-cpp can be found here.
Testing/Using the Telegram Bot
Install Telegram on your mobile phone if you haven't already; a phone number has to be linked to each human user (but is not required for bots). Once you've set it up, initiate a chat with BotFather to get your bot token (explained here). A test program is provided in "/test_files/test_telegram" (use the .cbp file); insert your bot token and initiate a chat session with your bot. Write anything to it and the terminal will print your chat ID - record it, as it will be used to prevent anyone else from secretly using your bot.
Insert your bot token and personal chat ID in the main.cpp file, and you’re good to go.
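For illustration, here is a minimal sketch of a tgbot-cpp bot that only answers its owner, following the library's standard long-polling usage. It is not Rika's actual main.cpp; the token and chat ID values are placeholders you replace with your own.

// Minimal sketch: echo bot that only responds to the owner's chat ID.
#include <tgbot/tgbot.h>
#include <cstdint>
#include <cstdio>
#include <string>

int main() {
    const std::string BOT_TOKEN = "PLACE_YOUR_TOKEN_HERE"; // from BotFather
    const std::int64_t MY_CHAT_ID = 123456789;             // from the test program

    TgBot::Bot bot(BOT_TOKEN);

    bot.getEvents().onAnyMessage([&bot, MY_CHAT_ID](TgBot::Message::Ptr message) {
        // Ignore anyone who isn't the owner.
        if (message->chat->id != MY_CHAT_ID) return;
        bot.getApi().sendMessage(message->chat->id, "You said: " + message->text);
    });

    try {
        TgBot::TgLongPoll longPoll(bot);
        while (true) {
            longPoll.start(); // blocks until an update arrives, then fires the handlers
        }
    } catch (TgBot::TgException& e) {
        std::printf("Telegram error: %s\n", e.what());
    }
    return 0;
}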
Implementation details of the Telegram Bot
The main idea is that the user will use Telegram to interface with Rika while outdoors or away from home. From a design perspective, transitioning from voice mode to text mode can lead to problems:
For example, the user tells Rika via voice mode that he/she is going out for a while, which initiates a 'going out' routine (turn off lights, switch off PC, etc.) and switches Rika to text mode. When the user is coming back, he/she can notify Rika via Telegram, which initiates a 'coming home' routine (turn on lights, etc.). If Rika then switched back to voice mode only, the user couldn't reach her again until he/she was home, which is a nuisance if there's a change of plans. One way or another, the best and simplest option seems to be having both modes running at the same time, so multithreading is used to achieve this. A downside is that any module in rika_action that can be called from both text and voice modes needs to be thread safe - using mutexes and semaphores where needed.
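As a sketch of that idea (function and thread names are hypothetical, not the actual main.cpp), the two modes can each run in their own thread while a mutex guards any shared rika_action call:

// Sketch only: voice and text modes on separate threads, shared actions behind a mutex.
#include <thread>
#include <mutex>

std::mutex action_mutex; // guards rika_action modules callable from both modes

void going_out_routine() {
    std::lock_guard<std::mutex> lock(action_mutex);
    // turn off lights, switch off PC, etc.
}

void coming_home_routine() {
    std::lock_guard<std::mutex> lock(action_mutex);
    // turn on lights, etc.
}

void voice_mode_loop() {
    while (true) {
        // listen for speech, find keywords, call routines...
    }
}

void text_mode_loop() {
    while (true) {
        // poll the Telegram bot, parse commands, call routines...
    }
}

int main() {
    std::thread voice_thread(voice_mode_loop);
    std::thread text_thread(text_mode_loop);
    voice_thread.join();
    text_thread.join();
    return 0;
}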
The Telegram bot code is also placed in main.cpp rather than in rika_text, due to the nature of tgbot-cpp's methods/functions, but it is still wrapped nicely for modularity. Similar to voice mode, simple sample applications are provided to get you going, and all the other methods are covered in tgbot-cpp's full documentation (linked above).
rika_text library addition - get_random_text_dialogue
This library addition contains a function that picks a random line from a specified text file and returns it.
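For illustration, one way such a function could be written is shown below; the actual name and signature in rika_text may differ.

// Sketch of a "pick a random line from a text file" helper.
#include <fstream>
#include <string>
#include <vector>
#include <cstdlib>

std::string get_random_text_dialogue(const std::string& filepath) {
    std::ifstream file(filepath);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(file, line)) {
        if (!line.empty()) lines.push_back(line); // one dialogue per line
    }
    if (lines.empty()) return "";
    return lines[std::rand() % lines.size()];     // assumes srand() was seeded earlier
}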
rika_text library addition - rika_textpack
A folder to store .txt files of text dialogue sets. "rika_textpack_lib.h" is a programmatic reference for each text file. Similar to rika_voicepack, each added .txt file should be referenced there, while you can easily add/change dialogue lines in the text files themselves - each possible dialogue must be on its own line.
Extras
There is a possibility that the Raspberry Pi 3B's Wi-Fi will turn off after minutes of inactivity; see here on how to disable this behaviour.
Ending notes
I believe this is the final input/response module that will be added for the time being. In the future, I would like to bring Rika to life with the ability to play .gifs on screen; these .gifs would show the Rika character (or your own character) doing animations (idle, talking to the user, etc.), but this will come much later.
rikatherobot-blog · 8 years ago
R.I.K.A - Version 1.0
[Embedded YouTube video: R.I.K.A. Version 1.0 demonstration]
Note: the GPIO module shown in the video is not provided in this version and is shown for demonstration purposes only.
Summary of functionalities added:
Playing an audio file, or playing a random audio file in a specified set.
A minimal voicepack, voice acted by Susuri.
Speech recognition.
Keyword finder.
These are basic functionalities; they are mixed and matched with each other in main. The blog posts before this one show the methods and steps needed to run Rika, and the code itself should also be self-documenting. You can download the latest release of version 1 here.
What’s next?
My next goal is to add a way for the user to text Rika and receive notifications from her on their mobile phone; Telegram will most probably be used, as it has an API for bots. I will then focus mostly on OS-level actions (weather API, alarms) before adding a GPIO module near summer 2017 for real-world interaction.
rikatherobot-blog · 8 years ago
Speech Recognition
Speech recognition allows the user to intuitively interact with Rika through verbal commands. We could also use text-based communication, but that will come at a later time. For this, we will be using a microphone and CMU's Pocketsphinx.
Enabling and testing microphone
I am using a Logitech webcam which has a built-in microphone, but this method should work with any USB microphone, as I believe it is hardware agnostic (compared to other methods) - see here. Setting your microphone as 'default' is necessary for my code to work out of the box.
ALSA also has power management enabled; to disable it, invoke:
sudo nano /etc/modprobe.d/alsa-base.conf
And add these two lines:
power_save=0
power_save_controller=N
Installing ALSA developer library
ALSA is the sound driver used for the Raspberry Pi. To interface with it in C/C++, we need to obtain the developer package. On the terminal, invoke:
sudo apt-get install libasound2-dev
To link this library in Code::Blocks, go to 'Settings -> Compiler -> Linker Settings -> Other Linker Options' and type "-lasound" on a new line. You may have already done this for the SDL2 and SDL2_mixer libraries.
Testing ALSA developer library
In "/test_files/test_alsa_capture", there is a test program that captures audio from your microphone for 5 seconds. The program captures at 16000 Hz, mono, signed 16-bit little-endian to conform to Pocketsphinx's requirements. Run it, say some random words, and you can play the sound file produced by going to that directory in the terminal and invoking:
sudo aplay -r 16000 -c 1 -f S16_LE test.wav
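If you want to see how such a capture looks in code, here is a minimal sketch (not the provided test program) that records roughly 5 seconds from the 'default' device in the same 16 kHz / mono / S16_LE format, built with "-lasound":

// Minimal ALSA capture sketch.
#include <alsa/asoundlib.h>
#include <vector>
#include <cstdio>

int main() {
    snd_pcm_t* handle = nullptr;
    if (snd_pcm_open(&handle, "default", SND_PCM_STREAM_CAPTURE, 0) < 0) {
        std::fprintf(stderr, "Cannot open capture device\n");
        return 1;
    }
    // 16 kHz, 1 channel, S16_LE, interleaved, 0.5 s latency - matches Pocketsphinx's needs.
    snd_pcm_set_params(handle, SND_PCM_FORMAT_S16_LE, SND_PCM_ACCESS_RW_INTERLEAVED,
                       1, 16000, 1, 500000);

    const int seconds = 5;
    std::vector<short> buffer(16000 * seconds);
    snd_pcm_sframes_t got = snd_pcm_readi(handle, buffer.data(), buffer.size());
    std::printf("Captured %ld frames\n", (long)got);

    snd_pcm_close(handle);
    return 0;
}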
Installing Pocketsphinx
Pocketsphinx is the smaller version of CMU's Sphinx4, designed for embedded applications. First, ensure that you do not have other sound servers installed on your Raspberry Pi (JACK or PulseAudio), as we want Pocketsphinx to work directly with ALSA - see here (the 'purge' command). To install, follow the tutorial here, but ignore the steps for setting up your microphone (the 'options' steps). As of writing, the latest release of both Pocketsphinx and Sphinxbase is '5prealpha'. To link both of them in Code::Blocks:
First, go to 'Settings -> Compiler -> Linker Settings -> Other Linker Options' and add "-lpocketsphinx -lsphinxbase -lsphinxad" on a new line (all three flags on the same line).
Next, go to 'Search Directories -> Compiler' and add the directories "/usr/local/include/pocketsphinx" and "/usr/local/include/sphinxbase". This is important because we built the libraries from source rather than installing them through "apt-get", so their files end up in a non-default directory. Below are two pictures showing the completed steps (apologies for the quality):
[Screenshots: the completed 'Other Linker Options' and 'Search Directories' settings in Code::Blocks]
Testing Pocketsphinx
Another test program is provided in "/test_files/test_pocketsphinx", which takes an audio file and decodes it. Please read the README, as there is an additional step needed for it to work. You can use the audio file you previously made with "/test_alsa_capture" for this.
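As a rough sketch of what file decoding with the Pocketsphinx C API looks like (the model/dictionary paths below are the usual 5prealpha install locations and may differ on your system; "test.wav" is a placeholder):

// Decode a 16 kHz mono S16_LE WAV file with Pocketsphinx.
#include <pocketsphinx.h>
#include <cstdio>

int main() {
    cmd_ln_t* config = cmd_ln_init(nullptr, ps_args(), TRUE,
        "-hmm",  "/usr/local/share/pocketsphinx/model/en-us/en-us",
        "-lm",   "/usr/local/share/pocketsphinx/model/en-us/en-us.lm.bin",
        "-dict", "/usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict",
        nullptr);
    ps_decoder_t* ps = ps_init(config);

    FILE* fh = std::fopen("test.wav", "rb");
    if (!fh) { std::printf("Cannot open test.wav\n"); return 1; }
    std::fseek(fh, 44, SEEK_SET); // skip the 44-byte WAV header; data must be raw samples

    ps_start_utt(ps);
    short buf[512];
    size_t nread;
    while ((nread = std::fread(buf, sizeof(short), 512, fh)) > 0) {
        ps_process_raw(ps, buf, nread, FALSE, FALSE);
    }
    ps_end_utt(ps);

    const char* hyp = ps_get_hyp(ps, nullptr);
    std::printf("Recognised: %s\n", hyp ? hyp : "(nothing)");

    std::fclose(fh);
    ps_free(ps);
    cmd_ln_free_r(config);
    return 0;
}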
rika_voice library addition - speech_recognition_wrapper
This wrapper internally uses both ALSA capture and Pocketsphinx to decode speech from the microphone. Again, a README is provided to give you additional details.
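One possible shape for such a wrapper is sketched below; the function name is hypothetical and the real interface is described in the README. It assumes a Pocketsphinx decoder and an ALSA capture handle set up as in the two sketches above, and uses Pocketsphinx's voice activity flag to decide when an utterance has ended.

// Hypothetical listen-once function: feed microphone audio to Pocketsphinx
// until speech has started and then stopped, and return the hypothesis.
#include <pocketsphinx.h>
#include <alsa/asoundlib.h>
#include <string>

std::string listen_for_one_utterance(ps_decoder_t* ps, snd_pcm_t* pcm) {
    short buf[512];
    bool utterance_started = false;

    ps_start_utt(ps);
    while (true) {
        snd_pcm_sframes_t n = snd_pcm_readi(pcm, buf, 512);
        if (n < 0) { snd_pcm_recover(pcm, (int)n, 0); continue; } // recover from overruns

        ps_process_raw(ps, buf, n, FALSE, FALSE);

        if (ps_get_in_speech(ps)) {
            utterance_started = true;            // the user has started talking
        } else if (utterance_started) {
            break;                               // silence after speech: utterance done
        }
    }
    ps_end_utt(ps);

    const char* hyp = ps_get_hyp(ps, nullptr);
    return hyp ? std::string(hyp) : std::string();
}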
Extras
The Raspberry Pi has a screen blanking mode to reduce power consumption - this might cause some USB hardware to stop working. To be safe, follow the steps outlined here to remove that feature. Additionally, you can install xscreensaver using:
sudo apt-get install xscreensaver
Access its GUI via 'Start Menu -> Settings/Preferences -> Screensaver' and disable the screensaver using the dropdown menu at the top left. Hopefully this lets Rika keep running without any malfunctions.
Ending notes
Now that we have an input method and an audio response method for rika_voice, we can tie them together in the main function with functions in rika_action; this could be GPIO stuff, scripts, whatever you'd like Rika to do.
Once I've cleaned up the code, I will release the current build as version 1.0, as this is a minimal program for others to first familiarise themselves with Rika.
rikatherobot-blog · 8 years ago
Voicepack and Keyword Finder
rika_voice library addition - rika_voicepack
A folder for Rika's dialogues. "rika_voicepack_lib.h" is a programmatic reference list for the dialogue sets. "rika_voicepack_format.h" is a file to specify the format that you are using. A minimal voicepack is provided, consisting of basic lines voice acted by the talented Susuri.
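To give a rough idea of what a programmatic reference list can look like (this is illustrative only, not the actual contents of "rika_voicepack_lib.h"):

// Illustrative sketch: one constant per dialogue set in /rika_voicepack,
// so the rest of the code can refer to sets by name rather than by path.
#ifndef RIKA_VOICEPACK_LIB_H
#define RIKA_VOICEPACK_LIB_H

const char* const VOICEPACK_GREETING  = "rika_voicepack/greeting";
const char* const VOICEPACK_AFFIRM    = "rika_voicepack/affirm";
const char* const VOICEPACK_GOING_OUT = "rika_voicepack/going_out";

#endif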
rika_extra library addition - find_keywords
A simple function that finds specified keywords in a source string and returns '1' if all keywords are present, or '0' if not. It uses function overloading; you can specify up to 3 keywords to be searched.
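A sketch matching that description is shown below; the actual code in rika_extra may differ.

// find_keywords sketch: overloads for 1, 2 or 3 keywords, returns 1 only if all are present.
#include <string>

int find_keywords(const std::string& source, const std::string& k1) {
    return source.find(k1) != std::string::npos ? 1 : 0;
}

int find_keywords(const std::string& source, const std::string& k1, const std::string& k2) {
    return (find_keywords(source, k1) && find_keywords(source, k2)) ? 1 : 0;
}

int find_keywords(const std::string& source, const std::string& k1, const std::string& k2,
                  const std::string& k3) {
    return (find_keywords(source, k1, k2) && find_keywords(source, k3)) ? 1 : 0;
}

// Example: find_keywords(decoded_speech, "turn", "off", "lights") == 1
// only if all three words appear somewhere in the decoded speech.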
Ending notes
The base of Rika's voice response is set. You only need to add sound files in "/rika_voicepack", populate "rika_voicepack_lib.h" and call PlayRandomDialogue() in the main function. My next goal is to get voice recognition working - this might take a while to test and integrate. I do have ideas, but I first need to learn how it works.
rikatherobot-blog · 9 years ago
Playing Audio Files
Since I want to make Rika seem more human, audio files of voice-acted dialogue will be used for responses instead of, say, text-to-speech. We will also use random number generation to randomly choose a dialogue from a dialogue set, making her responses varied.
Testing audio output
To configure your Raspberry Pi 3B's audio output (3.5mm jack or HDMI), see here. Test the audio with a YouTube video, or play a .wav audio file in your "/home/pi" directory from the terminal using:
aplay <your sound filename>.wav
Installing SDL2 and SDL_mixer v2.0
To install SDL2, see here. To install SDL_mixer v2.0, see here. Follow through until you have linked the library.
Testing SDL_mixer v2.0
Download a test sound file (.wav preferably; .mp3 works too). Rika's file package contains a .cpp test file in "/test_files/test_sdl_mixer". Specify the path to your test sound file and run the program. The test sound file should play, and once it ends, a success message will be printed.
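For reference, a minimal SDL_mixer 2.0 playback program looks roughly like the sketch below (similar in spirit to the provided test file; "test.wav" is a placeholder for your own sound file):

// Minimal SDL_mixer 2.0 playback sketch.
#include <SDL2/SDL.h>
#include <SDL2/SDL_mixer.h>
#include <cstdio>

int main() {
    SDL_Init(SDL_INIT_AUDIO);
    Mix_OpenAudio(44100, MIX_DEFAULT_FORMAT, 2, 2048); // 44.1 kHz, stereo, 2048-byte chunks

    Mix_Chunk* sound = Mix_LoadWAV("test.wav");
    if (!sound) {
        std::printf("Failed to load sound: %s\n", Mix_GetError());
        return 1;
    }

    Mix_PlayChannel(-1, sound, 0);        // play once on the first free channel
    while (Mix_Playing(-1) > 0) {
        SDL_Delay(100);                   // wait until playback finishes
    }
    std::printf("Playback finished successfully.\n");

    Mix_FreeChunk(sound);
    Mix_CloseAudio();
    SDL_Quit();
    return 0;
}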
Random number generation
You might need Linux kernel headers for this. Just connect to the internet and type into the terminal:
sudo apt-get install raspberrypi-kernel-headers
We will use 'time()' as the seed for 'srand()' and get random numbers via 'rand()'. This is sufficient for our randomisation needs.
rika_voice library addition - sdl_mixer_wrapper
A wrapper function to play an audio file, or to play a random audio file from a set (to be used with a voicepack).
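An illustrative sketch of that idea is shown below; the function name and file paths are placeholders, not the actual sdl_mixer_wrapper interface. It simply combines rand() with SDL_mixer playback.

// Hypothetical play-random-dialogue helper: pick a random file from a set and play it.
#include <SDL2/SDL.h>
#include <SDL2/SDL_mixer.h>
#include <cstdlib>
#include <ctime>
#include <string>
#include <vector>

void play_random_dialogue(const std::vector<std::string>& dialogue_set) {
    if (dialogue_set.empty()) return;
    const std::string& file = dialogue_set[std::rand() % dialogue_set.size()];
    Mix_Chunk* chunk = Mix_LoadWAV(file.c_str());
    if (!chunk) return;
    Mix_PlayChannel(-1, chunk, 0);
    while (Mix_Playing(-1) > 0) SDL_Delay(50);
    Mix_FreeChunk(chunk);
}

// Usage (after SDL_Init/Mix_OpenAudio, as in the earlier sketch):
//   std::srand((unsigned)std::time(nullptr));   // seed once at startup
//   play_random_dialogue({"rika_voicepack/greeting_1.wav", "rika_voicepack/greeting_2.wav"});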
Extras
A minor side note: recordings produced using "arecord" are apparently incompatible with SDL, for whatever reason. I've made sure it outputs a .wav file in the correct format, but to no avail. For more info on the Raspberry Pi's sound card commands, invoke:
man arecord or man aplay
The full documentation for SDL_mixer v2.0 can be found here.
Ending notes
If you're looking for a nice pair of active speakers, I recommend the Logitech Z120 laptop speakers (3.5mm + USB). They are USB powered and take audio via the 3.5mm jack; you can power them from the Pi's USB port, which reduces the need for extra power plugs. Some speakers give out a lot of static, especially when we programmatically open the audio device - see here on how to reduce it.