
Voice Command Module

Demo

Function introduction

With this module, you can control the Petoi robot by voice to perform various skills, without using wake words. Currently, the module supports 40 fixed voice commands in two languages (English and Chinese) and ten customized commands, for which you can record any sound clip.

Hardware setup

Connect the module to the NyBoard with the wire as shown in the following picture. Plug it into the NyBoard Grove interface that includes D8 and D9:

Software setup

The module code is integrated into the OpenCat project. Uncomment the line #define VOICE in OpenCat.ino, as shown in the figure below, then use the Arduino IDE to upload the sketch to the robot's main board. The robot will then work in Voice mode and respond to voice commands.

Play with the voice commands

Default Usage

1. Switch the language and toggle the audio response

You can say "Bing-Bing" to switch to English or "Di-Di" to switch to Chinese.

Speak "Play sound" to turn on the audio response or "Be quiet" to turn off the audio response.

2. Use the predefined voice commands

You can refer to the list of available voice commands shown below:

See this doc for the latest version.

To avoid inadvertently triggering the robot, for example while you are talking with other people, you can say "Be quiet" to the robot to disable the voice module.

You can say "Play sound" to the robot to enable the voice module.

If the above voice commands don't take effect in English mode, use the mobile app to create a new button with the command "X65,100", or enter "XAd" in the serial monitor, to disable the voice module.

To enable the voice module, use the mobile app to create a new button with the command "X65,99", or enter "XAc" in the serial monitor.
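
The two forms of each command appear to be equivalent because the numbers in the mobile app form are ASCII codes: 65 is 'A', 100 is 'd', and 99 is 'c'. The sketch below illustrates that mapping under this assumption. The helper name asciiFormToText is hypothetical and is not part of the OpenCat code.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of OpenCat): converts the numeric form used in
// a mobile app button, e.g. "X65,100", into the character form typed in the
// serial monitor, e.g. "XAd". Each comma-separated number is assumed to be an
// ASCII code.
std::string asciiFormToText(const std::string &command) {
    if (command.empty()) return "";
    std::string text(1, command[0]);                // keep the leading token, e.g. 'X'
    std::istringstream numbers(command.substr(1));  // the part after the token
    std::string field;
    while (std::getline(numbers, field, ',')) {
        if (field.empty()) continue;                   // skip empty fields
        text += static_cast<char>(std::stoi(field));  // 65 -> 'A', 100 -> 'd'
    }
    return text;
}
```

For instance, asciiFormToText("X65,100") yields "XAd", matching the serial monitor command that disables the voice module.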

The voice command "climb up" is a challenge for you. You can design the behavior by yourself. Then, you can post it on the Petoi Forum Challenge or email support@petoi.com. We may adopt it in our official firmware and send you a gift!

For example, you can share your behavior like this:

How to debug if the voice command doesn't work

In some cases, the voice module may not respond to your voice. Please check the following:

1. On Bittle X, the dial switch on the bottom of the BiBoard extension hat is dialed to Voice, not UART2.

2. Say "Play sound" to check if the robot responds with "Do-Re-Mi." The voice module may have been muted accidentally by "Be quiet."

3. If the module doesn't make any sound with "Play sound," say "Bing-Bing" to switch to English mode. You may try different tones and speeds to say "Bing-Bing." The robot should respond with "switch English" if not set in English mode. It won't say anything if it's already in English.

4. If the voice module still doesn't make any sound, use the mobile app to create a new button with the command "X65,97", or enter "XAa" in the serial monitor. This is equivalent to saying "Bing-Bing" but rules out the chance that your voice is not recognized. Then try saying "Play sound" again, or enable the voice module directly: use the mobile app to create a new button with the command "X65,99", or enter "XAc" in the serial monitor.

5. The above steps validate that the voice module is working. It's powered separately from the motion unit.

6. Next, if you say "Hello," the robot should wave its hand, which validates that the complete reaction loop works. Then, you can try other voice commands.

You can also try power-cycling the mainboard: disconnect the USB data cable and long-press the battery's button to power off, then power the mainboard on again.

If the above steps cannot fix the problem, contact support@petoi.com for help.

Record customized voice commands

When the robot is in English mode, you can say "Start learning" (or send the command "XAe" in the serial monitor) to enter the custom voice command mode and record your voice commands in order.

If the module is not in English mode, you can speak "Bing-Bing" (or send the command "XAa" in the serial monitor) to switch to English mode.

You can record up to 10 voice commands.

To exit the custom voice command mode in the middle, you can speak "Stop learning" (or send the command "XAf" in the serial monitor).

After exiting the custom voice command mode, speak one of the recorded voice commands to trigger the reaction.

Speak "Clear the learning data" to delete all the recordings at once (you cannot delete a specific recording).

Ten skill strings are already defined as custom replies in voice.h. Only the first five produce a visible reaction from the robot, because they are predefined serial commands:

const char voice1[] PROGMEM = "T";                                    //call the last skill data sent by the Skill Composer
const char voice2[] PROGMEM = "kpu1";                                 //single-handed pushup
const char voice3[] PROGMEM = "m0 80 0 -80";                          //move head
const char voice4[] PROGMEM = "kmw";                                  //moonwalk
const char voice5[] PROGMEM = "b14,8,14,8,21,8,21,8,23,8,23,8,21,4";  //twinkle star
const char voice6[] PROGMEM = "6th";
const char voice7[] PROGMEM = "7th";
const char voice8[] PROGMEM = "8th";
const char voice9[] PROGMEM = "9th";
const char voice10[] PROGMEM = "10th";

The response actions ("kpu1" means single-handed pushup, "kmw" means moonwalk) are already defined in the program.

Other serial commands are also supported as responses, such as joint movements (e.g., "m0 80 0 -80" shakes the head left and right) and playing a melody (e.g., "b14,8,14,8,21,8,21,8,23,8,23,8,21,4").

To use these custom replies above, you need to enter the custom voice command mode and record 10 voice commands (such as "single-handed pushup", "shake head", "moonwalk", "twinkle star") first, and then exit the custom voice command mode.

If you have recorded a voice command and the corresponding custom reply is not a predefined serial command (e.g., "6th"), the robot shows no visible reaction; it only prints a simple message on the serial monitor when you speak that voice command.

Advanced usage for developers

Understand the principle

Convert the voice command collected by the microphone in the module into a serial command
Send the serial command to the mainboard MCU through the soft serial port Serial2
After receiving the serial command, the MCU parses it into the corresponding skill command, and finally, the reaction module, according to the skill command, controls the robot to respond accordingly
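
As a rough illustration of the parsing step, the sketch below splits one message from the voice module into its parts. It assumes the layout suggested elsewhere on this page: the prefix "XA", one byte holding the command index, then an optional token followed by its argument. The struct VoiceMessage and the function splitVoiceMessage are hypothetical names, and the actual parsing code in voice.h may differ.

```cpp
#include <string>

// Hypothetical container for the parts of one voice-module message.
struct VoiceMessage {
    bool valid;        // false if the message lacks the "XA" prefix or index byte
    int index;         // command index carried in the third byte
    char token;        // command token, e.g. 'k' for a skill; '\0' if absent
    std::string args;  // remainder after the token, e.g. the skill name "up"
};

// Splits a raw message under the assumed layout "XA" + index byte + token + args.
VoiceMessage splitVoiceMessage(const std::string &raw) {
    VoiceMessage msg{false, 0, '\0', ""};
    if (raw.size() < 3 || raw[0] != 'X' || raw[1] != 'A') return msg;
    msg.valid = true;
    msg.index = static_cast<unsigned char>(raw[2]);  // invisible byte -> number
    if (raw.size() > 3) {
        msg.token = raw[3];        // first character after the index byte
        msg.args = raw.substr(4);  // may be empty
    }
    return msg;
}
```

Under these assumptions, a message made of "XA", a byte with value 21, and "kup" splits into index 21, token 'k', and argument "up".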

Upload the demo sketch testVoiceCommander.ino to see every serial command that is sent to the MCU (including the custom voice commands, if you have recorded any).

You can open the serial monitor to check the raw return values of every voice command.

After you speak a voice command to the robot, the returned value ("XA11" or "XA21kup") is the corresponding serial command sent to the mainboard MCU. The third character ("11" or "21") is actually an invisible character; to make it readable, we convert it to a numeric value before printing it.
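
The conversion described above can be sketched as follows. This is a minimal illustration, not the code in testVoiceCommander.ino, and the function name readableForm is hypothetical. It replaces the third byte of the raw message with its decimal value and leaves the rest untouched.

```cpp
#include <string>

// Hypothetical helper: renders a raw voice-module message in readable form by
// replacing the third byte (the command index) with its decimal value.
std::string readableForm(const std::string &raw) {
    if (raw.size() < 3) return raw;  // too short to contain an index byte
    std::string out = raw.substr(0, 2);                         // prefix, e.g. "XA"
    out += std::to_string(static_cast<unsigned char>(raw[2]));  // byte 21 -> "21"
    out += raw.substr(3);                                       // e.g. "kup"
    return out;
}
```

A raw message consisting of "XA", a byte with value 21, and "kup" is rendered as "XA21kup".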

The test sketch

The test sketch is in the OpenCat repository on GitHub (specific path: OpenCat/ModuleTests/testVoiceCommander). You can visit our GitHub repository https://github.com/PetoiCamp/OpenCat to download the complete code, as shown in the following picture:

Serial interface

There are seven related serial commands for configuration. You can use them in the serial monitor.

After inputting the command above in the message box, press Enter to send the command to the robot.

How to design new reactions

For the robot in Voice mode, to make better use of the custom voice commands, you can change the last five skill strings to skill names that produce actual reactions.

To create a sequence of motions with the task queue, refer to the source code in voice.h, shown below:

if (index < 61) {
    token = raw[3];         //T_SKILL;
    shift = 4;              //3;
}
const char *cmd = raw.c_str() + shift;
tQueue->addTask(token, shift > 0 ? cmd : "", 2000);
char end = cmd[strlen(cmd) - 1];
// After certain commands, queue the "up" skill as a follow-up task
if (!strcmp(cmd, "bk") || !strcmp(cmd, "x") || (end >= 'A' && end <= 'Z') || end == 'x') {
    tQueue->addTask('k', "up");
}

tQueue is the task queue defined in OpenCat.h. By calling the addTask method of this object, the robot can perform several simple skills in sequence as the response to a custom voice command.
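
To show the idea of queuing several tasks for one reaction, here is a self-contained sketch that uses a simplified stand-in for the task queue. MockTaskQueue, Task, and queueDemoReaction are hypothetical names for illustration only; the real class is defined in OpenCat.h. The meaning of the third addTask argument (treated here as a delay in milliseconds) is also an assumption.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for illustration; the real task queue lives in OpenCat.h.
struct Task {
    char token;        // command token, e.g. 'k' for a skill
    std::string args;  // token argument, e.g. the skill name
    int delayMs;       // assumed meaning of the third addTask argument
};

struct MockTaskQueue {
    std::vector<Task> tasks;  // tasks in the order they were added

    void addTask(char token, const std::string &args, int delayMs = 0) {
        tasks.push_back(Task{token, args, delayMs});
    }
};

// Hypothetical reaction: single-handed pushup, moonwalk, then stand up.
void queueDemoReaction(MockTaskQueue &queue) {
    queue.addTask('k', "pu1", 2000);
    queue.addTask('k', "mw", 2000);
    queue.addTask('k', "up");
}
```

In the real firmware, the equivalent calls would go through tQueue->addTask inside voice.h, using the skill names already listed on this page ("pu1", "mw", "up").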

Using the Skill Composer and binding the customized voice command to the new skills

Use the Skill Composer to design new skills and then export them into InstinctX.h
Modify voice.h to bind the customized voice command to the new skills: just insert 'k' + the new skill name into the string variable (e.g., voice1[], if you want to bind the first customized voice command)
```
const char voice1[] PROGMEM = "kskill1";     // "k" is the token for skill, skill1 is the new skill name. 
```