Difference between revisions of "Speech Recognition"

From RobotinoWiki
Latest revision as of 15:32, 22 February 2011

Introduction

Robotino can now understand basic spoken commands. This feature is integrated in the v2.4 OS for both the 1GB and 4GB CF cards. The setup is preconfigured for a Logitech ClearChat PC Wireless Stereo Headset. See below for how to adapt the setup to your audio device.

{{#ev:youtube|VNE9QJsZv-s}}

Package links

CF card images

Julius (http://julius.sourceforge.jp/en_index.php)

Speech recognition

Robotino has already been trained for one voice, but this may not work for everyone. If it does not work for you, a new acoustic model has to be created for your voice. There are two ways to do this:

  1. Recommended: we create the acoustic model and only need your training data -> Recording data for speech recognition training
  2. Creating the acoustic model yourself

Setup audio devices

When the Logitech ClearChat headset is plugged into Robotino's USB port, the following udev rule in "/etc/udev/rules.d/99-robotinosr.rules" matches:

SUBSYSTEMS=="usb", KERNEL=="hiddev[0-9]*", SYSFS{idVendor}=="046d", SYSFS{idProduct}=="0a12", RUN+="/usr/local/OpenRobotinoAPI/1/daemons/srd.sh start", SYMLINK+="headset"

As a result, udev runs "/usr/local/OpenRobotinoAPI/1/daemons/srd.sh start".

To use a different audio device, copy "/etc/udev/rules.d/99-robotinosr.rules" to "/etc/udev/rules.d/99-myheadset.rules" and modify the new rule to fit your hardware. The udev guide at http://wiki.ubuntuusers.de/UDEV explains how to do this.
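A minimal sketch of what the modified rule could look like, assuming your device only differs from the preconfigured headset in its USB vendor and product IDs (the "ID vvvv:pppp" column printed by lsusb). The helper name make_headset_rule is made up for this illustration, and the KERNEL match is copied unchanged from the shipped rule, so it may need adjusting for devices that do not register a hiddev node. Newer udev versions use ATTRS{...} in place of SYSFS{...}.

```shell
# Hypothetical helper: print a udev rule for a USB audio device.
# Argument: the ID pair as shown by lsusb, e.g. "046d:0a12".
make_headset_rule() {
    ids="$1"
    vendor="${ids%%:*}"     # part before the colon
    product="${ids##*:}"    # part after the colon
    printf 'SUBSYSTEMS=="usb", KERNEL=="hiddev[0-9]*", SYSFS{idVendor}=="%s", SYSFS{idProduct}=="%s", RUN+="/usr/local/OpenRobotinoAPI/1/daemons/srd.sh start", SYMLINK+="headset"\n' "$vendor" "$product"
}

# With the IDs of the Logitech ClearChat this reproduces the shipped rule.
make_headset_rule 046d:0a12
```

On Robotino the output would be redirected into "/etc/udev/rules.d/99-myheadset.rules" after replacing the IDs with those of your own device.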

"/usr/local/OpenRobotinoAPI/1/daemons/srd.sh" uses "/usr/local/OpenRobotinoAPI/1/daemons/configure_alsa.sh" to write a valid ALSA configuration to "/root/.asoundrc". The audio interface from which the card and device numbers are taken is named in "/etc/robotino/sr/devicename". Modify this file to match the output of "aplay -l".
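To illustrate how a device name relates to card and device numbers, here is a small sketch that looks up a name in "aplay -l" style output. It is not the code of configure_alsa.sh, whose internals are not shown on this page. The function name find_card_device and the sample output line are assumptions made for the example.

```shell
# Hypothetical helper: read `aplay -l` output on stdin and print
# "<card> <device>" for the first line that contains the given name.
find_card_device() {
    grep -F "$1" \
        | sed -n 's/^card \([0-9][0-9]*\):.*, device \([0-9][0-9]*\):.*/\1 \2/p' \
        | head -n 1
}

# Illustrative input in the format printed by `aplay -l`.
find_card_device "Headset" <<'EOF'
card 0: Intel [HDA Intel], device 0: Analog [Analog]
card 1: Headset [Logitech Wireless Headset], device 0: USB Audio [USB Audio]
EOF
```

On Robotino the real output can be piped in instead, for example: aplay -l | find_card_device "Headset".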

Test the setup by running "/usr/local/OpenRobotinoAPI/1/daemons/configure_alsa.sh" and checking that "/root/.asoundrc" looks reasonable. Then run "/usr/local/OpenRobotinoAPI/1/daemons/srd.sh start". If this works, test your udev rule by detaching and reattaching your USB audio device and checking that srd comes up.
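For comparison, a typical hand-written ALSA configuration that makes one card and device the default looks like the output of the sketch below. The file actually written by configure_alsa.sh may be laid out differently; the function name print_asoundrc and the card and device numbers are only placeholders.

```shell
# Hypothetical helper: print a minimal ALSA configuration that
# selects the given card and device as the default PCM and mixer.
print_asoundrc() {
    card="$1"
    device="$2"
    cat <<EOF
pcm.!default {
    type hw
    card $card
    device $device
}
ctl.!default {
    type hw
    card $card
}
EOF
}

# Example: card 1, device 0 (placeholder numbers).
print_asoundrc 1 0
```

If "/root/.asoundrc" refers to a card number that "aplay -l" does not list, the device name in "/etc/robotino/sr/devicename" most likely does not match your hardware.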

Commands recognized by Robotino

Robotino can currently recognize the following commands:

ROBOTINO STOP

ROBOTINO ROTATE *NUMBER* DEGREES
ROBOTINO ROTATE *NUMBER* *NUMBER* DEGREES
ROBOTINO ROTATE *NUMBER* *NUMBER* *NUMBER* DEGREES

ROBOTINO ROTATE MINUS *NUMBER* DEGREES
ROBOTINO ROTATE MINUS *NUMBER* *NUMBER* DEGREES
ROBOTINO ROTATE MINUS *NUMBER* *NUMBER* *NUMBER* DEGREES

ROBOTINO MOVE FORWARD *NUMBER* METERS
ROBOTINO MOVE FORWARD *NUMBER* *NUMBER* METERS
ROBOTINO MOVE FORWARD *NUMBER* *NUMBER* *NUMBER* METERS
ROBOTINO MOVE FORWARD *NUMBER* CENTIMETERS
ROBOTINO MOVE FORWARD *NUMBER* *NUMBER* CENTIMETERS
ROBOTINO MOVE FORWARD *NUMBER* *NUMBER* *NUMBER* CENTIMETERS

ROBOTINO MOVE BACKWARD *NUMBER* METERS
ROBOTINO MOVE BACKWARD *NUMBER* *NUMBER* METERS
ROBOTINO MOVE BACKWARD *NUMBER* *NUMBER* *NUMBER* METERS
ROBOTINO MOVE BACKWARD *NUMBER* CENTIMETERS
ROBOTINO MOVE BACKWARD *NUMBER* *NUMBER* CENTIMETERS
ROBOTINO MOVE BACKWARD *NUMBER* *NUMBER* *NUMBER* CENTIMETERS

ROBOTINO MOVE LEFT *NUMBER* METERS
ROBOTINO MOVE LEFT *NUMBER* *NUMBER* METERS
ROBOTINO MOVE LEFT *NUMBER* *NUMBER* *NUMBER* METERS
ROBOTINO MOVE LEFT *NUMBER* CENTIMETERS
ROBOTINO MOVE LEFT *NUMBER* *NUMBER* CENTIMETERS
ROBOTINO MOVE LEFT *NUMBER* *NUMBER* *NUMBER* CENTIMETERS

ROBOTINO MOVE RIGHT *NUMBER* METERS
ROBOTINO MOVE RIGHT *NUMBER* *NUMBER* METERS
ROBOTINO MOVE RIGHT *NUMBER* *NUMBER* *NUMBER* METERS
ROBOTINO MOVE RIGHT *NUMBER* CENTIMETERS
ROBOTINO MOVE RIGHT *NUMBER* *NUMBER* CENTIMETERS
ROBOTINO MOVE RIGHT *NUMBER* *NUMBER* *NUMBER* CENTIMETERS

*NUMBER* = ZERO, ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE
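The command list above can be condensed into a single pattern. The following sketch checks a phrase against that list with an extended regular expression. It is derived only from the forms listed on this page, not from the grammar files used by Julius, and the function name is_valid_command is made up for the illustration.

```shell
# One number word, and a sequence of one to three number words.
num='(ZERO|ONE|TWO|THREE|FOUR|FIVE|SIX|SEVEN|EIGHT|NINE)'
nums="$num( $num){0,2}"

# STOP, ROTATE [MINUS] <numbers> DEGREES, or MOVE <direction> <numbers> <unit>.
pattern="^ROBOTINO (STOP|ROTATE (MINUS )?$nums DEGREES|MOVE (FORWARD|BACKWARD|LEFT|RIGHT) $nums (METERS|CENTIMETERS))\$"

# Hypothetical helper: succeed if the phrase matches one of the listed forms.
is_valid_command() {
    printf '%s\n' "$1" | grep -Eq "$pattern"
}

is_valid_command "ROBOTINO ROTATE MINUS NINE ZERO DEGREES" && echo "accepted"
is_valid_command "ROBOTINO MOVE UP ONE METERS" || echo "rejected"
```

Because the list only contains the plural METERS, this sketch rejects a phrase ending in METER. Whether the real grammar also accepts the singular form is not stated here.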

Examples:

ROBOTINO MOVE FORWARD ONE METER
ROBOTINO ROTATE ONE EIGHT ZERO DEGREES
ROBOTINO MOVE BACKWARD FOUR FIVE CENTIMETERS
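
The examples suggest that the spoken number words are read as the digits of one value, so that ONE EIGHT ZERO means 180. Assuming this interpretation, the conversion can be sketched as follows. The function name words_to_number is hypothetical and this is not the code running on Robotino.

```shell
# Hypothetical helper: turn a sequence of number words into a digit string.
words_to_number() {
    result=""
    for word in "$@"; do
        case "$word" in
            ZERO)  digit=0 ;;
            ONE)   digit=1 ;;
            TWO)   digit=2 ;;
            THREE) digit=3 ;;
            FOUR)  digit=4 ;;
            FIVE)  digit=5 ;;
            SIX)   digit=6 ;;
            SEVEN) digit=7 ;;
            EIGHT) digit=8 ;;
            NINE)  digit=9 ;;
            *) echo "unknown number word: $word" >&2; return 1 ;;
        esac
        result="$result$digit"   # append the digit to the right
    done
    echo "$result"
}

words_to_number ONE EIGHT ZERO   # prints 180
words_to_number FOUR FIVE        # prints 45
```

The result is a plain digit string, so a leading ZERO is kept as written.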