Difference between revisions of "Speech Recognition"

From RobotinoWiki
  
 
[[Creating the acoustic model yourself]]
 
 
== Setting up Speech Recognition ==
 
 
Setting up speech recognition has two parts. The first is training on the speech data to create the acoustic model; the second is running the speech recognition engine with that acoustic model.
 
 
=== Creating an Acoustic Model ===
 
 
HTK (version 3.4) is used to create the acoustic model. Follow the steps below.
 
 
==== Step 1 ====
 
[http://htk.eng.cam.ac.uk/register.shtml Register] with HTK first; registration is required before you can download it.
 
==== Step 2 ====
 
Download the sources for HTK toolkit 3.4 from [http://htk.eng.cam.ac.uk/ftp/software/HTK-3.4.tar.gz here].
 
Also download the HTK samples from [http://htk.eng.cam.ac.uk/ftp/software/HTK-samples-3.4.tar.gz here].
 
==== Step 3 ====
 
* Move to your home directory
 
<pre> cd ~ </pre>
 
* Create a directory called 'bin'
 
<pre> mkdir bin </pre>
 
* Unpack the downloaded HTK toolkit sources and the HTK samples in the 'bin' directory, with the sources in a folder called 'htk-3.4'. The 'bin' directory should then contain the following
 
<pre> 
 
htk-3.4 samples
 
</pre>
 
 
* Move the 'samples' folder into the 'htk-3.4' folder as follows
 
<pre>
 
cd bin
 
mv samples htk-3.4
 
</pre>
 
 
* If you have a newer version of the gcc compiler (version 4 or above), you will need to install gcc version 3.4 so that HTK compiles properly. Use the following command to see which gcc version is installed on your system
 
<pre>gcc -v</pre>
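The version check can also be scripted. The following is a minimal sketch, not part of the original instructions: the helper names <code>major_version</code> and <code>needs_gcc34</code> are made up for illustration, and it uses <code>gcc -dumpversion</code> instead of <code>gcc -v</code> because it prints only the version number.

```shell
#!/bin/sh
# Print the major version from a dotted version string such as "4.3.2".
# (Hypothetical helper, not part of HTK or gcc.)
major_version() {
    printf '%s\n' "${1%%.*}"
}

# Succeed when the given gcc version is 4 or above, which is the case
# in which this guide asks you to install gcc 3.4.
needs_gcc34() {
    [ "$(major_version "$1")" -ge 4 ]
}

# Only query gcc when it is actually installed.
if command -v gcc >/dev/null 2>&1; then
    installed=$(gcc -dumpversion)
    if needs_gcc34 "$installed"; then
        echo "gcc $installed found: install gcc-3.4 before building HTK"
    else
        echo "gcc $installed found: no compiler change needed"
    fi
fi
```

The decision rule (4 or above) is taken from this guide; whether a particular gcc release really fails to build HTK 3.4 is not checked by the script.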
 
 
* If your gcc version is 4 or above, run the following commands to install gcc 3.4
 
<pre>
 
sudo apt-get install gcc-3.4
 
sudo rm /usr/bin/gcc
 
sudo ln -s /usr/bin/gcc-3.4 /usr/bin/gcc
 
</pre>
 
 
'''NOTE''' - if the above doesn't work for you, the Ubuntu hardy package repository may be missing from your sources.list file. In that case, do the following. If it does work, skip to the next bullet point.
 
<pre>
 
sudo gedit /etc/apt/sources.list
 
</pre>
 
:add the following line to the end of the sources.list file
 
<pre>
 
deb http://de.archive.ubuntu.com/ubuntu/ hardy main universe
 
</pre>
 
:now run the following command
 
<pre>
 
sudo apt-get update
 
</pre>
 
:This should fetch the package lists from the hardy repository. You can now run the following commands.
 
<pre>
 
sudo apt-get install gcc-3.4
 
sudo rm /usr/bin/gcc
 
sudo ln -s /usr/bin/gcc-3.4 /usr/bin/gcc
 
</pre>
 
 
* Install the external dependencies as follows
 
<pre>
 
sudo apt-get install libx11-dev libesd0-dev libasound2-dev libzip1 flex
 
</pre>
 
 
* Now move to the 'htk-3.4' directory and configure HTK as follows. Replace %yourusername% in the command with your user name.
 
<pre>
 
cd htk-3.4
 
./configure --prefix=/home/%yourusername%/bin/htk-3.4
 
</pre>
 
 
* Now run make all and make install. This should install the created binaries to the folder '/home/%yourusername%/bin/htk-3.4/bin'.
 
<pre>
 
make all
 
make install
 
</pre>
 
 
* Change directory back to home and create a folder called 'voxforge' and then a folder called 'HTK_scripts' in the voxforge folder.
 
<pre>
 
cd ~
 
mkdir voxforge
 
cd voxforge
 
mkdir HTK_scripts
 
cd HTK_scripts
 
</pre>
 
 
* Now copy some scripts from the 'htk-3.4/samples' folder to the 'HTK_scripts' folder as follows
 
<pre>
 
cp ../../bin/htk-3.4/samples/RMHTK/perl_scripts/mkclscript.prl .
 
cp ../../bin/htk-3.4/samples/HTKTutorial/maketrihed .
 
cp ../../bin/htk-3.4/samples/HTKTutorial/prompts2mlf .
 
cp ../../bin/htk-3.4/samples/HTKTutorial/prompts2wlist .
 
</pre>
 
 
* Your 'HTK_scripts' folder should contain the following
 
<pre>
 
maketrihed  mkclscript.prl  prompts2mlf  prompts2wlist
 
</pre>
 
 
==== Step 4 ====
 
 
* Now we will download Julius (version 4.1.5). We shall use the pre-compiled binaries, which can be downloaded from [http://sourceforge.jp/projects/julius/downloads/47530/julius-4.1.5-linuxbin.tar.gz/ here].
 
 
* Once downloaded, extract them to your '/home/%yourusername%/bin' folder. Your 'bin' folder should then contain the following
 
<pre>
 
htk-3.4  julius-4.1.5-linuxbin
 
</pre>
 
 
==== Step 5 ====
 
 
* Now you will need to update your user path which can be done as follows. First change to your home directory and edit the .bashrc file.
 
<pre>
 
cd ~
 
gedit .bashrc
 
</pre>
 
 
* Add the following to the end of the .bashrc file. Replace %yourusername% in the line with your user name.
 
<pre>
 
# HTK and JULIUS scripts and executables
 
PATH=$PATH:$HOME/bin:/home/%yourusername%/bin/htk-3.4/bin:/home/%yourusername%/bin/julius-4.1.5-linuxbin/bin
 
</pre>
 
 
* Source your .bashrc file to reflect the changes
 
<pre>
 
source ~/.bashrc
 
</pre>
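A quick way to confirm that the PATH change took effect is to look up the executables this guide relies on. This is a minimal sketch: the function name <code>missing_tools</code> is made up for illustration, and the list of executables (HVite, julius-4.1.5, mkdfa.pl) is simply taken from the commands used elsewhere on this page.

```shell
#!/bin/sh
# Print the name of every listed command that cannot be found on PATH.
# (Hypothetical helper for illustration.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Executables used in this guide.
missing=$(missing_tools HVite julius-4.1.5 mkdfa.pl)
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "HTK and Julius executables are on PATH"
fi
```

If a name is reported as missing, check the PATH line in .bashrc for typos in the user name or folder names.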
 
 
* Test if your HTK toolkit has been installed correctly by running the following command.
 
<pre>
 
HVite -V
 
</pre>
 
:You should see an output similar to the following.
 
<pre>
 
 
HTK Version Information
 
Module    Version    Who    Date      : CVS Info
 
HVite      3.4        CUED  25/04/06  : $Id: HVite.c,v 1.1.1.1 2006/10/11 09:55:02 jal58 Exp $
 
HShell    3.4        CUED  25/04/06  : $Id: HShell.c,v 1.1.1.1 2006/10/11 09:54:58 jal58 Exp $
 
HMem      3.4        CUED  25/04/06  : $Id: HMem.c,v 1.1.1.1 2006/10/11 09:54:58 jal58 Exp $
 
HLabel    3.4        CUED  25/04/06  : $Id: HLabel.c,v 1.1.1.1 2006/10/11 09:54:57 jal58 Exp $
 
HMath      3.4        CUED  25/04/06  : $Id: HMath.c,v 1.1.1.1 2006/10/11 09:54:58 jal58 Exp $
 
HSigP      3.4        CUED  25/04/06  : $Id: HSigP.c,v 1.1.1.1 2006/10/11 09:54:58 jal58 Exp $
 
HWave      3.4        CUED  25/04/06  : $Id: HWave.c,v 1.1.1.1 2006/10/11 09:54:59 jal58 Exp $
 
HAudio    3.4        CUED  25/04/06  : $Id: HAudio.c,v 1.1.1.1 2006/10/11 09:54:57 jal58 Exp $
 
HVQ        3.4        CUED  25/04/06  : $Id: HVQ.c,v 1.1.1.1 2006/10/11 09:54:59 jal58 Exp $
 
HModel    3.4        CUED  25/04/06  : $Id: HModel.c,v 1.2 2006/12/07 11:09:08 mjfg Exp $
 
HParm      3.4        CUED  25/04/06  : $Id: HParm.c,v 1.1.1.1 2006/10/11 09:54:58 jal58 Exp $
 
HDict      3.4        CUED  25/04/06  : $Id: HDict.c,v 1.1.1.1 2006/10/11 09:54:57 jal58 Exp $
 
HNet      3.4        CUED  25/04/06  : $Id: HNet.c,v 1.1.1.1 2006/10/11 09:54:58 jal58 Exp $
 
HRec      3.4        CUED  25/04/06  : $Id: HRec.c,v 1.1.1.1 2006/10/11 09:54:58 jal58 Exp $
 
HUtil      3.4        CUED  25/04/06  : $Id: HUtil.c,v 1.1.1.1 2006/10/11 09:54:59 jal58 Exp $
 
HAdapt    3.4        CUED  25/04/06  : $Id: HAdapt.c,v 1.2 2006/12/07 11:09:07 mjfg Exp $
 
HMap      3.4        CUED  25/04/06  : $Id: HMap.c,v 1.1.1.1 2006/10/11 09:54:57 jal58 Exp $
 
 
</pre>
 
 
* Test if Julius has been installed correctly by entering the following command in the terminal
 
<pre>
 
julius-4.1.5
 
</pre>
 
: You should see an output similar to the following
 
<pre>
 
Julius rev.4.1.5 - based on
 
JuliusLib rev.4.1.5 (fast)  built for i686-pc-linux
 
 
Copyright (c) 1991-2009 Kawahara Lab., Kyoto University
 
Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan
 
Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
 
Copyright (c) 2005-2009 Julius project team, Nagoya Institute of Technology
 
 
Try '-setting' for built-in engine configuration.
 
Try '-help' for run time options.
 
</pre>
 
* To switch back to your original gcc version, do the following (the original version in my case was 4.3; yours may differ)
 
<pre>
 
sudo rm /usr/bin/gcc
 
sudo ln -s /usr/bin/gcc-4.3 /usr/bin/gcc
 
</pre>
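If you are unsure which version to restore, listing the versioned gcc binaries that are installed can help. The sketch below assumes the binaries follow the <code>gcc-X.Y</code> naming used on this page and live in /usr/bin; both function names are made up for illustration.

```shell
#!/bin/sh
# List versioned gcc binaries (gcc-3.4, gcc-4.3, ...) found in a directory.
# Defaults to /usr/bin, the location used in this guide.
list_gcc_versions() {
    dir=${1:-/usr/bin}
    for f in "$dir"/gcc-[0-9]*; do
        [ -e "$f" ] && basename "$f"
    done
    return 0
}

# Show the file that the gcc symlink currently resolves to.
current_gcc_target() {
    readlink -f "${1:-/usr/bin/gcc}"
}

list_gcc_versions
current_gcc_target
```

Running <code>current_gcc_target</code> before removing /usr/bin/gcc records the original target, so you know which name to link back to afterwards.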
 
 
==== Step 6 ====
 
 
Install Audacity as follows
 
<pre>
 
sudo apt-get install audacity
 
</pre>
 
 
==== Step 7 ====
 
We will now compile the grammar and voca files.
 
 
* Create a folder called 'auto' in your '/home/%yourusername%/voxforge' directory
 
<pre>
 
cd ~
 
cd voxforge
 
mkdir auto
 
cd auto
 
</pre>
 
 
* Download the grammar and voca files from here. Extract them into the 'auto' folder you just created. After extraction, your 'auto' folder should contain the following
 
<pre>
 
robotino.grammar  robotino.voca
 
</pre>
 
 
* Now compile the grammar and voca files into Julius files. Make sure you are in the 'auto' folder, then run the following command
 
<pre>
 
mkdfa.pl robotino
 
</pre>
 
 
'''KNOWN ERROR''' - you may get the following error when running the command above
 
<pre>
 
/usr/X11R6/bin/perl: bad interpreter: No such file or directory
 
</pre>
 
:Then open the mkdfa.pl file
 
<pre>
 
gedit ~/bin/julius-4.1.5-linuxbin/bin/mkdfa.pl
 
</pre>
 
:And change the first line from
 
<pre>
 
#!/usr/X11R6/bin/perl
 
</pre>
 
:To
 
<pre>
 
#!/usr/bin/perl
 
</pre>
 
:And run the command again
 
<pre>
 
mkdfa.pl robotino
 
</pre>
 
:You should see an output as follows
 
<pre>
 
robotino.grammar has 11 rules
 
robotino.voca    has 10 categories and 27 words
 
---
 
Now parsing grammar file
 
Now modifying grammar to minimize states[7]
 
Now parsing vocabulary file
 
Now making nondeterministic finite automaton[31/31]
 
Now making deterministic finite automaton[24/24]
 
Now making triplet list[24/24]
 
10 categories, 24 nodes, 32 arcs
 
-> minimized: 11 nodes, 19 arcs
 
---
 
generated: robotino.dfa robotino.term robotino.dict
 
 
</pre>
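The interpreter fix described under KNOWN ERROR can also be applied without opening an editor. The following is a minimal sketch: the function name <code>fix_perl_shebang</code> is made up for illustration, and the mkdfa.pl path is the one used in this guide. It rewrites only the first line and leaves the rest of the script untouched.

```shell
#!/bin/sh
# Replace the hard-coded /usr/X11R6/bin/perl interpreter on line 1 of a
# script with /usr/bin/perl. The file is rewritten in place via a temporary
# copy so that its permissions (including the executable bit) are kept.
fix_perl_shebang() {
    script=$1
    tmp=$(mktemp) || return 1
    sed '1s|^#!/usr/X11R6/bin/perl|#!/usr/bin/perl|' "$script" > "$tmp" &&
        cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return $status
}

# Path taken from this guide; apply the fix only if the file is present.
mkdfa="$HOME/bin/julius-4.1.5-linuxbin/bin/mkdfa.pl"
if [ -f "$mkdfa" ]; then
    fix_perl_shebang "$mkdfa"
fi
```

If the first line already reads <code>#!/usr/bin/perl</code>, the script makes no change.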
 
 
==== Step 8 ====
 
Now we shall proceed to the training and creation of the acoustic model.
 
 
* Download the prompts file and save it in your '/home/%yourusername%/voxforge/auto' folder. Your 'voxforge/auto' folder should look like this
 
<pre>
 
prompts      robotino.dict    robotino.term
 
robotino.dfa  robotino.grammar  robotino.voca
 
</pre>
 
 
* Now create a folder called 'lexicon' in the 'voxforge' directory.
 
<pre>
 
cd ~
 
cd voxforge
 
mkdir lexicon
 
</pre>
 
 
* Download the lexicon file from here and save it in the 'voxforge/lexicon' folder you just created.
 
 
==== Step 9 ====
 
We shall now record the training data.
 
 
* Create a folder called 'train' in the 'voxforge' folder and then a folder called 'wav' in the 'train' folder.
 
<pre>
 
cd ~/voxforge
 
mkdir train
 
cd train
 
mkdir wav
 
</pre>
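The folders created so far under 'voxforge' (HTK_scripts, auto, lexicon and train/wav) can also be created in one go. This is a minimal sketch; the function name <code>make_voxforge_tree</code> is made up for illustration, and it only creates folders without downloading or copying any files.

```shell
#!/bin/sh
# Create the voxforge working tree used in this guide under a base
# directory (defaults to the home directory). mkdir -p leaves folders
# that already exist untouched.
make_voxforge_tree() {
    base=${1:-$HOME}
    mkdir -p "$base/voxforge/HTK_scripts" \
             "$base/voxforge/auto" \
             "$base/voxforge/lexicon" \
             "$base/voxforge/train/wav"
}

# Usage (uncomment to create the tree in your home directory):
# make_voxforge_tree "$HOME"
```

Because <code>mkdir -p</code> does not fail on existing folders, it is safe to run this after you have already followed some of the earlier steps by hand.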
 
 
* Open the prompts file from the '/home/%yourusername%/voxforge/auto' folder in a text editor (for example gedit).
 
 
* Open Audacity and configure it as follows

** In Edit>Preferences>Devices (or Audio I/O), make sure that you select 'Channels: 1 (Mono)' under the 'Recording' section.

** In Edit>Preferences>Quality, make sure that the 'Default Sample Rate' is set to '48000 Hz' and the 'Default Sample Format' is set to '16-bit'.
 

Revision as of 16:58, 16 February 2011

Speech Recognition on Robotino

Robotino can now comprehend basic human speech commands. Robotino uses the open source speech recognition engine called Julius.

Recording data for speech recognition training

Creating the acoustic model yourself