Recording data for speech recognition training

From RobotinoWiki
Revision as of 13:59, 22 February 2011 by Verbeek (talk | contribs) (Your acoustic model files)

Pre-requisites

  • You must have a headset with a mic or a desktop boom mic, preferably the same mic that will be used for speech recognition on the robot. Built-in laptop or desktop mics are not recommended.
  • Make sure you have Audacity installed. On Linux, it can be installed as follows:
sudo apt-get install audacity
The Windows version can be downloaded from the Audacity website.
  • Open Audacity and configure it as follows:
    • In Edit>Preferences>Devices (or Audio I/O), select 'Channels: 1 (Mono)' under the 'Recording' section.
    • In Edit>Preferences>Quality, set the 'Default Sample Rate' to '16000 Hz' and the 'Default Sample Format' to '16-bit'.
  • To make a USB headset the default audio device on Linux, follow the steps below. NOTE: This is not needed on Robotino, which has already been configured to use a USB headset as the default device.
  • Create a new text file called '.asoundrc' in your home directory and open it in gedit as follows:
gedit ~/.asoundrc 
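  • Paste the following text into the file. The device name "hw:1,0" refers to ALSA card 1, device 0, so this configuration assumes that the USB headset is card 1: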
 pcm.!default {
         type asym
         playback.pcm {
                 type plug
                 slave.pcm "hw:1,0"
         }
         capture.pcm {
                 type plug
                 slave.pcm "hw:1,0"
         } 
 }
  • Save the file and restart the computer.
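The configuration above only works if the headset really is card 1, device 0. On Linux, the command `arecord -l` lists the capture devices together with their card and device numbers. As a minimal sketch, the following Python snippet turns that listing into "hw:card,device" names that can be compared with the value in '.asoundrc'. The function name `list_capture_devices` and the sample listing are illustrative assumptions, not part of the Robotino setup.

```python
import re

# Matches capture-device lines printed by `arecord -l`, for example:
#   card 1: Headset [USB Headset], device 0: USB Audio [USB Audio]
CARD_LINE = re.compile(r"^card (\d+): (\S+) \[(.*?)\], device (\d+):")


def list_capture_devices(arecord_output):
    """Return (alsa_name, description) pairs such as ("hw:1,0", "USB Headset")."""
    devices = []
    for line in arecord_output.splitlines():
        match = CARD_LINE.match(line.strip())
        if match:
            card, _short_name, description, device = match.groups()
            devices.append((f"hw:{card},{device}", description))
    return devices


# Illustrative listing only; run `arecord -l` to see the devices on your machine.
SAMPLE_OUTPUT = """\
**** List of CAPTURE Hardware Devices ****
card 0: Intel [HDA Intel], device 0: ALC269 Analog [ALC269 Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: Headset [USB Headset], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
"""

for name, description in list_capture_devices(SAMPLE_OUTPUT):
    print(name, description)
```

If the headset shows up under a different card number, the two "hw:1,0" entries in '.asoundrc' would need to be changed to match.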

Prompts file

The "prompts" file contains the words you need to record in your individual speech audio files. Each line of the prompts file is the transcription of one audio file: the first column contains the name of the audio file, and the remaining columns on the same line contain the words recorded in that file, as shown below:

sample1 ROBOTINO MOVE ROBOTINO ROTATE ROBOTINO STOP
sample2 ROBOTINO RIGHT ROBOTINO LEFT ROBOTINO FORWARD ROBOTINO BACKWARD
sample3 OPPRESSIVE AS THE HEAT HAD BEEN IT WAS NOW EVEN MORE OPPRESSIVE
sample4 ONE ONE TWO TWO THREE THREE FOUR FOUR FIVE FIVE
sample5 SIX SIX SEVEN SEVEN EIGHT EIGHT NINE NINE ZERO ZERO
sample6 A DEAD MAN IS OF NO USE ON A PLANTATION
sample7 NINE EIGHT SEVEN SIX FIVE FOUR THREE TWO ONE ZERO
sample8 ROBOTINO MOVE FORWARD ROBOTINO MOVE BACKWARD
sample9 HE CRIED IN SUCH GENUINE DISMAY THAT SHE BROKE INTO HEARTY LAUGHTER
sample10 ZERO ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE
sample11 DEGREE METER CENTIMETER DEGREES METERS CENTIMETERS
sample12 DOWN THERE THE EARTH WAS ALREADY SWELLING WITH LIFE
sample13 ROBOTINO MOVE LEFT ROBOTINO MOVE RIGHT
sample14 MINUS FOUR MINUS THREE MINUS TWO MINUS ONE MINUS ZERO
sample15 EACH DAY SHE BECAME A MORE VITAL PART OF HIM
sample16 RIGHT ONE THREE FIVE METERS LEFT SEVEN NINE METERS
sample17 ROBOTINO ROTATE MINUS NINE ZERO DEGREES
sample18 A MADDENING JOY POUNDED IN HIS BRAIN
sample19 ROBOTINO MOVE FORWARD SIX SEVEN CENTIMETERS MOVE FORWARD ONE METER
sample20 ROTATE SIX SEVEN DEGREES ROTATE EIGHT NINE DEGREES
sample21 SINCE THEN SOME MYSTERIOUS FORCE HAS BEEN FIGHTING US AT EVERY STEP
sample22 ROBOTINO STOP MINUS ZERO CENTIMETERS
sample23 ONE TWO THREE METERS THREE TWO ONE METERS
sample24 ROBOTINO ONE ROBOTINO METER ROBOTINO METER ROBOTINO ONE
sample25 SEVEN EIGHT NINE DEGREES NINE EIGHT SEVEN DEGREES
sample26 STOP ROTATE MOVE STOP ROTATE MOVE
sample27 I GUESS I CAN TALK AND WORK AT THE SAME TIME
sample28 ROBOTINO MOVE BACKWARD FIVE FIVE FIVE METERS
sample29 MOVE LEFT THREE ONE DEGREES MOVE RIGHT ONE THREE DEGREES
sample30 EACH INSULT ADDED TO THE VALUE OF THE CLAIM
sample31 STOP STOP MOVE MOVE ROTATE ROTATE
sample32 LEFT LEFT RIGHT RIGHT FORWARD FORWARD BACKWARD BACKWARD
sample33 HAVE YOU EVER EARNED A DOLLAR BY YOUR OWN LABOUR
sample34 FOUR FIVE SIX CENTIMETERS SIX FIVE FOUR CENTIMETERS
sample35 ROBOTINO CENTIMETERS ROBOTINO METERS ROBOTINO DEGREES
sample36 ONE METER ONE DEGREE ONE CENTIMETER
sample37 MINUS NINE MINUS EIGHT MINUS SEVEN MINUS SIX MINUS FIVE
sample38 SOMETIMES HER DREAMS WERE FILLED WITH VISIONS
sample39 MINUS ONE TWO MINUS THREE FOUR MINUS FIVE SIX
sample40 MINUS SEVEN EIGHT MINUS NINE ZERO
sample41 PHILIP BEGAN TO FEEL THAT HE HAD FOOLISHLY OVERESTIMATED HIS STRENGTH
sample42 FORWARD ZERO TWO FOUR CENTIMETERS BACKWARD SIX EIGHT CENTIMETERS
sample43 MOVE FORWARD STOP ROTATE FOUR FIVE DEGREES STOP
sample44 ABOUT HIM EVERYWHERE WERE THE EVIDENCES OF LUXURY AND OF AGE
sample45 ROBOTINO MOVE LEFT THREE FOUR METERS MOVE LEFT ONE CENTIMETER
sample46 ROBOTINO MOVE RIGHT ZERO ONE METERS MOVE RIGHT ONE CENTIMETER
sample47 DO YOU KNOW THAT YOU ARE SHAKING MY CONFIDENCE IN YOU
sample48 ROBOTINO MOVE BACKWARD NINE FIVE CENTIMETERS MOVE BACKWARD ONE METER
sample49 ROTATE ZERO ONE TWO DEGREES ROTATE THREE FOUR FIVE DEGREES
sample50 HER EFFORTS WERE NOT FUTILE AND IN NO TIME THE GOOSE BEGAN TO TIRE
sample51 ROBOTINO ROBOTINO ROTATE ROTATE STOP STOP MOVE MOVE
sample52 ROTATE RIGHT ROTATE LEFT ROTATE FORWARD ROTATE BACKWARD
sample53 I CAME FOR INFORMATION MORE OUT OF CURIOSITY THAN ANYTHING ELSE
sample54 CENTIMETERS CENTIMETERS DEGREES DEGREES METERS METERS
sample55 ROBOTINO MOVE ROTATE STOP ROBOTINO STOP ROTATE MOVE
sample56 ROBOTINO YES ROBOTINO NO ROBOTINO YES ROBOTINO NO
sample57 YES NO ONE TWO NO YES THREE FOUR YES NO FIVE SIX
sample58 SHE ATE A BOWL OF PORRIDGE AND CHECKED HERSELF IN THE MIRROR
sample59 YES CENTIMETER NO METER YES DEGREE NO CENTIMETERS YES METERS NO DEGREES
sample60 YES YES NO NO ROBOTINO ROBOTINO
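
The prompts format described above (file name in the first column, transcription in the remaining columns) is easy to read programmatically. The sketch below, in which the function name `parse_prompts` is a hypothetical helper, maps each file name to its list of words and derives the vocabulary covered by the recordings. Only the first two prompt lines are used as sample input.

```python
def parse_prompts(text):
    """Map each audio file name to the list of words spoken in it."""
    prompts = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name, words = fields[0], fields[1:]
        if name in prompts:
            raise ValueError(f"duplicate prompt name: {name}")
        prompts[name] = words
    return prompts


sample = """\
sample1 ROBOTINO MOVE ROBOTINO ROTATE ROBOTINO STOP
sample2 ROBOTINO RIGHT ROBOTINO LEFT ROBOTINO FORWARD ROBOTINO BACKWARD
"""

prompts = parse_prompts(sample)
print(prompts["sample1"])

# Every distinct word that appears in the prompts, in alphabetical order.
vocabulary = sorted({word for words in prompts.values() for word in words})
print(vocabulary)
```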

Recording sentences

  • Start Audacity.
  • Make sure your microphone volume in Audacity is set to 1.0.
  • Then click Record (i.e. the red circle button) and begin speaking in your normal voice for a few seconds, and then click Stop (i.e. the yellow square button). Look at the Waveform Display for the audio track you just created. The Vertical Ruler to the left of the Waveform Display provides you with a guide to your audio levels. Try to keep your recording levels between 0.5 and -0.5, averaging around 0.3 to -0.3. It is OK to have a few spikes go outside the 0.5 to -0.5 range, but avoid having any go beyond the 1.0 to -1.0 range, as this will generate distortion. If necessary, adjust Audacity's microphone volume to keep your audio within the proper ranges.

[Figure: Audacity screenshot]

  • To begin, you should not have any tracks displayed in the Audacity window. If you do, click the x icon at the top left of the audio track display (or hit ctrl-z as many times as required to remove them, or restart Audacity). Otherwise, Audacity will record your new track and leave your old track untouched, and when you export your audio to a wav file, both tracks will be merged into that file.
  • Make sure your volumes are set properly, as outlined above.
  • Record your first file by clicking 'Record' in Audacity and saying the words in the first line of your prompts file:
ROBOTINO MOVE ROBOTINO ROTATE ROBOTINO STOP
  • Speak normally and clearly, neither too slowly nor too fast. Pause slightly before you begin speaking and leave a short pause after you have finished (i.e. a pause of about half a second before and after you speak). Remember not to breathe out until you have clicked Stop, since most microphones pick up breathing noises.
  • Click the 'Stop' icon when you are finished.
  • Review your waveform to ensure that the highest peaks of your recording fall between 0.5 and 1.0 and the lowest peaks fall between -0.5 and -1.0. If they do, listen to the file (press 'Play' in Audacity) to make sure that your pronunciation is clear and that you do not hear any non-speech noises (e.g. breathing noises, lip smacking, or background noises). If there are any problems, hit ctrl-z and re-record your file.
  • If the file sounds OK, click File>Export and make sure that the format is WAV signed 16 bit PCM. Name the file sample1 (for the first sentence) and save it.
  • Repeat the same procedure for the rest of the sentences in the prompts file, naming each file after the first column of its line.
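The exported files can also be checked automatically before uploading. The following is a minimal sketch using Python's standard `wave` module: it verifies that a file is mono, 16000 Hz and 16-bit, and that its peak level lies in the 0.5 to 1.0 range described above. The function name `check_recording` is hypothetical, and treating a full-scale sample as clipping is an assumption of this sketch. It does not replace listening to the recording.

```python
import array
import sys
import wave


def check_recording(path):
    """Return a list of problems found in one WAV file (empty if none)."""
    problems = []
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    if channels != 1:
        problems.append(f"expected 1 channel (mono), found {channels}")
    if rate != 16000:
        problems.append(f"expected 16000 Hz, found {rate} Hz")
    if sample_width != 2:
        problems.append(f"expected 16-bit samples, found {8 * sample_width}-bit")
        return problems  # the peak check below assumes 16-bit samples

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Scale to the -1.0 .. 1.0 range shown on Audacity's vertical ruler.
    peak = max((abs(s) for s in samples), default=0) / 32768.0
    if peak < 0.5:
        problems.append(f"peak {peak:.2f} is below 0.5; recording may be too quiet")
    elif peak >= 32767 / 32768.0:
        problems.append("peak reaches full scale; recording is probably clipped")
    return problems
```

For example, `check_recording("sample1.wav")` returns an empty list when the file has the expected format and level, and a list of human-readable problem descriptions otherwise.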

Uploading the data

  • Copy all the 'sample*.wav' files to a folder.
  • Create a file called 'info.txt' and paste your email address into it. The 'info.txt' file should look like this:
youremailaddress@xxxxxxx
  • Save the 'info.txt' file in the same folder as the wav files.
  • Zip the folder and upload it here
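The packaging steps above can be sketched in Python as follows. The function name `package_recordings` and its parameters are hypothetical; the sketch assumes the 60 recordings are named 'sample1.wav' to 'sample60.wav', as in the prompts file, and it refuses to build the archive if any of them is missing. Uploading the resulting zip file is still done by hand.

```python
import os
import zipfile


def package_recordings(folder, email, zip_path, count=60):
    """Write info.txt into `folder`, then zip the folder for upload."""
    expected = [f"sample{i}.wav" for i in range(1, count + 1)]
    missing = [name for name in expected
               if not os.path.isfile(os.path.join(folder, name))]
    if missing:
        raise FileNotFoundError("missing recordings: " + ", ".join(missing))

    # info.txt holds nothing but the email address.
    with open(os.path.join(folder, "info.txt"), "w") as info:
        info.write(email + "\n")

    # Store the files under the folder's own name inside the archive.
    top = os.path.basename(os.path.normpath(folder))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in expected + ["info.txt"]:
            archive.write(os.path.join(folder, name), os.path.join(top, name))
    return zip_path
```

A call such as `package_recordings("recordings", "youremailaddress@xxxxxxx", "recordings.zip")` would produce 'recordings.zip' containing the 'recordings' folder with the wav files and 'info.txt'.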

Your acoustic model files

Training is done automatically, and the acoustic model files will be sent to the email address given in 'info.txt'. Copy those files to the '/etc/robotino/sr/julius/acoustic_model_files' directory on your Robotino. Your robot is then ready to recognize your speech.