Speech recognition

Thu Mar 26 02:46:47 UTC 2009

On Wed, Mar 25, 2009 at 6:17 PM, Olivier Galibert <galibert at pobox.com> wrote:
> For speech recognition, software is only part of the problem and,
> fundamentally, the easiest one (take the algorithms, implement them,
> optimize/debug at will).  The real problem is the data needed to build
> the models to feed the algorithms.  There isn't as far as I know any
> reasonable set of corpus available under an open source license usable
> to build a decent speech recognizer.  Which makes open source speech
> recognition something not doable yet.

There are some small databases available [1], although admittedly too
small for accurate general-purpose use.  There are some models
available [2], built from databases which are not themselves
redistributable.  There are also a number of model-building tools
available [3-5], which may be sufficient for small command-and-control
tasks.

But you are right.  For general-purpose speech recognition, we don't
have the data we need.  Still, I think it may be worth putting the
software in place so that those who wish to purchase licenses to
commercial data have everything else they need, and to encourage the
production of better-quality free data [6].

References:
[1] http://www.speech.cs.cmu.edu/databases/
[2] http://www.speech.cs.cmu.edu/sphinx/models/
[3] http://www.speech.sri.com/projects/srilm/
[4] http://cmusphinx.sourceforge.net/html/download.php#SphinxTrain
[5] http://cmusphinx.sourceforge.net/html/download.php/#cmulclmtk
[6] http://www.voxforge.org/
-- 
Jerry James
http://loganjerry.googlepages.com/