Speech Recognition with Scala

In this post I am going to explain how to integrate speech recognition into your Scala project.

Speech recognition lets us convert spoken language into text within our projects. It is an increasingly common feature of electronic and computing devices, used to make them smarter.

In this project we shall be using the CMU Sphinx toolkit, which supports offline speech recognition. It is an open source toolkit that provides several components to choose from, depending on the needs of the application: speech recognizers, speech decoders, software for acoustic model training, language models and a pronunciation dictionary.

We shall be using the Sphinx4 speech recognizer, a pure Java speech recognition library. Besides converting speech to text, it can be used to identify speakers and to adapt models.

Now let us look at the code. First of all, add one of the following two pairs of dependencies to build.sbt:

libraryDependencies += "edu.cmu.sphinx" % "sphinx4-core" % "1.0-SNAPSHOT"
libraryDependencies += "edu.cmu.sphinx" % "sphinx4-data" % "1.0-SNAPSHOT"
or
libraryDependencies += "de.sciss" % "sphinx4-core" % "1.0.0"
libraryDependencies += "de.sciss" % "sphinx4-data" % "1.0.0"
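
SNAPSHOT artifacts are not served from the default Maven Central repository, so the first pair of dependencies may fail to resolve unless a snapshot repository is configured. The fragment below is a minimal sketch of such a build.sbt, not a verified configuration: the resolver URL is an assumption based on the Sonatype OSS snapshot repository, and the 5prealpha-SNAPSHOT version is the one a reader reports in the comments on this post. Check which version actually resolves in your environment.

```scala
// build.sbt (sketch; resolver URL and version are assumptions, see above)
resolvers += "Sonatype OSS Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots"

libraryDependencies ++= Seq(
  "edu.cmu.sphinx" % "sphinx4-core" % "5prealpha-SNAPSHOT", // recognizer API
  "edu.cmu.sphinx" % "sphinx4-data" % "5prealpha-SNAPSHOT"  // bundled models
)
```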

The next step is to import the Sphinx API into the application:

import edu.cmu.sphinx.api._

Next comes the code that sets up the configuration for speech recognition:

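object SpeechRecognitionApp extends App {
  val configuration = new Configuration
  // Acoustic model: maps audio features to phonemes
  configuration.setAcousticModelPath("file:models/acoustic/wsj")
  // Pronunciation dictionary: maps words to phoneme sequences
  configuration.setDictionaryPath("file:models/acoustic/wsj/dict/cmudict.0.6d")
  // Language model: describes which word sequences are likely
  configuration.setLanguageModelPath("models/language/en-us.lm.dmp")
}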

The code above creates a Configuration object that holds the paths to the acoustic model, the pronunciation dictionary and the language model.

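The cmudict file is a plain-text pronunciation dictionary: each line maps a recognizable word to its sequence of phonemes, and an alternative pronunciation carries a numbered suffix such as (2). It can be extended with new words for a given language. It looks something like this: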

ONE                  HH W AH N
ONE(2)               W AH N
TWO                  T UW
THREE                TH R IY
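
To make the line format concrete, here is a small sketch that parses dictionary lines like the ones above into a word, a variant number and a list of phonemes. DictEntry, CmuDict and parseLine are illustrative names invented for this example. They are not part of Sphinx4, which reads the dictionary file by itself.

```scala
// Sketch only: DictEntry and CmuDict are hypothetical helpers, not Sphinx4 API.
final case class DictEntry(word: String, variant: Int, phones: List[String])

object CmuDict {
  // "WORD" or "WORD(2)", then whitespace, then the space-separated phonemes.
  private val Line = """^(\S+?)(?:\((\d+)\))?\s+(.+)$""".r

  /** Parses one dictionary line; returns None for blank or malformed lines. */
  def parseLine(line: String): Option[DictEntry] = line.trim match {
    case Line(word, variant, phones) =>
      // The variant group is null when the word has no "(n)" suffix.
      val n = Option(variant).map(_.toInt).getOrElse(1)
      Some(DictEntry(word, n, phones.trim.split("\\s+").toList))
    case _ => None
  }
}

object CmuDictDemo {
  def main(args: Array[String]): Unit = {
    val sample = List("ONE(2)               W AH N", "TWO                  T UW")
    sample.flatMap(CmuDict.parseLine).foreach(println)
  }
}
```

A parser like this can be handy for checking a hand-edited dictionary for malformed lines before passing the file to the recognizer.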

The last step is to create the recognizer object, start recognition and read the results. The hypothesis of each result is the recognized text in the form of a string.

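println("Start speaking:")
val speechRecognizer = new LiveSpeechRecognizer(configuration)
// Start capturing from the microphone; `true` clears previously cached audio
speechRecognizer.startRecognition(true)
// getResult blocks until an utterance is recognized; null means no more input
var result = speechRecognizer.getResult
while (result != null) {
  println(result.getHypothesis)
  result = speechRecognizer.getResult
}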
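
The loop above follows a common Java pattern: keep calling a method until it returns null. In Scala the same idea can be written without a mutable variable. The sketch below shows it with a stand-in source instead of a real recognizer so that it runs without a microphone. ResultLoop and collectUntilNull are hypothetical names introduced for this example and are not part of Sphinx4.

```scala
// Sketch only: ResultLoop is a hypothetical helper, not part of Sphinx4.
object ResultLoop {
  /** Calls `next` repeatedly and collects its values until it returns null. */
  def collectUntilNull[A <: AnyRef](next: () => A): List[A] =
    Iterator.continually(next()).takeWhile(_ != null).toList
}

object ResultLoopDemo {
  def main(args: Array[String]): Unit = {
    // Stand-in for a recognizer: yields two hypotheses, then null.
    val canned = Iterator("one two", "three")
    val source: () => String = () => if (canned.hasNext) canned.next() else null
    ResultLoop.collectUntilNull(source).foreach(println)
  }
}
```

With a real recognizer the call would look like ResultLoop.collectUntilNull(() => speechRecognizer.getResult).map(_.getHypothesis). Note that this collects everything before returning, so it only suits input that ends by itself. For live microphone input, printing each result as it arrives is the better fit.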

Now we are done. The code above is all we need to enable speech recognition in a Scala project.

Written by 

Pallavi is a Software Consultant with more than 3 years of experience. She is very dedicated, hardworking and adaptive. She is technology-agnostic and knows languages like Scala and Java. Her areas of interest include microservices, Akka, Kafka, Play, Lagom, GraphQL, Couchbase etc. Her hobbies include art & craft and photography.

6 thoughts on “Speech Recognition with Scala”

  1. To resolve the deps, had to change them to:

    libraryDependencies += "edu.cmu.sphinx" % "sphinx4-core" % "5prealpha-SNAPSHOT"
    libraryDependencies += "edu.cmu.sphinx" % "sphinx4-data" % "5prealpha-SNAPSHOT"

  2. Hi, I am getting an error: "Not enough arguments for constructor SpeechAligner".

    Can you please help?
