Introduction to Elasticsearch in Scala

Elasticsearch is a real-time distributed search and analytics engine built on top of Apache Lucene. It is used for full-text search, structured search, and analytics.

Lucene is just a library, and to leverage its power you need to use Java. Integrating Lucene directly with your application is a very complex task.

Elasticsearch uses the indexing and searching capabilities of Lucene but hides the complexities behind a simple RESTful API.

In this post we will learn to perform basic CRUD operations using the Elasticsearch transport client in Scala, with sbt as our build tool.

Let us start by downloading Elasticsearch from here and unzipping it.

Execute the following commands to run Elasticsearch in the foreground:

cd elasticsearch-<version>
./bin/elasticsearch

Test it out by opening another terminal window and running the following:

curl 'http://localhost:9200/?pretty'

You should see a response like this:

{
  "name" : "Don Fortunato",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.3.2",
    "build_hash" : "b9e4a6acad4008027e4038f6abed7f7dba346f94",
    "build_timestamp" : "2016-04-21T16:03:47Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.0"
  },
  "tagline" : "You Know, for Search"
}

To start coding, create a new sbt project and add the following dependency to the build.sbt file.

libraryDependencies += "org.elasticsearch" % "elasticsearch" % "2.3.2"

Next, we need to create a client that will talk to the Elasticsearch server.

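import java.net.InetAddress
import org.elasticsearch.client.Client
import org.elasticsearch.client.transport.TransportClient
import org.elasticsearch.common.settings.Settings
import org.elasticsearch.common.transport.InetSocketTransportAddress

// 9300 is the default transport port (9200 is the HTTP port used by curl above)
private val port = 9300

private val nodes = List("localhost")

private val addresses = nodes.map { host => new InetSocketTransportAddress(InetAddress.getByName(host), port) }

private lazy val settings = Settings.settingsBuilder().put("cluster.name", "elasticsearch").build()

val client: Client = TransportClient.builder()
  .settings(settings).build().addTransportAddresses(addresses: _*)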

Once the client is created, we can query the Elasticsearch server.

The following example inserts a JSON document into an index called library, under a type called books.

In Elasticsearch, an index is like a database and a type is like a table.

Let's create our first JSON document.

val jsonString =
  """{
    |  "title": "Elastic",
    |  "price": 2000,
    |  "author": {
    |    "first": "Zachary",
    |    "last": "Tong"
    |  }
    |}""".stripMargin

To index this JSON document into Elasticsearch, add the following code to your project:

client.prepareIndex("library","books","1").setSource(jsonString).get()

The prepareIndex method takes three arguments: the index name, the type, and the id. The id argument is optional; if you do not specify one, Elasticsearch automatically generates an id for the document.
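The generated id can be read back from the index response. Here is a minimal sketch, assuming the client created earlier and a running local cluster. The helper name indexWithoutId is hypothetical and only used for illustration.

```scala
import org.elasticsearch.client.Client

// Hypothetical helper: indexes a JSON string without an explicit id
// and returns the id that Elasticsearch generated for it.
def indexWithoutId(client: Client, json: String): String = {
  val response = client.prepareIndex("library", "books").setSource(json).get()
  println(s"created=${response.isCreated}, id=${response.getId}")
  response.getId
}
```

Calling indexWithoutId(client, jsonString) twice would store two separate documents, each under its own generated id.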

Note that the title of the book is Elastic and not Elasticsearch. Let's correct this by executing an update on the document:

client.prepareUpdate("library","books","1").setDoc("title", "Elasticsearch").get()

Let's search for our document and see whether it has been updated.
Execute the following code to search for a document:

client.prepareSearch("library").setTypes("books")
  .setQuery(QueryBuilders.termQuery("_id","1")).get()

The id that we specified while adding the document is stored as "_id".
If you omit the setQuery call, the search matches every document in the type books (by default only the first 10 hits are returned).
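To actually see what came back, the hits can be read from the search response. Here is a minimal sketch, assuming the client created earlier and a running local cluster. The helper name printBooks is made up for this example.

```scala
import org.elasticsearch.client.Client
import org.elasticsearch.index.query.QueryBuilders

// Hypothetical helper: searches library/books by _id and prints each hit's JSON source.
def printBooks(client: Client, id: String): Unit = {
  val response = client.prepareSearch("library")
    .setTypes("books")
    .setQuery(QueryBuilders.termQuery("_id", id))
    .get()

  val hits = response.getHits
  println(s"total hits: ${hits.getTotalHits}")
  hits.getHits.foreach(hit => println(s"${hit.getId}: ${hit.getSourceAsString}"))
}
```

Keep in mind that search is near real-time: a document that was indexed or updated a moment ago may not appear in the results until the index has been refreshed, which happens once per second by default.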

Finally, to delete a document, execute the following code:

client.prepareDelete("library","books","1").get()

Elasticsearch also provides a bulk API for inserting multiple documents into the Elasticsearch server in a single API call.
To use the bulk API, create a file with one JSON document per line, in the following format:

{ "title": "Java", "price": "1000", "author": {"first": "Chris", "last": "Adamson"} }
{ "title": "Scala", "price": "2000", "author": {"first": "Martin", "last": "Odersky"} }
{ "title": "C", "price": "3000", "author": {"first": "Dennis", "last": "Ritchie"} }

Now let's create a bulk request and add these documents to it.
First, open an InputStream, read the JSON file you just created, and store its lines in a list named fileData. Then add each entry to a bulk request and execute it.
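As a minimal sketch of the reading step, the snippet below uses scala.io.Source from the standard library. The file name prefix books and the helper name readJsonLines are assumptions made for illustration, not part of the original project. To keep the sketch runnable on its own, it first writes the three sample documents to a temporary file.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.io.Source

// Hypothetical helper: returns every non-blank line of the file as one JSON string.
def readJsonLines(path: Path): List[String] = {
  val source = Source.fromInputStream(Files.newInputStream(path), "UTF-8")
  try source.getLines().map(_.trim).filter(_.nonEmpty).toList
  finally source.close()
}

// Stand-in for the file created above (assumed name: books*.json in the temp directory).
val sample =
  """{"title":"Java","price":"1000","author":{"first":"Chris","last":"Adamson"}}
    |{"title":"Scala","price":"2000","author":{"first":"Martin","last":"Odersky"}}
    |{"title":"C","price":"3000","author":{"first":"Dennis","last":"Ritchie"}}
    |""".stripMargin
val booksFile = Files.createTempFile("books", ".json")
Files.write(booksFile, sample.getBytes(StandardCharsets.UTF_8))

// One JSON document per list element
val fileData: List[String] = readJsonLines(booksFile)
println(fileData.size)
```

Each element of fileData is then one JSON document, ready to be added to the bulk request.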

val bulkRequest:BulkRequestBuilder = client.prepareBulk()
fileData.foreach{
json => bulkRequest.add(client.prepareIndex("library","books").setSource(json))
}
bulkRequest.get()
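A bulk call can complete while individual items in it fail, so it is worth inspecting the response. Here is a minimal sketch, assuming the client created earlier, a running local cluster, and a list of JSON strings such as fileData. The helper name indexAll is hypothetical.

```scala
import org.elasticsearch.client.Client

// Hypothetical helper: indexes every JSON string in one bulk call and reports failures.
def indexAll(client: Client, docs: Seq[String]): Unit = {
  // A bulk request without any actions is rejected, so fail early on empty input.
  require(docs.nonEmpty, "docs must contain at least one JSON document")

  val bulkRequest = client.prepareBulk()
  docs.foreach(json => bulkRequest.add(client.prepareIndex("library", "books").setSource(json)))

  val response = bulkRequest.get()
  if (response.hasFailures)
    println(response.buildFailureMessage())
  else
    println(s"indexed ${response.getItems.length} documents")
}
```

With the names used in this post, the call would be indexAll(client, fileData).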

We are done with the CRUD operations. You can read more from the Elasticsearch docs.
Get the source code from here.

Happy Searching!!!!
