Autocomplete using Elasticsearch


You must have seen this in a movie data store like IMDb: whenever a user types ‘g’, the search bar suggests Gone Girl along with all the other movies that have ‘g’ in their titles.

This is autocomplete, or word completion, and it has become an essential part of any application.

Autocomplete speeds up human-computer interaction by predicting the word from just a few typed characters.

In this blog I’ll be discussing result-suggest autocomplete using Elasticsearch, which means that the predictions are based on the existing data in the data store.

There is another type, search-suggest autocomplete, which works on previously searched phrases, but we won’t be discussing it in this blog.

Analyzers

Whenever we insert data into Elasticsearch, it analyzes the data so that an appropriate inverted index can be created.
An analyzer consists of a tokenizer and one or more token filters that transform the data so that the business needs are met.

For this post we are using an nGram token filter inside a custom analyzer.

An n-gram is a contiguous sequence of n items from a given sequence of text. This means that we are breaking the indexed text into overlapping character sequences.
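
For example, the grams generated for the word ‘gone’ (with min_gram 1 and max_gram 4) are:

g, o, n, e
go, on, ne
gon, one
gone

A partial search input such as ‘go’ now matches one of these grams directly, which is exactly what autocomplete needs.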

 


Mapping and Settings

{
  "settings": {
    "analysis": {
      "filter": {
        "gramFilter": {
          "type": "nGram",
          "min_gram": 1,
          "max_gram": 20,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      },
      "analyzer": {
        "gramAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "gramFilter"
          ]
        },
        "whitespaceAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "movies": {
      "properties": {
        "Title": {
          "type": "string",
          "analyzer": "gramAnalyzer",
          "search_analyzer": "whitespaceAnalyzer"
        },
        ...
      }
    }
  }
}

Notice that we have defined a gramFilter of type nGram; min_gram and max_gram are the minimum and maximum lengths of the grams you want in the tokens. (A token_chars option also exists for restricting which characters grams are built from, but it applies to the nGram tokenizer, not to the token filter, so it has no place here.)

We have also used two analyzers in the mapping:

  • gramAnalyzer
  • whitespaceAnalyzer

Now, the question that must be striking you is: why do we need two analyzers?

Simply because we want to analyze the stored data and the search query differently:

  • The search text is lowercased and split on whitespace (whitespaceAnalyzer).
  • The stored data is lowercased, split on whitespace, and then the gramFilter is applied to it (gramAnalyzer), as the example below shows.
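
You can see the difference with the _analyze API once the index exists. A minimal sketch, assuming the index is named imdb and a recent Elasticsearch version that accepts the analyzer and text in the request body:

curl -XPOST "localhost:9200/imdb/_analyze" -H "Content-Type: application/json" -d '{
  "analyzer": "whitespaceAnalyzer",
  "text": "Gone Girl"
}'

This returns just the two tokens gone and girl, whereas the same call with "analyzer": "gramAnalyzer" returns every substring of each word: g, go, gon, gone, o, on, one, n, ne, e, and likewise for girl.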

Once our analyzers are ready, we need to apply them to the field that we want to generate suggestions for (in our example, the Title field).
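
Creating the index with these settings is then a single request. A sketch, assuming the settings and mappings above are saved in mapping.json, the index is named imdb, and an Elasticsearch version that still supports mapping types (as the movies type in the mapping implies):

curl -XPUT "localhost:9200/imdb" -H "Content-Type: application/json" -d @mapping.json

curl -XPUT "localhost:9200/imdb/movies/1" -H "Content-Type: application/json" -d '{
  "Title": "Gone Girl"
}'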

Searching

We can execute a match query on the Title field to use the autocomplete functionality.

The query looks like this:

{
  "query": {
    "match": {
      "Title": "go"
    }
  }
}

This query will return all the movies in the Elasticsearch index whose Title contains ‘go’: the search analyzer turns the input into the single lowercase token go, which matches the go gram stored for every such title.
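
Run against the index created above (the imdb index name is again an assumption), the request looks like this:

curl -XPOST "localhost:9200/imdb/_search" -H "Content-Type: application/json" -d '{
  "query": {
    "match": {
      "Title": "go"
    }
  }
}'

As the user types more characters, the query token grows (go, gon, gone, ...) and matches progressively longer grams, so the suggestion list narrows automatically.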

An Activator template implementing this feature can be found here.


