Implementing full text search with Couchbase and harnessing the power of Couchbase full text search (CBFT)


Hey folks! In this blog we'll take a first look at Couchbase full-text search.
In my recent blog, we talked about how we can use Elasticsearch for full-text search and how to connect it with Couchbase so that our data gets copied in real time and we can search on it too.
But what if we do not want to persist the data in two places (Couchbase and Elasticsearch) just to implement full-text search? Doing so increases the cost in many ways: server cost, management cost, etc.
Here CBFT (Couchbase full-text search) comes to our rescue.
A cbft process creates and maintains full-text indexes using the bleve full-text indexing engine.
Please do not use it in production yet, because it is still in its infancy, but you can certainly play around with it and explore its potential.
So, first of all:

What is cbft?

cbft, or Couchbase Full-Text server, is a distributed, clusterable data indexing server.

It includes the ability to manage full-text and other kinds of indexes for JSON documents that you’ve created and stored into a Couchbase bucket and other data sources.

The indexes that cbft manages can be automatically distributed across multiple, clustered cbft server processes on different server machines to support larger indexes, higher performance and higher availability.

Why should we use it?

If you want to keep your data in a single place, avoid duplicating it, and save yourself the overhead of managing a second copy, then it's the thing you are looking for.

Getting started

Prerequisites

We already discussed how to install Couchbase Server and how to perform CRUD operations on it. In case you missed it, you can take a look here, and for more advanced features you can take a look here.

You should also have a bucket in Couchbase Server with JSON documents that you’d like to index.

For example, during the setup steps of Couchbase Server, you can have Couchbase Server create and populate a beer-sample bucket of sample JSON documents.

Getting cbft

You can download cbft from here (Linux users). If you use some other OS, you can download it from here: navigate to addons and select the release that matches your OS.

After downloading, uncompress the archive…

tar -xzf cbft-v{X.Y.Z-AAA}.linux.amd64.tar.gz

If you are on some other OS, you can find the installation steps here.

Running cbft

Start cbft, pointing it to your Couchbase Server as your default datasource server…

./cbft.linux.amd64 -server http://localhost:8091

You can also connect to a remote server with the same command, passing the credentials and host in the URL:

./cbft.linux.amd64 -server http://Administrator:<Password>@<ServerUrl/IPAddress>:8091

Note: cbft defaults to using a directory named “data” as its data directory, which cbft will create in the current working directory if it does not exist yet. You can change the data directory path by using the -dataDir command-line parameter.
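
If you start cbft from a script, the two parameters mentioned above (-server and -dataDir) can be assembled programmatically. Here is a minimal Python sketch; the helper name cbft_command is made up for illustration, and percent-encoding the credentials assumes cbft decodes them the way a standard URL parser would.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def cbft_command(binary, server_url, data_dir=None, user=None, password=None):
    """Build the argument list for starting cbft (hypothetical helper).

    Only the -server and -dataDir parameters described in this post are used.
    """
    parts = urlsplit(server_url)
    netloc = parts.netloc
    if user is not None:
        # Percent-encode credentials so characters such as '@' or ':' inside
        # a password do not break the URL (assumes cbft decodes them).
        creds = quote(user, safe="")
        if password is not None:
            creds += ":" + quote(password, safe="")
        netloc = creds + "@" + netloc
    url = urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    args = [binary, "-server", url]
    if data_dir is not None:
        args += ["-dataDir", data_dir]
    return args


# Example (not executed here): start cbft with a custom data directory.
# import subprocess
# subprocess.run(cbft_command("./cbft.linux.amd64", "http://localhost:8091",
#                             data_dir="/tmp/cbft-data"))
```

Passing the arguments as a list (instead of one shell string) avoids quoting problems when the password contains special characters.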

The web admin UI

Next, point your web browser to cbft’s web admin UI…

http://localhost:8095

In your web browser, you should see a “Welcome to cbft” page in the web admin UI.

That welcome page will list all the indexes that you’ve defined; of course, there should be no indexes at this point.

Creating a full-text index

Now you are ready to create your own index. On the index creation form, fill in the fields described below.

Index Name

In the Index Name field, type in a name, such as “user-index”. Here we are using Users.

Index Type

As soon as you select full-text (bleve) as your index type, additional options appear that let you provide your own custom mapping. Providing a custom mapping is easy, but let's leave it for a later post.

Source Type

Choose Couchbase as the source type.

Source Name

If your source type is couchbase, the Source Name should be the name of a bucket.

For example, to index the “users” bucket from your Couchbase server, type in a Source Name of “users”.

NOTE: if your bucket has a password, you can supply the password by clicking on the Show advanced settings checkbox, which will display the Source Params JSON textarea. Then, fill in the authUser field in the JSON with the name of the Couchbase bucket and the authPassword field with the bucket’s password.
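
To make the note above concrete, here is a small Python sketch that fills the two fields into whatever JSON the Source Params textarea already contains. The helper name fill_bucket_auth is hypothetical; only the authUser and authPassword field names come from the note, and any other keys in the textarea are left untouched.

```python
import json


def fill_bucket_auth(source_params_json, bucket_name, bucket_password):
    """Return Source Params JSON with the bucket credentials filled in.

    source_params_json is the current content of the textarea (may be empty).
    """
    params = json.loads(source_params_json) if source_params_json.strip() else {}
    params["authUser"] = bucket_name          # the bucket name acts as the user
    params["authPassword"] = bucket_password  # the bucket's password
    return json.dumps(params, indent=2)


# Paste the printed JSON back into the Source Params textarea.
print(fill_bucket_auth('{"authUser": "", "authPassword": ""}', "users", "secret"))
```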

Your new index

Finally, click the Create Index button.

You should see a summary page of your new full-text index.

The Document Count field on the index summary page is a snapshot of how many documents have been indexed so far. You can click on the Refresh button next to the Document Count in order to see indexing progress.

Now you are ready to query your index!

Querying your full-text index

Next, click on the Query tab.

In the query field, type in a query term.

Hit enter/return to execute your first cbft full-text query!

You should see query results appearing below the query field.

Using the REST API

You can also use the REST API to access your index.

For example, if your index is named Users, here’s how you can use the curl tool to check how many documents are indexed…

curl http://localhost:8095/api/index/Users/count
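
The same count check can be done from code. Below is a minimal Python sketch using only the standard library. The function names are made up for illustration, and the response shape ({"status": "ok", "count": N}) is an assumption, so check what your cbft version actually returns.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def count_url(base_url, index_name):
    """Build the count endpoint URL for an index, as used by the curl call."""
    return f"{base_url.rstrip('/')}/api/index/{quote(index_name, safe='')}/count"


def parse_count(body):
    """Extract the document count from the response body.

    Assumes a JSON body with a numeric "count" field.
    """
    return int(json.loads(body)["count"])


def indexed_document_count(base_url, index_name, timeout=10):
    """Ask a running cbft server how many documents the index holds."""
    with urlopen(count_url(base_url, index_name), timeout=timeout) as resp:
        return parse_count(resp.read().decode("utf-8"))


# Example (needs a running cbft server):
# print(indexed_document_count("http://localhost:8095", "Users"))
```

Calling indexed_document_count in a loop is a simple way to watch indexing progress without clicking Refresh in the UI.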

Here’s an example of using curl to query the Users index…

curl -XPOST --header Content-Type:text/json \
     -d '{"size":10,"query":{"query":"your search string"}}' \
     http://localhost:8095/api/index/Users/query
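
For application code, the curl query above translates to a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, the request body and header mirror the curl example, and the response is assumed to carry a "hits" list whose entries have an "id" field.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_query_request(base_url, index_name, search_string, size=10):
    """Build the POST request that the curl example sends."""
    body = {"size": size, "query": {"query": search_string}}
    url = f"{base_url.rstrip('/')}/api/index/{quote(index_name, safe='')}/query"
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "text/json"},  # same header as the curl call
        method="POST",
    )


def hit_ids(response_body):
    """Return the document IDs of the hits (assumed response shape)."""
    hits = json.loads(response_body).get("hits") or []
    return [hit["id"] for hit in hits]


def search(base_url, index_name, search_string, size=10, timeout=10):
    """Run a query against a running cbft server and return matching IDs."""
    request = build_query_request(base_url, index_name, search_string, size)
    with urlopen(request, timeout=timeout) as resp:
        return hit_ids(resp.read().decode("utf-8"))


# Example (needs a running cbft server and an index named Users):
# print(search("http://localhost:8095", "Users", "your search string"))
```

The returned IDs are document keys, so you can fetch the full documents from the Couchbase bucket afterwards.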

In further blogs we will discuss how to provide a custom mapping to cbft and how to execute different kinds of queries, such as multi-field and fuzzy queries, so stay tuned! 🙂

Till then, if you want to know more about bleve, you can take a look here, and there is a great talk by Marty Schoch that you can check out here.
Once again, if you need any help, ping us!

About shiv4nsh

Coder, Gamer, Learner..!!