Apache Solr
Solr is a popular, fast, open-source enterprise search platform. It is one of the easiest ways to build sophisticated, high-performance search applications. Built on Apache Lucene, Solr gives developers advanced full-text search, scalability, easy monitoring and much more. This blog is meant to get you started with Solr and help you interact with a Solr server.
Solr is, at its heart, an enterprise search platform. But what does that mean? Let me explain with an example. Suppose you work on a sales team and your boss wants you to call software developers aged 20 to 40 to offer them a scheme. For this, you would need the following information:
– A person’s age
– A person’s work specialization
– A person’s contact number
All this information might be difficult to access, since it may be stored in different locations (different clouds, say). Enterprise search software solves this by indexing all those data locations into a single log (an index) and making that index the central place to fetch relevant data from.
Now that you have a rough idea of how Solr can be used, let’s dive a little deeper. First, you’ll need to install Solr on your system. Please refer to Install Apache Solr for a detailed procedure.
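To make the idea a bit more concrete, here is a minimal sketch in plain Java (no Solr involved). The three maps stand in for the three separate data locations, and the class name, method names and sample data are all made up for illustration. It only shows why one central index is convenient, not how Solr actually works.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only; no Solr involved. All names and data are made up.
class CentralIndexSketch {

    // One merged entry per person, combining what the separate sources know.
    static final class Person {
        final String name;
        final int age;
        final String specialization;
        final String phone;

        Person(String name, int age, String specialization, String phone) {
            this.name = name;
            this.age = age;
            this.specialization = specialization;
            this.phone = phone;
        }
    }

    // Merge the scattered sources into one list that can be searched in one place.
    static List<Person> buildIndex(Map<String, Integer> ages,
                                   Map<String, String> specializations,
                                   Map<String, String> phones) {
        List<Person> index = new ArrayList<>();
        for (String name : ages.keySet()) {
            // Skip people who are missing from one of the other sources.
            if (specializations.containsKey(name) && phones.containsKey(name)) {
                index.add(new Person(name, ages.get(name),
                        specializations.get(name), phones.get(name)));
            }
        }
        return index;
    }

    // Phone numbers of people with the given specialization aged minAge..maxAge (inclusive).
    static List<String> phonesToCall(List<Person> index, String specialization,
                                     int minAge, int maxAge) {
        List<String> result = new ArrayList<>();
        for (Person p : index) {
            if (p.specialization.equals(specialization)
                    && p.age >= minAge && p.age <= maxAge) {
                result.add(p.phone);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> ages = new LinkedHashMap<>();
        ages.put("Asha", 28);
        ages.put("Ravi", 45);
        ages.put("Meera", 35);

        Map<String, String> specializations = new LinkedHashMap<>();
        specializations.put("Asha", "software developer");
        specializations.put("Ravi", "software developer");
        specializations.put("Meera", "accountant");

        Map<String, String> phones = new LinkedHashMap<>();
        phones.put("Asha", "555-0101");
        phones.put("Ravi", "555-0102");
        phones.put("Meera", "555-0103");

        List<Person> index = buildIndex(ages, specializations, phones);
        System.out.println(phonesToCall(index, "software developer", 20, 40));
    }
}
```

With this sample data, only Asha is a software developer between 20 and 40, so the program prints [555-0101].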
Let’s get acquainted with some key terms:
– Solr server: This is the enterprise search server itself. In this blog it runs on your local machine (localhost). All the cores and collections live on this server.
– Core: The term core is used to refer to a single index and associated configuration files. A single server can have multiple cores.
– Collection: A Solr server can run stand-alone or in cloud mode. In cloud mode we work with collections, which under the hood are made up of multiple cores.
– Index: It is like the table of contents at the start of a book, which makes it easy for the reader to find the chapter they want. The whole process of adding a document to a Solr core is called indexing.
– Document: As the name suggests, it is a set of data that describes something. It is the basic unit of information in Solr. At a finer granularity, documents are composed of fields, much as a row in a DBMS table is composed of attribute values.
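The relationship between documents, fields and an index can be sketched in a few lines of plain Java. This is a toy model with made-up names (ToyCore, add, find), not how Solr or Lucene is implemented. It only shows that indexing records where each field value occurs, so that a lookup does not have to scan every document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of the terms above; this is not how Solr/Lucene works internally.
class ToyCore {
    // Documents are stored by position; each one is a map of field name -> value.
    private final List<Map<String, String>> documents = new ArrayList<>();
    // The "index": "field:value" -> positions of the documents that contain it.
    private final Map<String, Set<Integer>> index = new HashMap<>();

    // "Indexing" a document: store it and record each of its fields in the index.
    void add(Map<String, String> document) {
        int id = documents.size();
        documents.add(document);
        for (Map.Entry<String, String> field : document.entrySet()) {
            String key = field.getKey() + ":" + field.getValue().toLowerCase();
            index.computeIfAbsent(key, k -> new TreeSet<>()).add(id);
        }
    }

    // Find matching documents through the index instead of scanning all documents.
    List<Map<String, String>> find(String fieldName, String value) {
        List<Map<String, String>> matches = new ArrayList<>();
        Set<Integer> ids = index.getOrDefault(fieldName + ":" + value.toLowerCase(), Set.of());
        for (int id : ids) {
            matches.add(documents.get(id));
        }
        return matches;
    }

    public static void main(String[] args) {
        ToyCore core = new ToyCore();
        core.add(Map.of("name", "Asha", "location", "Delhi"));
        core.add(Map.of("name", "Ravi", "location", "Pune"));
        // Prints the number of documents whose location field is Delhi.
        System.out.println(core.find("location", "Delhi").size());
    }
}
```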
That is enough to get us started. Let’s move on and start the Solr server. Open a terminal, go to the directory where Solr is installed, and then move into its bin directory. To start the server, type
./solr start
This will start the Solr server on the default port 8983. To check that, open the link
http://localhost:8983/solr/
in your browser. The Solr server can also be started on a specific port with the following command
./solr start -p portnumber
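If you would rather check from code than from a browser, here is a minimal sketch that uses only the JDK's built-in HTTP client (Java 11 or later). The class and method names (SolrPing, adminUrl, isUp) are made up for this example, and it assumes the server runs on localhost. It simply requests the same URL you would open in the browser.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper: checks whether something answers on the Solr URL for a port.
class SolrPing {
    // Address of a Solr server on this machine, in the form used in this blog.
    static String adminUrl(int port) {
        return "http://localhost:" + port + "/solr/";
    }

    // True if the URL answers with a non-error HTTP status.
    static boolean isUp(int port) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(adminUrl(port)))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        try {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            return response.statusCode() < 400;
        } catch (IOException e) {
            return false; // nothing listening on that port, or the request timed out
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the port as the first argument, or fall back to the default 8983.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8983;
        System.out.println(adminUrl(port) + (isUp(port) ? " is up" : " is not reachable"));
    }
}
```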
Once the server is up, we need to create a core. Since we’re mainly interested in getting hands-on right now, let’s create a core with all the default configurations. Type
./solr create -c sample_core
to create a core named sample_core.
Now let’s talk code. First things first: to write Java code that can interact with our Solr server, we need SolrJ, Solr’s Java client. Add it as a Maven dependency (group org.apache.solr, artifact solr-solrj), or add the JAR to your classpath manually. Both are available from Maven Central.
As mentioned earlier, the Solr server starts on port 8983 by default. To tell our program which core we want to index our documents into, we need to give it a URL. Create a string variable that holds the URL, viz.
String urlString = "http://localhost:8983/solr/sample_core";
Moving on, we need a client through which all our communication with the Solr server is routed. SolrJ provides this through the SolrClient class. Create a SolrClient instance as shown below
SolrClient Solr = new HttpSolrClient.Builder(urlString).build();
Now that we have a channel to communicate with the server, let us create some data to index. We need a Solr document, which is represented by the class SolrInputDocument.
SolrInputDocument doc = new SolrInputDocument();
Let’s add fields to the document using the addField method, viz.
doc.addField("field-name", value);
Once you’re done adding the fields to the document, it’s time we pass this document to our SolrClient and commit the changes.
Solr.add(doc);
Solr.commit();
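As an aside, SolrJ talks to the server over HTTP, so roughly the same add-and-commit can be sketched with nothing but the JDK (Java 11 or later). The snippet below is not the SolrJ code from this blog. It posts a JSON document to the core's /update endpoint with commit=true. The class name, the helper methods and the sample field values are made up for illustration, and it assumes the default schema, where id is the unique key.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Sketch only: indexes one flat document over plain HTTP instead of through SolrJ.
class HttpIndexSketch {
    // Serialize a flat document of string fields as a JSON array holding one object.
    static String toJson(Map<String, String> doc) {
        StringJoiner fields = new StringJoiner(",", "[{", "}]");
        for (Map.Entry<String, String> e : doc.entrySet()) {
            fields.add(quote(e.getKey()) + ":" + quote(e.getValue()));
        }
        return fields.toString();
    }

    // Minimal JSON string escaping (backslash and double quote only); enough for this demo.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // commit=true asks Solr to commit right after the add.
    static HttpRequest updateRequest(String coreUrl, Map<String, String> doc) {
        return HttpRequest.newBuilder(URI.create(coreUrl + "/update?commit=true"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJson(doc)))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "1");           // hypothetical sample values
        doc.put("name", "Asha");
        doc.put("location", "Delhi");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient().send(
                    updateRequest("http://localhost:8983/solr/sample_core", doc),
                    HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        } catch (IOException e) {
            System.out.println("Could not reach Solr: " + e.getMessage());
        }
    }
}
```

For real applications SolrJ is the more convenient route. This sketch is only meant to show what goes over the wire.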
Now when you run the program and then check the Solr admin UI, you’ll find the document in the core. But why not write a method to view it on our console?
To query data from the Solr server we need a query object of type SolrQuery. In it, we specify what we want to search for. Say you want to find people from Delhi (provided there is a field named “location”). We do that with the query’s setQuery() method. We also need to specify which fields we want back from the documents that match the query. We do that with the addField() method. For example, to get the names of those Delhi residents:
SolrQuery query = new SolrQuery();
query.setQuery("location:Delhi"); // To match all documents, pass "*:*" instead.
query.addField("name"); // To return all fields, pass "*" instead.
We can run the above query, capture the result in a QueryResponse, and get the documents returned by the search as a SolrDocumentList, viz.
QueryResponse response = Solr.query(query);
SolrDocumentList documentList = response.getResults();
The matching documents are now in the documentList variable.
Which brings us to the end of this blog. Feel free to try out the code above. The intent here was to get you acquainted with Solr. The blogs that follow will go deeper into Solr, so stay tuned.
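It can help to see what such a query looks like as a plain HTTP request. In SolrJ, setQuery() sets the q parameter and addField() adds to the fl (field list) parameter, and both are sent to the core's /select handler. The sketch below builds that URL with the JDK alone (Java 11 or later). The class name SelectUrlSketch and the method selectUrl are made up for this example, and the core URL is the one assumed throughout this blog.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Sketch only: builds the /select URL that corresponds to a query and a field list.
class SelectUrlSketch {
    // q carries the query itself; fl is the comma-separated list of fields to return.
    static String selectUrl(String coreUrl, String query, String... fields) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        String fl = URLEncoder.encode(String.join(",", fields), StandardCharsets.UTF_8);
        return coreUrl + "/select?q=" + q + "&fl=" + fl;
    }

    public static void main(String[] args) throws InterruptedException {
        String url = selectUrl("http://localhost:8983/solr/sample_core",
                "location:Delhi", "name");
        System.out.println(url);
        try {
            HttpResponse<String> response = HttpClient.newHttpClient().send(
                    HttpRequest.newBuilder(URI.create(url)).GET().build(),
                    HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body()); // raw response from Solr
        } catch (IOException e) {
            System.out.println("Could not reach Solr: " + e.getMessage());
        }
    }
}
```

Opening the printed URL in a browser while the server is running is a quick way to compare the raw response with what ends up in documentList.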