Solr Relevance Search Using SolrJ In Scala


In this post we will see how to perform a relevance search in Solr from Scala, using the SolrJ HTTP client.

First, a brief look at what relevance search involves:

A developer working on search relevancy focuses on the following areas as the “first line of defense”:

  • Text Analysis: the act of “normalizing” text from both a search query and a search result to allow fuzzy matching. For example, one step known as stemming can reduce the forms “shopped”, “shopping”, and “shopper” to a common form, “shop”, so that all of them match.
  • Query Time Weights and Boosts: reweighting the importance of various fields based on search requirements. For example, deciding that a title field is more important than other fields.
  • Phrase/Position Matching: requiring or boosting on the appearance of the entire query, or parts of it, as a phrase or based on the position of the words.
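
To make the stemming idea concrete, here is a toy suffix-stripping normalizer in plain Scala. It only illustrates the principle and is not the algorithm Solr uses; in Solr, stemming is performed by the analyzer configured for the field type. The name `ToyStemmer` and its rules are invented for this sketch.

```scala
// Toy illustration only: real stemmers handle far more cases than this.
object ToyStemmer {
  private val suffixes = List("ing", "ed", "er")
  private val vowels = Set('a', 'e', 'i', 'o', 'u')

  // True when the word ends in a doubled consonant, e.g. "shopp".
  private def endsWithDoubledConsonant(w: String): Boolean =
    w.length >= 2 && w.last == w(w.length - 2) && !vowels(w.last)

  def stem(word: String): String = {
    val lower = word.toLowerCase
    suffixes.find(s => lower.endsWith(s) && lower.length - s.length >= 3) match {
      case Some(s) =>
        val base = lower.dropRight(s.length)
        // Collapse the doubled consonant left behind: "shopp" becomes "shop".
        if (endsWithDoubledConsonant(base)) base.dropRight(1) else base
      case None => lower
    }
  }
}
```

With these rules, “shopped”, “shopping”, and “shopper” all reduce to “shop”, so a query for any one form would match text containing the others.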

Let's take an example (assuming you have already created a Solr core).

Problem:

Say you have a table/core in Solr that holds the following fields:

userID, HashTag, Text, LaunchedTime

and you have to find the records in which one or more particular hashtags are used.

Solution:

Step 1) The dependencies:

I am using SolrJ version 6.2.1. As an sbt dependency:

"org.apache.solr" % "solr-solrj" % "6.2.1"

The artifact is available from Maven Central (SolrJ 6.2.1).

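Step 2) You need a Solr connection:

import org.apache.solr.client.solrj.impl.HttpSolrClient

val solrConn: HttpSolrClient = {
  // Core URL; in this setup the core is named <keyspace>.<table>
  val urlString = s"http://$solrHostname:$solrPort/solr/$solrKeyspace.$solrTable"
  new HttpSolrClient.Builder(urlString).build()
}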

Here:

$solrHostname : the hostname or IP address of the machine where Solr is running

$solrPort     : the port Solr is listening on (8983 by default)

$solrKeyspace : the Cassandra keyspace name

$solrTable    : the Cassandra table name

This gives you an HttpSolrClient, which you can use to query your Solr engine.
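
The client holds HTTP resources, so it is worth closing it when you are done. A minimal sketch of one way to do that is shown here; the names `SolrConnection`, `coreUrl`, and `withClient` are hypothetical helpers for illustration, not part of SolrJ.

```scala
import org.apache.solr.client.solrj.impl.HttpSolrClient

object SolrConnection {
  // Builds the core URL in the same keyspace.table form used above.
  def coreUrl(host: String, port: Int, keyspace: String, table: String): String =
    s"http://$host:$port/solr/$keyspace.$table"

  // Runs `work` with a fresh client and always closes the client afterwards.
  def withClient[A](url: String)(work: HttpSolrClient => A): A = {
    val client = new HttpSolrClient.Builder(url).build()
    try work(client)
    finally client.close()
  }
}
```

Creating a client per call keeps the example short; a long-running application would more likely create one client at startup and close it at shutdown.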

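Step 3) Create the Solr query:

To create the Solr query we use the SolrQuery class:

import org.apache.solr.client.solrj.SolrQuery

def createSolrQuery(start: Int, rows: Int): SolrQuery = {
  val solrQuery = new SolrQuery
  solrQuery.set("q", "hashtag : (#modi OR #blackMoney)") // main query
  solrQuery.set("sort", "score desc, LaunchedTime desc") // relevance first, then recency
  solrQuery.set("df", s"$HASHTAG")                       // HASHTAG is assumed to be a constant holding the hashtag field name
  solrQuery.set("start", s"$start")                      // offset of the first row to return
  solrQuery.set("rows", s"$rows")                        // number of rows to return
  solrQuery.set("fl", "Text")                            // return only the Text field
  solrQuery
}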

Let's walk through the function:

3.1) solrQuery.set("q", "hashtag : (#modi OR #blackMoney)")

This searches for documents whose hashtag is ‘modi’ or ‘blackMoney’. The ‘q’ parameter holds the main Solr query.
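
In practice the hashtags usually come from user input rather than being hard-coded. A small sketch of building the same kind of query string from a list of tags follows; `HashtagQuery` is a hypothetical helper. It quotes each tag so that characters with a meaning in the query syntax are treated as plain text. Whether the leading # is kept at match time depends on how the field is analyzed, which this sketch does not address.

```scala
object HashtagQuery {
  // Wrap a term in double quotes, escaping backslashes and quotes inside it,
  // so characters such as ':' or '(' are not read as query syntax.
  private def quote(term: String): String =
    "\"" + term.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Builds field:("a" OR "b" OR ...) for the given tags.
  def build(field: String, tags: Seq[String]): String = {
    require(tags.nonEmpty, "at least one hashtag is required")
    val body = tags.map(quote).mkString(" OR ")
    s"$field:($body)"
  }
}
```

For example, HashtagQuery.build("hashtag", Seq("#modi", "#blackMoney")) yields hashtag:("#modi" OR "#blackMoney"), which can be passed as the value of ‘q’.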

3.2) solrQuery.set("sort", "score desc, LaunchedTime desc")

‘sort’ : results are sorted by relevancy score, from highest to lowest. Documents with equal scores are then ordered by LaunchedTime, most recent first.
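
The ordering that this sort clause asks Solr for can be mimicked on an in-memory list, which may help when reasoning about tie-breaking. This is only an illustration of the ordering rule; the `Hit` class and its numeric `launchedTime` are assumptions made for the sketch, and Solr performs the real sort on the server.

```scala
// Hypothetical in-memory stand-in for a search hit.
final case class Hit(text: String, score: Double, launchedTime: Long)

object Ranking {
  // Highest score first; on equal scores, the larger (later) launchedTime first.
  def rank(hits: Seq[Hit]): Seq[Hit] =
    hits.sortWith { (a, b) =>
      if (a.score != b.score) a.score > b.score
      else a.launchedTime > b.launchedTime
    }
}
```

Given hits with scores 1.0, 2.0, and 1.0, the hit scoring 2.0 comes first and the two hits scoring 1.0 follow, the later one ahead of the earlier one.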

3.3) solrQuery.set("df", s"$HASHTAG")

‘df’ : this parameter sets the default search field.

We are searching by hashtag, so df points matching and score calculation at the hashtag field: the more often the search terms occur in that field, the higher the score and the more relevant the document. df applies only to query terms that do not name a field explicitly.

Note: df only takes effect if qf (the query-fields parameter of the DisMax and eDisMax parsers) is not defined.
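
For comparison, here is a sketch of a query that uses qf instead of df, which is also how query-time field boosts are usually expressed. It assumes the eDisMax parser is available on the core; the field names follow this post, the boost value 2 is arbitrary, and `BoostedQuery` is a hypothetical helper.

```scala
import org.apache.solr.client.solrj.SolrQuery

object BoostedQuery {
  // Free-text query over two fields, with hashtag matches weighted twice as much.
  def build(userInput: String): SolrQuery = {
    val query = new SolrQuery(userInput)
    query.set("defType", "edismax")   // switch to the eDisMax query parser
    query.set("qf", "hashtag^2 Text") // fields to search, with per-field boosts
    query
  }
}
```

Because qf is set here, a df value on the same query would no longer decide which fields are searched.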

3.4) solrQuery.set("start", s"$start")

‘start’ : the offset of the first row to return, i.e. how many matching rows to skip (useful for paging).

3.5) solrQuery.set("rows", s"$rows")

‘rows’ : the number of rows to be returned.
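
Together, ‘start’ and ‘rows’ give simple paging. A small sketch of turning a page number into the two values follows; `Paging` is a hypothetical helper and pages are assumed to be zero-based.

```scala
object Paging {
  // Converts a zero-based page number into the (start, rows) pair Solr expects.
  def window(page: Int, pageSize: Int): (Int, Int) = {
    require(page >= 0, "page must not be negative")
    require(pageSize > 0, "pageSize must be positive")
    (page * pageSize, pageSize)
  }
}
```

For the third page of ten results, Paging.window(2, 10) gives start 20 and rows 10, which can be passed straight to createSolrQuery.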

3.6) solrQuery.set("fl", "Text")

‘fl’ : the field list, i.e. the fields to be returned (here we fetch only the Text field).

Step 4) Fetch the result from Solr using the Solr query:

import org.apache.solr.common.SolrDocument
import scala.collection.JavaConverters._

val solrQuery = createSolrQuery(0, 10) // the query built by the function from Step 3
val solrConnection = solrConn          // the HttpSolrClient created in Step 2
val res: List[SolrDocument] = solrConnection.query(solrQuery).getResults.asScala.toList
// getFieldValues returns a Java collection (null if the field is absent), so flatten it into strings
val textDetails: List[String] = res.flatMap { s =>
  Option(s.getFieldValues("Text")).toList.flatMap(_.asScala.map(_.toString))
}
The query method executes the Solr query, and getResults returns the matching documents.
We convert the result to a Scala list and iterate over it to extract the Text field from each document.
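
The extraction step can be pulled into a small function that is easy to test without a running Solr instance. This is a sketch under the assumption that every value of the field can be rendered with toString; `TextExtractor` is a hypothetical name.

```scala
import org.apache.solr.common.SolrDocument
import scala.collection.JavaConverters._

object TextExtractor {
  // Collects every value of `field` as a String, skipping documents without it.
  def texts(docs: Seq[SolrDocument], field: String): List[String] =
    docs.toList.flatMap { doc =>
      Option(doc.getFieldValues(field)) // null when the field is absent
        .map(_.asScala.toList)
        .getOrElse(Nil)
        .map(_.toString)
    }
}
```

It can be applied to the documents returned by getResults, for example TextExtractor.texts(res, "Text").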
Finally, here is the Solr query that was generated, along with its result:
http://Ip-Address:8983/solr/Test_Keyspace.Test_Table/select?q=hashtag+%3A(modi+OR+blackMoney)&sort=score+desc+%2C+launchedtime+desc&start=0&rows=10&fl=Text&df=hashtag&wt=json&indent=true
Output:

{
  "responseHeader": {
    "status": 0,
    "QTime": 9
  },
  "response": {
    "numFound": 1,
    "start": 0,
    "docs": [
      {
        "Text": "#modi,#blackMoney modi rocks!!!!!!!!!"
      }
    ]
  }
}
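
SolrJ assembles the request URL for you, but it can be useful to see how the parameters map onto it, for example when pasting a query into a browser. The sketch below URL-encodes a list of parameters in plain Scala; `SelectUrl` is a hypothetical helper and the host in the usage note is a placeholder.

```scala
import java.net.URLEncoder

object SelectUrl {
  private def enc(s: String): String = URLEncoder.encode(s, "UTF-8")

  // Joins the parameters into the query string of a /select request.
  def build(coreUrl: String, params: Seq[(String, String)]): String = {
    val queryString = params
      .map { case (key, value) => enc(key) + "=" + enc(value) }
      .mkString("&")
    s"$coreUrl/select?$queryString"
  }
}
```

For instance, the pair "q" -> "hashtag:(modi OR blackMoney)" is encoded as q=hashtag%3A%28modi+OR+blackMoney%29 when built against a core URL such as http://localhost:8983/solr/Test_Keyspace.Test_Table.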

To learn about other Solr query parameters, see:

https://wiki.apache.org/solr/CommonQueryParameters

