In this blog post we will see how to perform a relevance (or relevant) search in Solr using the SolrJ HTTP API in Scala.
First, a brief look at what relevance search involves:
A developer working on search relevancy focuses on the following areas as the “first line of defense”:
- Text Analysis: the act of “normalizing” text from both a search query and a search result to allow fuzzy matching. For example, one step known as stemming can turn many forms of the same word (“shopped”, “shopping”, and “shopper”) into a common form, “shop”, so that all forms match.
- Query Time Weights and Boosts: reweighting the importance of various fields based on search requirements. For example, deciding that a title field is more important than other fields.
- Phrase/Position Matching: requiring or boosting on the appearance of the entire query or parts of a query as a phrase, or based on the position of the words.
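As a rough illustration of the second and third points, query-time boosts and phrase boosts can be expressed through the parameters of Solr's eDisMax query parser. This is only a sketch: the field names (HashTag, Text) and the boost values are assumptions picked for illustration, not settings taken from a real schema.

```scala
import org.apache.solr.client.solrj.SolrQuery

object BoostSketch {
  // Builds a query that weights matches in HashTag above matches in Text.
  // Field names and boost factors are hypothetical.
  def boostedQuery(userInput: String): SolrQuery = {
    val q = new SolrQuery
    q.setQuery(userInput)
    q.set("defType", "edismax")   // use the eDisMax query parser
    q.set("qf", "HashTag^3 Text") // query fields: HashTag matches are boosted by 3
    q.set("pf", "Text^2")         // phrase fields: boost documents where the query appears as a phrase in Text
    q
  }
}
```

The rest of this post uses the standard query parser with an explicit field instead, so none of these parameters are required to follow along.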
So let's take an example (assuming you have already created a Solr core).
Problem:
Say you have a table/core in Solr that holds the following fields:
userID, HashTag, Text, LaunchedTime
and you have to find the entries in which one or more particular hashtags are used.
Solution:
Step 1) The dependencies:
I am using SolrJ version 6.2.1. In sbt, the dependency is:
libraryDependencies += "org.apache.solr" % "solr-solrj" % "6.2.1"
The artifact (SolrJ 6.2.1) is available from Maven Central.
Step 2) You need a Solr connection:
import org.apache.solr.client.solrj.impl.HttpSolrClient

val solrConn: HttpSolrClient = {
  // URL of the core to query
  val urlString = s"http://$solrHostname:$solrPort/solr/$solrKeyspace.$solrTable"
  new HttpSolrClient.Builder(urlString).build()
}
Here:
$solrHostname : the hostname or IP address of the machine where Solr is running
$solrPort : the port Solr is listening on (8983 by default)
$solrKeyspace : the Cassandra keyspace name
$solrTable : the Cassandra table name
This gives you an HttpSolrClient, which you can use to query your Solr engine.
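Since the client holds HTTP resources, it is usually worth closing it when you are done. A minimal sketch of one way to do that follows; the helper names coreUrl and withClient are hypothetical and not part of SolrJ.

```scala
import org.apache.solr.client.solrj.impl.HttpSolrClient

object SolrConnectionSketch {
  // Builds the core URL in the same keyspace.table form used above.
  def coreUrl(host: String, port: Int, keyspace: String, table: String): String =
    s"http://$host:$port/solr/$keyspace.$table"

  // Runs `f` with a fresh client and always closes the client afterwards.
  def withClient[A](url: String)(f: HttpSolrClient => A): A = {
    val client = new HttpSolrClient.Builder(url).build()
    try f(client) finally client.close()
  }
}
```

A call such as SolrConnectionSketch.withClient(url) { client => client.query(someQuery) } then releases the client even if the query throws. For a long-running application you would more likely keep one shared client open and close it at shutdown.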
Step 3) Create the Solr query:
To create the Solr query, we use the SolrQuery class:
import org.apache.solr.client.solrj.SolrQuery

def createSolrQuery(start: Int, rows: Int): SolrQuery = {
  val solrQuery = new SolrQuery
  solrQuery.set("q", "hashtag : (#modi OR #blackMoney)")
  solrQuery.set("sort", "score desc , LaunchedTime desc")
  solrQuery.set("df", s"$HASHTAG") // HASHTAG: constant with the hashtag field name, defined elsewhere
  solrQuery.set("start", s"$start")
  solrQuery.set("rows", s"$rows")
  solrQuery.set("fl", "Text")
  solrQuery
}
Let's walk through the function. Keep in mind that Solr field names are case-sensitive, so use them exactly as they are defined in your schema.
3.1) solrQuery.set("q", "hashtag : (#modi OR #blackMoney)")
'q' holds the main Solr query. Here it searches for documents whose hashtag field contains '#modi' or '#blackMoney'.
3.2) solrQuery.set("sort", "score desc , LaunchedTime desc")
'sort': results are sorted by relevancy score, from highest to lowest. Documents with equal scores are then ordered by LaunchedTime, latest first.
3.3) solrQuery.set("df", s"$HASHTAG")
'df': this parameter sets the default search field, that is, the field searched for any term in 'q' that does not name a field explicitly. HASHTAG is assumed to be a constant, defined elsewhere, that holds the name of the hashtag field.
We are searching by hashtag, so matching and score calculation happen on the hashtag field. (The more often the search terms occur in the field, the higher the score and the more relevant the document is considered.)
Note: df is only a default. It is not used for terms that name their field explicitly, and with the DisMax/eDisMax query parsers it only takes effect if qf is not defined.
3.4) solrQuery.set("start", s"$start")
'start': the offset of the first document to return from the full result set (0 for the first page). Together with 'rows' it is used for pagination.
3.5) solrQuery.set("rows", s"$rows")
'rows': the maximum number of documents to return.
3.6) solrQuery.set("fl", "Text")
'fl': the field list, meaning the fields to return for each document. (Here we fetch only the Text field.)
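The same parameters can also be set through SolrQuery's typed setters rather than raw parameter names. The sketch below is one possible variant, not the function used in this post: it takes the hashtags as an argument (a hypothetical signature) and escapes them with ClientUtils.escapeQueryChars, which matters when the values come from user input. The field names are copied from the example above and are assumed to match your schema.

```scala
import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.util.ClientUtils

object TypedQuerySketch {
  def hashtagQuery(hashtags: Seq[String], start: Int, rows: Int): SolrQuery = {
    require(hashtags.nonEmpty, "at least one hashtag is required")
    // Escape query-syntax characters in caller-supplied values
    val terms = hashtags.map(h => ClientUtils.escapeQueryChars(h)).mkString(" OR ")
    val q = new SolrQuery
    q.setQuery(s"hashtag:($terms)")                 // same as set("q", ...)
    q.addSort("score", SolrQuery.ORDER.desc)        // most relevant first
    q.addSort("LaunchedTime", SolrQuery.ORDER.desc) // tie-breaker
    q.setStart(start)                               // same as set("start", ...)
    q.setRows(rows)                                 // same as set("rows", ...)
    q.setFields("Text")                             // same as set("fl", ...)
    q
  }
}
```

For example, TypedQuerySketch.hashtagQuery(Seq("#modi", "#blackMoney"), 0, 10) asks for the first ten matches.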
Step 4) Fetch the result from Solr using the Solr query:
import scala.collection.JavaConverters._
import org.apache.solr.common.SolrDocument

val solrQuery = createSolrQuery(0, 10) // the query built by the function from Step 3
val solrConnection = solrConn // the HttpSolrClient created in Step 2
val res: List[SolrDocument] = solrConnection.query(solrQuery).getResults.asScala.toList
// getFieldValues returns null when a document has no such field, hence the Option
val textDetails: List[String] = res.flatMap { doc =>
  Option(doc.getFieldValues("Text")).toList.flatMap(_.asScala.map(_.toString))
}
So we call query on the client to execute the Solr query, and getResults to obtain the matching documents.
We then convert the result to a Scala list and use flatMap to collect the values of the Text field from each document.
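The call above fetches a single page. If you need every match, 'start' and 'rows' can be combined into a simple paging loop. The following is a sketch under stated assumptions: the object and function names are hypothetical, the Text field name comes from the example, and the result set is assumed small enough for offset-based paging to be acceptable.

```scala
import org.apache.solr.client.solrj.{SolrClient, SolrQuery}
import scala.collection.JavaConverters._

object PagingSketch {
  // Offsets of each page: 0, rows, 2 * rows, ... while below numFound.
  def pageStarts(numFound: Long, rows: Int): List[Int] = {
    require(rows > 0, "rows must be positive")
    (0L until numFound by rows.toLong).map(_.toInt).toList
  }

  // Fetches the Text values of all matches, one page at a time.
  def fetchAllTexts(client: SolrClient, baseQuery: SolrQuery, rows: Int): List[String] = {
    val countQuery = baseQuery.getCopy
    countQuery.setRows(0) // ask only for the hit count, no documents
    val numFound = client.query(countQuery).getResults.getNumFound
    pageStarts(numFound, rows).flatMap { start =>
      val page = baseQuery.getCopy
      page.setStart(start)
      page.setRows(rows)
      client.query(page).getResults.asScala.toList.flatMap { doc =>
        Option(doc.getFieldValues("Text")).toList.flatMap(_.asScala.map(_.toString))
      }
    }
  }
}
```

With 25 matches and 10 rows per page, pageStarts yields the offsets 0, 10 and 20, so three requests are made after the initial count.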
Finally, here is the Solr query that was created, and its result:
http://Ip-Address:8983/solr/Test_Keyspace.Test_Table/select?q=hashtag+%3A(modi+OR+balckMoney)&sort=score+desc+%2C+launchedtime+desc&start=0&rows=10&fl=hashtag&df=text&wt=json&indent=true
Output: [screenshot of the query response; image not available]
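If you want to inspect the request parameters your own SolrQuery produces, you can render them as a URL-encoded query string. A minimal sketch, assuming SolrJ 6.x where SolrParams.toQueryString is available; the object name is hypothetical.

```scala
import org.apache.solr.client.solrj.SolrQuery

object QueryStringSketch {
  // Returns the parameters as a URL-encoded string starting with "?"
  def render(query: SolrQuery): String = query.toQueryString()
}
```

Appending the rendered string to the core's /select URL gives a request you can paste into a browser for debugging.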
To check out the other Solr query parameters, see:
https://wiki.apache.org/solr/CommonQueryParameters