Apache Solr with Java: Result Grouping with Solrj


This post is a detailed, step-by-step guide to grouping results by a field in Apache Solr using SolrJ.

Note: Grouping is different from faceting in Apache Solr. Grouping returns the matching documents grouped by the specified field, whereas faceting returns only the count of documents for each distinct value of that field. You can, however, combine grouping and faceting in Solr. This post covers grouping without faceting, implemented through SolrJ (version 6.4.1).
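To make the difference concrete, here is a small in-memory analogy in plain Java, with no Solr involved. The Product class, the sample ids and the manufacturer names are invented for illustration. Solr computes both results on the server against its index, so treat this only as a picture of the two result shapes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class GroupingVsFaceting {

    // Hypothetical stand-in for an indexed document with two fields.
    static final class Product {
        private final String id;
        private final String manu;

        Product(String id, String manu) {
            this.id = id;
            this.manu = manu;
        }

        String getId() { return id; }
        String getManu() { return manu; }
    }

    // "Grouping": each field value maps to the documents (here, their ids) that share it.
    static Map<String, List<String>> group(List<Product> docs) {
        return docs.stream().collect(Collectors.groupingBy(
                Product::getManu,
                LinkedHashMap::new,
                Collectors.mapping(Product::getId, Collectors.toList())));
    }

    // "Faceting": each field value maps only to the number of documents that have it.
    static Map<String, Long> facet(List<Product> docs) {
        return docs.stream().collect(Collectors.groupingBy(
                Product::getManu,
                LinkedHashMap::new,
                Collectors.counting()));
    }

    // Made-up sample data.
    static List<Product> sampleDocs() {
        return Arrays.asList(
                new Product("doc1", "Acme"),
                new Product("doc2", "Globex"),
                new Product("doc3", "Acme"));
    }

    public static void main(String[] args) {
        System.out.println(group(sampleDocs())); // {Acme=[doc1, doc3], Globex=[doc2]}
        System.out.println(facet(sampleDocs())); // {Acme=2, Globex=1}
    }
}
```

The grouped result keeps the documents themselves, while the faceted result keeps only a number per value.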

Without much ado, let’s get to it.

First, you need a running Solr instance with correctly indexed data.

Note: A Solr core is basically an index of the text and fields found in documents. A single Solr instance can contain multiple “cores”, which are separate from each other based on local criteria. If you don’t have a running Solr instance with a core set up, Apache Solr ships with a number of examples to help you learn about its key features. You can launch an example using the -e flag.

Set up your local Solr by following the directions below:

(i) Download Solr for your operating system from Apache Solr – Downloads

(ii) Go to the Solr directory and start Solr with the demo data provided by Apache Solr by running the following command:

bin/solr -e techproducts

If you don’t want to set up the demo core and just need a running instance for your own indexer, start Solr by running the following command in the Solr directory:

bin/solr start

However, for demonstration purposes, we will be using the techproducts core provided by Apache Solr.

(iii) Open the following URL in your browser:

http://localhost:8983/solr/#/techproducts

You can select your core from the core selector in the left pane.

Second, you need to set up an instance of SolrClient in your Java program.

If you are not familiar with how to set up an instance of SolrClient, worry not! Go ahead and take a look at this blog on Solr with Java: A basic hands-on with SolrJ.

Now let us first understand the query structure in Solr for getting a grouped response.

Use the following query parameters to get a grouped response from Solr in the browser:

http://localhost:8983/solr/techproducts/select?wt=json&indent=true&fl=id,name&q=solr+memory&group=true&group.field=manu_exact

The above query is a simple example of getting a grouped response. Apart from the usual query parameters, it uses just two grouping parameters, group and group.field. These parameters, along with some other useful ones, are described in the table below.

Parameter | Type | Description
group | Boolean | If true, query results will be grouped.
rows | Integer | The number of groups to return. The default value is 10.
group.ngroups | Boolean | If true, Solr includes the number of groups that have matched the query in the results. The default value is false.
group.limit | Integer | Specifies the number of results to return for each group. The default value is 1. To get all the results in each group, set this value to -1.
group.field | String | The name of the field by which to group results. The field must be single-valued, and either be indexed or a field type that has a value source and works in a function query, such as ExternalFileField. It must also be a string-based field, such as StrField or TextField.

To learn more about all the request parameters that can be used to customize the grouped response from Solr, see the Apache Solr Reference Guide for Result Grouping.
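As a rough sketch of how these parameters end up in a request, the snippet below assembles a select URL using only the JDK. The class and method names (GroupingUrlBuilder, groupingParams, buildUrl) are hypothetical helpers written for this post, not part of Solr or SolrJ, and the host and core name assume the local techproducts setup described earlier.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class GroupingUrlBuilder {

    // URL-encodes one key or value; UTF-8 is always available on the JVM.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Collects the query plus the grouping parameters, in a stable order.
    static Map<String, String> groupingParams(String q, String field) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);
        params.put("group", "true");          // turn grouping on
        params.put("group.field", field);     // single-valued field to group by
        params.put("group.ngroups", "true");  // ask for the number of groups
        params.put("group.limit", "-1");      // all documents in each group
        return params;
    }

    // Joins the parameters into "selectUrl?key=value&key=value".
    static String buildUrl(String selectUrl, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> p : params.entrySet()) {
            query.add(enc(p.getKey()) + "=" + enc(p.getValue()));
        }
        return selectUrl + "?" + query;
    }

    public static void main(String[] args) {
        // Assumed local instance and demo core.
        String url = buildUrl("http://localhost:8983/solr/techproducts/select",
                groupingParams("solr memory", "manu_exact"));
        System.out.println(url);
    }
}
```

Pasting the printed URL into the browser should return a grouped response, provided the instance and core exist at that address.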

Note: The field used for grouping must not be multi-valued in the schema. As of Solr version 6.4.2, grouping cannot be done on multi-valued fields.

Let us now set up the query parameters in our Java code.

In the code snippet below, we are setting up a simple SolrQuery with some basic request parameters required for querying for a grouped response from Solr.

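import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.common.params.CommonParams;

SolrQuery query = new SolrQuery();
query.add(CommonParams.Q, "*:*");        // match all documents
query.set("group", "true");              // enable result grouping
query.set("group.field", "manu_exact");  // replace "manu_exact" with your field name here
query.set("group.ngroups", "true");      // include the number of groups in the response
query.set("group.limit", "-1");          // return every document in each group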

Lastly, let us retrieve and traverse through the grouped response.

The grouped response is returned as the value for the key grouped in the raw Solr response. To retrieve it using SolrJ, run the query prepared above against the core (here, we are using the demo core techproducts) and get the grouped response as shown below:

QueryResponse response = null;
try {
    response = solrClient.query("techproducts", query); // replace "techproducts" with your core name here
} catch (Exception e) {
    // log the exception
}

// A status of 0 in the response header means the request succeeded
if (response != null && response.getStatus() == 0) {
    if (response.getGroupResponse() != null) {
        for (GroupCommand groupCommand : response.getGroupResponse().getValues()) {
            /**
             * groupCommand.getName(); returns the field by which grouping was done
             * groupCommand.getMatches(); returns the total number of documents matched by the query
             * groupCommand.getNGroups(); returns the number of groups identified
             * groupCommand.getValues(); returns the List of Groups returned
             **/
            for (Group group : groupCommand.getValues()) { //This loop traverses through each group
                /**
                 * group.getGroupValue(); returns the common value that each document shares inside this group
                 * group.getResult(); returns the SolrDocumentList retrieved for this group
                 */
                for (SolrDocument doc : group.getResult()) {
                    /**
                     * doc.getFieldValue(FIELD_NAME); returns the value or collection of values for a given FIELD_NAME
                     * doc.getFirstValue(FIELD_NAME); returns the first value of FIELD_NAME, which is useful when the field is multi-valued
                     */
                }
            }
        }
    }
}
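To see how these three loops line up with the raw response, here is a JDK-only sketch that walks the same nesting (grouped, then the field name, then groups, then each group's doclist and docs) using plain maps and lists. The sample values are made up, and the map helper and class name are hypothetical. SolrJ does this unwrapping for you through GroupResponse, so this is only meant to show the shape.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupedResponseWalk {

    // Small helper to build an ordered map from alternating keys and values.
    static Map<String, Object> map(Object... kv) {
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            m.put((String) kv[i], kv[i + 1]);
        }
        return m;
    }

    // Invented data laid out like the value of the "grouped" key.
    static Map<String, Object> sampleGrouped() {
        Map<String, Object> acme = map("groupValue", "Acme",
                "doclist", map("numFound", 2, "start", 0,
                        "docs", Arrays.asList(map("id", "doc1"), map("id", "doc3"))));
        Map<String, Object> globex = map("groupValue", "Globex",
                "doclist", map("numFound", 1, "start", 0,
                        "docs", Arrays.asList(map("id", "doc2"))));
        return map("manu_exact",
                map("matches", 3, "ngroups", 2, "groups", Arrays.asList(acme, globex)));
    }

    // Same three levels as the SolrJ loops: command, group, document.
    @SuppressWarnings("unchecked")
    static List<String> summarize(Map<String, Object> grouped) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Object> command : grouped.entrySet()) {
            Map<String, Object> body = (Map<String, Object>) command.getValue();
            for (Map<String, Object> group : (List<Map<String, Object>>) body.get("groups")) {
                Map<String, Object> doclist = (Map<String, Object>) group.get("doclist");
                List<String> ids = new ArrayList<>();
                for (Map<String, Object> doc : (List<Map<String, Object>>) doclist.get("docs")) {
                    ids.add(String.valueOf(doc.get("id")));
                }
                lines.add(command.getKey() + "=" + group.get("groupValue") + " -> " + ids);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Prints one line per group, e.g. manu_exact=Acme -> [doc1, doc3]
        summarize(sampleGrouped()).forEach(System.out::println);
    }
}
```

In SolrJ terms, the outer entry corresponds to a GroupCommand, each element of groups to a Group, and each element of docs to a SolrDocument.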

That is it from my side on Result Grouping using SolrJ. Please let me know in the comment section below if you have any further queries or if this was helpful to you in any way. Cheers. 🙂
