This blog is a detailed, step-by-step guide to grouping results by field in Apache Solr using SolrJ.
Note: Grouping is different from faceting in Apache Solr. Grouping returns the documents themselves, grouped by the specified field, while faceting returns the count of documents for each distinct value of the specified field. You can, however, combine grouping and faceting in Solr. This blog covers grouping without faceting, implemented through SolrJ (version 6.4.1).
Without much ado, let’s get to it.
First, you need a running Solr instance with correctly indexed data.
Note: A Solr core is basically an index of the text and fields found in documents. A single Solr instance can contain multiple “cores”, which are separate from each other based on local criteria. If you don’t have a running Solr instance with a core set up, Apache Solr provides a number of useful examples to help you learn about key features. You can launch the examples using the -e flag.
Set up a local Solr instance by following the directions below:
(i) Download Solr for your operating system from Apache Solr – Downloads
(ii) Go to the directory where you extracted Solr and start it with the demo data provided by Apache Solr by running the following command:
bin/solr -e techproducts
If you don’t want to set up the demo core and just need a running instance for your own indexer, simply start Solr by running bin/solr start in the Solr directory.
For demonstration purposes, however, we will be using the techproducts core provided by Apache Solr.
(iii) Open the Solr Admin UI in your browser (with the default port, this is http://localhost:8983/solr).
You can select your core from the core selector in the left pane.
Second, you need to set up an instance of SolrClient in your Java program.
If you are not familiar with how to set up an instance of SolrClient, worry not! Go ahead and take a look at this blog on Solr with Java: A basic hands-on with SolrJ
Now let us first understand the query structure in Solr for getting a grouped response.
Use the following query parameters to get a grouped response from Solr in the browser:
The above query is a simple example of getting a grouped response that uses just two request parameters, group and group.field. These parameters along with some other useful ones are described in the table below.
| Parameter | Type | Description |
|---|---|---|
| group | Boolean | If true, query results will be grouped. |
| rows | integer | The number of groups to return. The default value is 10. |
| group.ngroups | Boolean | If true, Solr includes the number of groups that have matched the query in the results. The default value is false. |
| group.limit | integer | Specifies the number of results to return for each group. The default value is 1. To return all the results in each group, set this value to -1. |
| group.field | string | The name of the field by which to group results. The field must be single-valued, and either be indexed or be a field type that has a value source and works in a function query, such as ExternalFileField. It must also be a string-based field, such as StrField or TextField. |
To learn more about the request parameters that can be used to customize the grouped response from Solr, see the Apache Solr Reference Guide for Result Grouping.
Note: The field used for grouping must not be multi-valued in the schema. Grouping cannot be done on multi-valued fields in Solr as of Solr version 6.4.2.
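To make the parameters in the table concrete, here is a minimal sketch that assembles a grouped select URL by hand, using only the Java standard library. The class name GroupedQueryUrl, the default port 8983, and the grouping field manu_exact (a single-valued string field in the techproducts example schema) are assumptions made for illustration; substitute your own host, core, and field.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class GroupedQueryUrl {

    // Assumption: Solr runs on the default port with the techproducts demo core.
    static final String SELECT_URL = "http://localhost:8983/solr/techproducts/select";

    // Builds a select URL that asks Solr for results grouped by groupField.
    static String build(String groupField, int rows, int groupLimit) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");                                // match every document
        params.put("group", "true");                           // turn grouping on
        params.put("group.field", groupField);                 // field to group by
        params.put("group.ngroups", "true");                   // include the number of groups
        params.put("group.limit", String.valueOf(groupLimit)); // documents per group
        params.put("rows", String.valueOf(rows));              // number of groups returned

        StringBuilder url = new StringBuilder(SELECT_URL);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Paste the printed URL into the browser to see the raw grouped response.
        System.out.println(build("manu_exact", 10, 1));
    }
}
```

Only group and group.field are needed for a grouped response; the other parameters are included here to show where they go.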
Let us now set up the query parameters in our Java code.
In the code snippet below, we are setting up a simple SolrQuery with some basic request parameters required for querying for a grouped response from Solr.
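A minimal sketch of such a query, assuming the solr-solrj 6.4.1 library is on the classpath. The class and method names are hypothetical, and manu_exact is assumed to be a suitable single-valued string field in your core (it is one in the techproducts example schema).

```java
import org.apache.solr.client.solrj.SolrQuery;

public class GroupedQueryBuilder {

    // Builds a match-all query whose results are grouped by the given field.
    static SolrQuery buildGroupedQuery(String groupField) {
        SolrQuery query = new SolrQuery("*:*");
        query.setRows(10);                     // number of groups to return
        query.set("group", true);              // enable result grouping
        query.set("group.field", groupField);  // single-valued field to group by
        query.set("group.ngroups", true);      // ask Solr for the total number of groups
        query.set("group.limit", 1);           // documents returned per group
        return query;
    }

    public static void main(String[] args) {
        // Prints the encoded request parameters that would be sent to Solr.
        System.out.println(buildGroupedQuery("manu_exact"));
    }
}
```

The parameter names are passed as plain strings so they line up with the table above; SolrJ also provides constants for them in org.apache.solr.common.params.GroupParams.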
Lastly, let us retrieve and traverse the grouped response.
The grouped response is returned as the value for the key grouped in the raw Solr response. To retrieve it using SolrJ, run the query prepared above against the core (here, we are using the demo core techproducts) and get the grouped response as shown below:
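The following is a sketch of one way to do this, again assuming solr-solrj 6.4.1 on the classpath and a Solr instance at http://localhost:8983/solr (the default port, which is an assumption here). The class name, the summarize helper, and the choice of manu_exact and id as fields are illustrative, not part of SolrJ.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.Group;
import org.apache.solr.client.solrj.response.GroupCommand;
import org.apache.solr.client.solrj.response.GroupResponse;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class GroupedResponseReader {

    // Walks the grouped response: one GroupCommand per group.field,
    // one Group per distinct field value, and the documents inside each group.
    static List<String> summarize(GroupResponse groupResponse) {
        List<String> lines = new ArrayList<>();
        for (GroupCommand command : groupResponse.getValues()) {
            // getNGroups() is null unless group.ngroups=true was sent.
            lines.add("field=" + command.getName()
                    + " matches=" + command.getMatches()
                    + " ngroups=" + command.getNGroups());
            for (Group group : command.getValues()) {
                lines.add("  group=" + group.getGroupValue()
                        + " numFound=" + group.getResult().getNumFound());
                for (SolrDocument doc : group.getResult()) {
                    lines.add("    id=" + doc.getFieldValue("id"));
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws SolrServerException, IOException {
        SolrQuery query = new SolrQuery("*:*");
        query.set("group", true);
        query.set("group.field", "manu_exact"); // assumed single-valued string field
        query.set("group.ngroups", true);

        // Assumption: Solr is reachable on the default port.
        try (SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
            QueryResponse response = client.query("techproducts", query);
            summarize(response.getGroupResponse()).forEach(System.out::println);
        }
    }
}
```

Keeping the traversal in its own method, separate from the network call, makes it easy to reuse for any query and to exercise without a running Solr instance.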
This is it from my side on Result Grouping using solrj. Please let me know in the comment section below if you have any further queries or if this was helpful to you in any way. Cheers. 🙂
- Apache Solr Reference Guide to Result Grouping