In this blog, we will learn to set up Elasticsearch on a minikube cluster, but before that, let's take a look at Elasticsearch.
Elasticsearch is a distributed, scalable, real-time search engine that supports full-text and structured search, as well as analytics. It is most commonly used to index and search large volumes of log data, but it can also be used to search many other kinds of documents.
Use Cases of Elasticsearch:
- Application search
- Website search
- Enterprise search
- Logging and log analytics
- Infrastructure metrics and container monitoring
- Application performance monitoring
- Geospatial data analysis and visualization
- Business analytics
How does Elasticsearch work?
Elasticsearch receives raw data from a variety of sources, including logs, system metrics, and web applications. Data ingestion is the process of parsing, normalizing, and enriching this raw data before indexing it in Elasticsearch. Once the data is indexed, users can run complex queries against it and use aggregations to generate rich summaries. Users can then use Kibana to build data visualizations, share dashboards, and manage the Elastic Stack.
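As a quick illustration of this flow, the two commands below index a single document and then search for it through the REST API. This is a minimal sketch, assuming an Elasticsearch instance reachable at localhost:9200; the index name logs and the document fields are just examples.
curl -X POST "http://localhost:9200/logs/_doc?pretty" -H 'Content-Type: application/json' -d '{"message": "user logged in", "level": "info"}'
curl "http://localhost:9200/logs/_search?q=message:logged&pretty"
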
Why use Elasticsearch?
- Elasticsearch is fast: It excels at full-text search because it is built on top of Lucene. It is also a near real-time search platform, which means that the time between indexing a document and making it searchable is very short — typically one second.
- Elasticsearch is distributed by nature: Its documents are spread across different containers known as shards, which are replicated to provide redundant copies of the data in case of hardware failure.
- Elasticsearch comes with a wide set of features: In addition to speed, scalability, and resiliency, it has a number of powerful built-in features that make storing and searching data even more efficient, such as data rollups and index lifecycle management.
- The Elastic Stack simplifies data ingest, visualization, and reporting: Integration with Beats and Logstash makes it easy to process data before indexing into Elasticsearch.
Now that we are familiar with Elasticsearch, let's get started with the demo.
Step 1: Creating a Namespace
Before we roll out an Elasticsearch cluster, we'll first create a Namespace to hold all of its resources.
kubectl create namespace elasticsearchdemo
kubectl get namespaces
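
Optionally, you can make this Namespace the default for your current kubectl context, so that later commands don't need an explicit --namespace flag:
kubectl config set-context --current --namespace=elasticsearchdemo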

Step 2: Creating the Headless Service
Now we'll create a headless Kubernetes Service named elasticsearch that will define a DNS domain for the Elasticsearch Pods. A headless Service has no load balancing and no static cluster IP address.
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch
  namespace: elasticsearchdemo
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
    - port: 9200
      name: rest
    - port: 9300
      name: inter-node
In the elasticsearchdemo Namespace, we've created a Service called elasticsearch and given it the app: elasticsearch label. The .spec.selector is then set to app: elasticsearch, so the Service selects only Pods carrying the app: elasticsearch label. When we associate our Elasticsearch StatefulSet with this Service, it will produce DNS A records pointing to the Elasticsearch Pods with the app: elasticsearch label.
kubectl create -f elasticsearch_svc.yml

The Service is then made headless by setting clusterIP: None. Finally, we define ports 9200 and 9300 for interacting with the REST API and for inter-node communication, respectively.
kubectl get svc
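
Once the Pods from Step 3 are running, you can verify that the headless Service publishes a DNS A record per Pod by running nslookup from a temporary Pod; busybox:1.28 is used here only because its nslookup output is easy to read:
kubectl run -it --rm dns-test --image=busybox:1.28 --namespace=elasticsearchdemo --restart=Never -- nslookup elasticsearch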

Step 3: Creating the Elasticsearch StatefulSet
Now we will create a StatefulSet, which allows us to assign a stable identity to each Pod and grant it stable, persistent storage. Elasticsearch requires stable storage to retain data across Pod rescheduling and restarts.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
  namespace: elasticsearchdemo
spec:
  serviceName: elasticsearch
  replicas: 2
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
          resources:
            limits:
              cpu: 1000m
            requests:
              cpu: 100m
          ports:
            - containerPort: 9200
              name: rest
              protocol: TCP
            - containerPort: 9300
              name: inter-node
              protocol: TCP
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
          env:
            - name: cluster.name
              value: k8s-logs
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: discovery.seed_hosts
              value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch"
            - name: cluster.initial_master_nodes
              value: "es-cluster-0,es-cluster-1"
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"
      initContainers:
        - name: fix-permissions
          image: busybox
          command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
          securityContext:
            privileged: true
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
        - name: increase-vm-max-map
          image: busybox
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
        - name: increase-fd-ulimit
          image: busybox
          command: ["sh", "-c", "ulimit -n 65536"]
          securityContext:
            privileged: true
  volumeClaimTemplates:
    - metadata:
        name: data
        labels:
          app: elasticsearch
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: do-block-storage # on minikube, substitute your default class, usually "standard"
        resources:
          requests:
            storage: 3Gi
Here, we have defined a StatefulSet called es-cluster in the elasticsearchdemo Namespace. We then use the serviceName parameter to link it to our previously created elasticsearch Service. This guarantees that each Pod in the StatefulSet is reachable at es-cluster-[0,1].elasticsearch.elasticsearchdemo.svc.cluster.local, where [0,1] corresponds to the Pod's assigned integer ordinal.
We've set the matchLabels selector to app: elasticsearch and specified two replicas (Pods), which we then mirror in the .spec.template.metadata section. The fields .spec.selector.matchLabels and .spec.template.metadata.labels must be identical.
spec:
Here, we define the Pods in the StatefulSet. We name the container elasticsearch and use the Docker image docker.elastic.co/elasticsearch/elasticsearch:7.2.0 (this can be changed to a different version).
The resources parameter specifies that the container requests at least 0.1 vCPU and can burst up to 1 vCPU.
ports:
For the REST API and inter-node communication, we open and name ports 9200 and 9300.
volumeMounts:
The data volumeMount mounts the PersistentVolume named data into the container at the path /usr/share/elasticsearch/data.
env:
In the container, we set the following environment variables:
- cluster.name: The Elasticsearch cluster's name, which is k8s-logs.
- node.name: Set to the .metadata.name field using valueFrom. This resolves to es-cluster-[0,1], depending on the Pod's assigned ordinal.
- discovery.seed_hosts: A list of master-eligible nodes in the cluster that seed the node discovery process.
- cluster.initial_master_nodes: A list of master-eligible nodes that participate in the initial master election.
- ES_JAVA_OPTS: Set to -Xms512m -Xmx512m, which tells the JVM to use a minimum and maximum heap size of 512 MB.
initContainers:
We define Init Containers that run before the main elasticsearch app container. Init Containers run to completion in the order in which they are defined.
The first, fix-permissions, executes a chown command to change the owner and group of the Elasticsearch data directory to 1000:1000, the Elasticsearch user's UID. By default, Kubernetes mounts the data directory as root, making it inaccessible to Elasticsearch.
The second, increase-vm-max-map, executes a command to increase the operating system's limit on mmap counts, which is too low by default and can result in out-of-memory errors.
The third, increase-fd-ulimit, executes the ulimit command to increase the maximum number of open file descriptors.
volumeClaimTemplates:
Here, we define the StatefulSet's volumeClaimTemplates. Kubernetes will use this to create a PersistentVolume for each Pod. We name it data (the name we referenced in the volumeMounts definition) and give it the same app: elasticsearch label as our StatefulSet.
We then define the ReadWriteOnce access mode, which means each volume can be mounted as read-write by only a single node. The storage class do-block-storage is specific to DigitalOcean Kubernetes clusters.
Finally, we specify that each PersistentVolume should be 3 GiB in size.
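Since this demo runs on minikube rather than DigitalOcean, list the storage classes available in your cluster and substitute your default class (on minikube this is usually standard) for do-block-storage in the manifest above:
kubectl get storageclass
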
Now, deploy the StatefulSet using kubectl:
kubectl apply -f elasticsearch_sts.yaml

We can monitor the StatefulSet as it is rolled out using kubectl rollout status:
kubectl rollout status sts/es-cluster --namespace=elasticsearchdemo
kubectl get po --namespace=elasticsearchdemo
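
Once both Pods report Running, we can verify that the init containers did their job and that a PersistentVolumeClaim was bound for each Pod. The first command reads the sysctl value from /proc inside the running container and should print 262144:
kubectl exec es-cluster-0 --namespace=elasticsearchdemo -- cat /proc/sys/vm/max_map_count
kubectl get pvc --namespace=elasticsearchdemo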

After all of the Pods have been deployed, we can use the REST API to verify that the Elasticsearch cluster is up and running. To do so, use kubectl port-forward in a separate terminal window to forward the local port 9200 to port 9200 on one of the Elasticsearch nodes (es-cluster-0):
kubectl port-forward es-cluster-0 9200:9200 --namespace=elasticsearchdemo

Then, make the following curl request to the REST API:
curl http://localhost:9200/_cluster/state?pretty
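
The cluster state output is quite verbose; to quickly confirm that both nodes have joined the cluster, we can also query the _cat/nodes API:
curl "http://localhost:9200/_cat/nodes?v"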

To check the health of the Elasticsearch cluster, query the _cluster/health endpoint:
curl http://localhost:9200/_cluster/health?pretty

The output will display the status of the Elasticsearch cluster, which should be green.
