Elasticsearch – Pulling Socks!

Well, I must say that I have come across several new technologies during my internship, but the one that fascinates me most is a search engine: Elasticsearch!

I want to explore this tool. Reasons are –

  • Used by big websites like GitHub (to search a huge number of projects)
  • Works quite like Google (at first glance)
  • Will enhance my knowledge
    – I’ll learn the “actual Java” – the practical one
    – I’ll learn how to read others’ code
    – I will learn to code well
    – My design methodologies will improve
    – I’ll learn a lot of algorithms and how to implement my own
  • It uses APIs to handle requests and return results, which will
    – help me to learn to design my own APIs
    – help me to play with JSON
    – help me to learn the details of HTTP

And that’s just what I can see right now. It might be the tip of the iceberg – I don’t know its depth. So, one criterion for judging its depth was its number of lines of code; after all, I’ll be reading these lines. Here’s a snapshot from when I curiously tried to count its lines of code –


The current directory I’m in is clearly visible


Here I’m considering Elasticsearch’s own code only. There is also another directory that points to Lucene, the library on which Elasticsearch is built.


So this is the first DEPTH of Elasticsearch that I can witness. One interesting thing about it is that there are 283040 actual lines of code (excluding blank and comment lines) and 108242 comment lines! That means out of 391282 typed lines, 108242 are comment lines – that’s a fraction of 0.2766, or about 27.66% of the total, a little more than one fourth of the whole code base of ES itself. But not all comments carry useful information. All the files I read had this chunk of lines –

* Licensed to Elasticsearch under one or more contributor
* license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright
* ownership. Elasticsearch licenses this file to you under
* the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
* http://www.apache.org/licenses/LICENSE-2.0
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.

Likewise, not all of the actual lines of code are meaningful. There are so many imports and so much boilerplate Java syntax (and that’s one reason why Scala came along).
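To make the point about comments and imports concrete, here is a rough Python sketch that sorts the lines of a Java source text into blank, comment, import and other code lines, and then recomputes the comment share from the two counts quoted above. It is only an illustration: the function name `classify_java_lines` and the `Demo` sample are made up for this post, and the classification is deliberately simplified (a trailing comment on a code line counts as code, and anything inside a `/* ... */` block counts as comment).

```python
def classify_java_lines(source: str) -> dict:
    """Count blank, comment, import and other code lines in Java source text.

    Simplification: a line is a comment only if it starts with // or lies
    inside a /* ... */ block; trailing comments on code lines count as code.
    """
    counts = {"blank": 0, "comment": 0, "import": 0, "code": 0}
    in_block = False  # True while we are inside a /* ... */ comment
    for raw in source.splitlines():
        line = raw.strip()
        if in_block:
            counts["comment"] += 1
            if "*/" in line:
                in_block = False
        elif not line:
            counts["blank"] += 1
        elif line.startswith("//"):
            counts["comment"] += 1
        elif line.startswith("/*"):
            counts["comment"] += 1
            # The block stays open unless it is closed on the same line.
            in_block = "*/" not in line[2:]
        elif line.startswith("import "):
            counts["import"] += 1
        else:
            counts["code"] += 1
    return counts


# Hypothetical sample file, not taken from the Elasticsearch sources.
sample = """/*
 * Licensed to Elasticsearch under one or more contributor
 */
package org.example;

import java.util.List;

// A tiny class used only for this illustration.
public class Demo {
    List<String> names;
}
"""
sample_counts = classify_java_lines(sample)
print(sample_counts)

# Comment share from the two counts reported in this post.
code_lines = 283040
comment_lines = 108242
comment_share = comment_lines / (code_lines + comment_lines)
print(f"{comment_share:.2%}")  # prints 27.66%
```

Running something like this over every file would show how much of the comment count is just the repeated license header, and how much of the code count is imports.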

I am now a little confused – how do I start reading it? An article online says “Find the entry point of the project”. How do I find the entry point of this project? Maybe I should ask someone experienced with reading project code from scratch.
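One crude way to look for candidate entry points in a Java project is to scan the sources for a `public static void main` method. The Python sketch below does that; the names `has_main_method` and `find_entry_points` are invented for this illustration, and the pattern is an assumption that will miss unusual modifier orders (such as `static public`) and will also match a `main` that only appears inside a comment.

```python
import re
from pathlib import Path
from typing import List

# Matches the usual signature: public static void main(String[] args)
MAIN_METHOD = re.compile(
    r"\bpublic\s+static\s+void\s+main\s*\(\s*(?:final\s+)?String\b"
)


def has_main_method(source: str) -> bool:
    """Return True if the Java source text seems to declare a main method."""
    return MAIN_METHOD.search(source) is not None


def find_entry_points(root: str) -> List[Path]:
    """List the .java files under root that seem to declare a main method."""
    hits = []
    for path in sorted(Path(root).rglob("*.java")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if has_main_method(text):
            hits.append(path)
    return hits


if __name__ == "__main__":
    # Scans the current directory; point it at the checked-out sources instead.
    for entry in find_entry_points("."):
        print(entry)
```

A big project may have several classes with a `main` method (tools, tests, benchmarks), so the output is a list of places to start reading rather than a single answer.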

The next big problem is – how do I keep track of how the modules work and talk to each other (call each other’s methods)? I guess I’d need to draw UML diagrams on chart paper.
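Before drawing anything by hand, a first rough picture of which packages depend on which can be built from the `import` statements alone. The Python sketch below is a minimal version of that idea, assuming the sources are already loaded as text; `dependency_map`, the `org.example` packages and the file names are all hypothetical. It only sees imports, so it says nothing about which methods actually get called.

```python
import re
from collections import defaultdict

PACKAGE = re.compile(r"^[ \t]*package\s+([\w.]+)\s*;", re.MULTILINE)
IMPORT = re.compile(
    r"^[ \t]*import\s+(?:static\s+)?([\w.]+(?:\.\*)?)\s*;", re.MULTILINE
)


def dependency_map(sources: dict, prefix: str) -> dict:
    """Map each package to the imported names that start with prefix.

    sources maps a file name to its Java source text. Files without a
    package declaration are skipped, and packages with no matching
    imports do not appear in the result.
    """
    graph = defaultdict(set)
    for text in sources.values():
        package = PACKAGE.search(text)
        if package is None:
            continue
        for imported in IMPORT.findall(text):
            if imported.startswith(prefix):
                graph[package.group(1)].add(imported)
    return {name: sorted(deps) for name, deps in graph.items()}


# Hypothetical sources, only to show the shape of the result.
sources = {
    "Searcher.java": (
        "package org.example.search;\n\n"
        "import java.util.List;\n"
        "import org.example.index.Index;\n\n"
        "public class Searcher {}\n"
    ),
    "Index.java": (
        "package org.example.index;\n\n"
        "import org.example.store.*;\n\n"
        "public class Index {}\n"
    ),
}
graph = dependency_map(sources, "org.example.")
for package_name, deps in graph.items():
    print(package_name, "->", deps)
```

Filtering by a prefix keeps the project's own packages and drops the JDK and third-party imports, which makes the picture small enough to redraw as a diagram.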

Let’s see where it goes. I hope I keep my motivation high, and if I lose interest, that I can motivate myself to go a little further. It’s my first experience with such a thing, and first experiences are the hardest ones.

Written by 

Principal Architect at Knoldus Inc