If it were still 2012, I would have eagerly listened and responded to any conversation about Big Data. It was the buzz, and you had to speak the magic words to get people to listen to the latest and greatest in technology. But fortunately or unfortunately, it is 2017 now, and it is disappointing to note that most of the world has not moved beyond Big Data. And believe me, it is not just the CIOs and CDOs sitting in their ivory towers who are stuck with Big Data. It is also the energetic developers who are being scouted by talent firms for having Big Data on their resumes.
We at Knoldus build a holistic software development capability in anyone who joins us as an intern. It does not matter whether you have been working in the industry for 2 years or 10. During the internship, we give you a holistic software development immersion: starting from code quality, code conventions, and the principles, practices, and patterns of software development, moving on to Reactive Platforms and their ecosystem, and ending with the stack that we embrace, which is the Scala ecosystem and the Fast Data Platform.
The trigger for this post was a conversation with a top talent who joined us 3 months back. He was sad because he was not working on Big Data. When asked what he meant by Big Data, his quick answer was Hadoop/Spark. When it was pointed out that he was learning Lagom and event sourcing, which would allow him to build better solutions, he was not too convinced.
Now, there is nothing wrong with these technologies; in fact, they are what made the ecosystem popular. But these technologies are only a part, sometimes a very small part, of a product that has business value. They solve a particular piece of the puzzle, and more often than not, if you base your product “just” on these technologies, you are bound to fail!
So where should we be headed if we are not talking about Big Data? The answer is to talk about Fast Data. Big Data, as a name or a misnomer, gets used in all kinds of scenarios. Talk to 10 CIOs and 9 would say that they struggle with Big Data, regardless of whether one manages 1 TB of data and another manages several hundred PB. I think the question we should be asking is how our solution or product makes sure that customers get the best experience. Customer Experience (CX) is going to be the king of modern-day applications. Just focusing on Spark/Hadoop/Flink and thinking that you can do Big Data is a fallacy.
Let us see how this set of so-called Big Data technologies fits into the grand scheme of things.
- If you are going to build a product that includes user interaction, then you need a reactive front end so that you can provide an amazing customer experience.
- When hundreds and thousands of user requests come in, the product has to handle them without degrading performance. It has to be resilient.
- There are going to be transaction-based processes, like someone querying for something, adding an item, or viewing their trades for the day. These could be handled by different microservices, which would have their own life cycles and should be able to scale independently.
- You would like your system to be extensible and plan for any future business operations which are unforeseen at the moment. For this, you need to have event sourcing.
- You would want to separate out writes and reads to your system for making sure that the read and write SLAs are met and you are able to scale the read and write side separately.
- You would need to store your transaction data in the DB and for that, you would need either a SQL or NoSQL DB.
- Some of your functionality would also need to analyze data and come back with the results. Depending on the SLAs, this is where you would need Big Data frameworks to jump in.
- You would need to run some machine learning or deep learning algorithms for your product to stand out.
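To make the event sourcing and read/write separation points above a little more concrete, here is a minimal sketch in plain Python. It is not how Lagom or any other framework does it; the names `TradePlaced`, `EventLog`, `TradesByUser`, and `PositionBySymbol` are made up for illustration, and everything runs in one process, whereas a real system would persist the log and update the read side asynchronously.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TradePlaced:
    """Hypothetical event: a user placed a trade."""
    user: str
    symbol: str
    quantity: int


class EventLog:
    """Write side: an append-only list of events, never updated in place."""

    def __init__(self):
        self._events = []
        self._subscribers = []

    def subscribe(self, handler):
        # Register a read model that wants to see every new event.
        self._subscribers.append(handler)

    def append(self, event):
        # Store the event, then notify the read models.
        self._events.append(event)
        for handler in self._subscribers:
            handler(event)

    def replay(self, handler):
        # Feed the full history to a handler, e.g. a read model added later.
        for event in self._events:
            handler(event)


class TradesByUser:
    """Read side: a view shaped for 'show my trades for the day'."""

    def __init__(self):
        self.trades = defaultdict(list)

    def apply(self, event):
        if isinstance(event, TradePlaced):
            self.trades[event.user].append((event.symbol, event.quantity))


class PositionBySymbol:
    """A second view, for a business need that was unforeseen at first."""

    def __init__(self):
        self.position = defaultdict(int)

    def apply(self, event):
        if isinstance(event, TradePlaced):
            self.position[event.symbol] += event.quantity


log = EventLog()
trades_view = TradesByUser()
log.subscribe(trades_view.apply)

log.append(TradePlaced("alice", "ACME", 10))
log.append(TradePlaced("bob", "ACME", 5))
log.append(TradePlaced("alice", "INIT", 3))

# A new requirement arrives: build its view by replaying history.
positions = PositionBySymbol()
log.replay(positions.apply)
```

Because the events themselves are the source of truth, the second view can be built long after the trades happened, which is the extensibility argument in a nutshell. The write path only appends, while each read model can be stored and scaled in whatever way suits its queries.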
Of course, we are simplifying the scenario a lot, but hopefully you get the idea. Just depending on a Big Data framework, or hiring consultants who know a bit about Hadoop/Spark, is not going to fly. You need to work across an entire gamut of technologies, including:
- Reactive UI
- Microservices framework
- Asynchronous Messaging System
- Big Data framework (there I said it!)
- Hosting strategy based on containers
- Monitoring and Telemetry
- Machine learning and AI
And believe me, this is a partial list.
And overlaying all of this are the Principles, Patterns, and Practices of effective software development. The main technology drivers are those based on the principles of the Reactive Manifesto, which calls for systems that are responsive, resilient, elastic, and message driven.
To sum it up, here is one possible scheme of technologies that can fulfill the product vision.
As you can see, Big Data frameworks are only a part of what you want to do. More than a drop in the ocean, but still not big enough.
Hence, the next time someone comes and talks about Big Data and using a Big Data framework to build the product, do talk to them about all the other ancillaries, and take what they say with a big bag of salt 🙂
Knoldus has implemented its Digital Transformation product KDP at two Fortune 50 organizations. The third implementation is underway.