Author: Shubham Dangare

Getting Started with Apache Spark Basic

Reading Time: 4 minutes
Apache Spark is a fast, general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python, and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools, including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming. Here are the Spark core components: All ...

Real-time Data Analytics Engine

Reading Time: 2 minutes
In this system, we process real-time data or server logs and analyze them using Apache Flink. Instead of a batch processing system, we use an event processing system that is triggered by each new event. Whenever a new event occurs, the Flink streaming application performs search analysis on the consumed event. The source of data here can be Hadoop, MySQL, HTTP logs, ...