Author: Shubham Dangare

Collecting logs in Azure Databricks

Reading Time: 3 minutes
Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud services platform. In this blog, we will see how to collect logs from Azure into Azure Log Analytics (ALA). Before going further, we need to look at how to set up a Spark cluster in Azure. Create a Spark cluster in Databricks: In the Azure portal, go to the Databricks workspace that you created, ... Continue Reading

Getting Started with Apache Spark Basic

Reading Time: 4 minutes
Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python, and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools, including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming. Here are the Spark core components: All ... Continue Reading

Real-time Data Analytics Engine

Reading Time: 2 minutes
In this system, we process real-time data or server logs and analyze them using Apache Flink. Instead of a batch processing system, we use an event processing system that is triggered by each new event. Whenever a new event occurs, the Flink streaming application performs search analysis on the consumed event. The source of data here can be Hadoop, MySQL, HTTP logs, ... Continue Reading