Search Results for: kubernetes

Deep Dive into Spark Cluster Managers

Reading Time: 5 minutes
This blog aims to dig into the different cluster management modes in which you can run your Spark application. Spark applications run as independent sets of processes on a cluster, coordinated by the SparkContext object in your main program, which is called the driver program. Specifically, to run on a cluster, the SparkContext can connect to several types of cluster managers, which allocate resources across … Continue Reading
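As a rough illustration of how a cluster manager gets selected: in Spark, the master URL handed to the SparkContext determines which cluster manager it connects to. The sketch below uses a hypothetical helper, `cluster_manager_for`, which is not part of Spark; it only classifies the commonly documented master URL forms and rejects anything else.

```python
import re

# Hypothetical helper (not a Spark API): map a Spark master URL to the
# kind of cluster manager it selects. Only common URL forms are covered.
def cluster_manager_for(master: str) -> str:
    # "local", "local[4]", "local[*]", "local[4,2]" run Spark in one JVM.
    if master == "local" or re.fullmatch(r"local\[(\*|\d+)(,\s*\d+)?\]", master):
        return "local"
    if master.startswith("spark://"):
        return "standalone"
    if master.startswith("mesos://"):
        return "mesos"
    if master == "yarn":
        return "yarn"
    if master.startswith("k8s://"):
        return "kubernetes"
    raise ValueError(f"unrecognized master URL: {master!r}")


if __name__ == "__main__":
    # Host names and ports below are placeholders, not real endpoints.
    for url in ["local[*]", "spark://host:7077", "mesos://host:5050",
                "yarn", "k8s://https://host:6443"]:
        print(url, "->", cluster_manager_for(url))
```

The same string is what you would pass as `--master` to `spark-submit` or set as `spark.master` in the application's configuration.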

Introduction to Mesos

Reading Time: 4 minutes
What is Mesos? In layman's terms, imagine a busy airport. Airplanes are constantly taking off and landing. There are multiple runways, and an airport dispatcher assigns time slots to airplanes to land or take off. So Mesos is the airport dispatcher, runways are compute nodes, airplanes are compute tasks, and frameworks like Hadoop, Spark and Google Kubernetes are the airline companies. In technical terms, Apache Mesos … Continue Reading

Streaming in Spark, Flink and Kafka

Reading Time: 7 minutes
There is a lot of buzz about when to use Spark, when to use Flink, and when to use Kafka. Both Spark Streaming and Flink provide an exactly-once guarantee: every record is processed exactly once, which eliminates any duplicates that might otherwise arise. Both provide very high throughput compared to other processing systems such as Storm, and the overhead of … Continue Reading