Lineage

Understanding the working of Spark Driver and Executor

Reading Time: 4 minutes

This blog post is about Apache Spark, and explains how Spark’s Driver and Executors communicate with each other to process a given job. First, let’s see what Apache Spark is. The official definition says that “Apache Spark™ is a unified analytics engine for large-scale data processing.” It is an in-memory computation engine where the data is …

Logical and Physical Plan in Spark

Understanding Spark’s Logical and Physical Plan in layman’s terms

Reading Time: 6 minutes

This blog post is about Apache Spark 2.x. It explains how Spark SQL works internally in layman’s terms, describes what the Logical and Physical Plans are, and looks into the Catalyst Optimizer. First, let’s see what Apache Spark is. The official definition says that “Apache Spark™ is a unified analytics engine for large-scale …”