performance optimization

Spark 3.0 – Adaptive Query Execution With Example

Reading Time: 4 minutes
Introduction: Adaptive Query Execution (AQE) is one of the most significant features of Spark 3.0. It reoptimizes and adjusts query plans based on runtime statistics collected during query execution. Need for AQE: With each major release, Spark has introduced new optimization features to execute queries more efficiently and achieve better performance. Before Spark 3.0, cost-based optimization used table statistics to determine the ...

Web Application Optimization: Cases, Tips, Tricks & Tools

Reading Time: 4 minutes
Website optimization, or conversion optimization, has always been a core requirement for any web application and every organization. Conversion rate optimization is the process of finding and eliminating the roadblocks and confusion on your website that keep visitors from achieving their goals. Think the speed of your website doesn’t matter? Think again. A 1-second delay in page load time yields: 11% ...

The Dominant APIs of Spark: Datasets, DataFrames and RDDs

Reading Time: 4 minutes
While working with Spark, we often come across three APIs: DataFrames, Datasets, and RDDs. In this blog, I will discuss the three in terms of use case, performance, and optimization. It is essential to keep in mind that seamless conversion is available among DataFrames, Datasets, and RDDs. Under the hood, the RDD is the foundation of both DataFrames and Datasets. The inception of ...