Author: Ayush Tiwari

Databricks jobs

Reading Time: 2 minutes

Jobs: A job is a way to run non-interactive code in a Databricks cluster. For example, you can run an extract, transform, and load (ETL) workload interactively or on a schedule. You can also run jobs interactively in the notebook UI. Your job can consist of a single task or can be a large, multi-task workflow with complex dependencies. Databricks manages the task orchestration, cluster management, ...

Introduction to GitLab CI/CD

Reading Time: 3 minutes

To use GitLab CI/CD, first ensure you have runners available to run your jobs. If you don’t have a runner, install GitLab Runner and register a runner for your instance, project, or group. Then create a .gitlab-ci.yml file at the root of your repository. This file is where you define your CI/CD jobs. In GitLab, runners are agents that run your CI/CD jobs. You might already have runners available for your project, ...

Understanding the Apache Spark Streaming

Reading Time: 2 minutes

The Apache Spark Streaming module is a stream-processing module within Apache Spark. It uses the Spark cluster to scale to a high degree. Being based on Spark, it is also highly fault-tolerant: it can rerun failed tasks by checkpointing the data stream that is being processed. Four Major Aspects of Spark Streaming: fast recovery from failures and stragglers, better ...

Spark Session

Understanding Spark Application Concepts

Reading Time: 3 minutes

Once you have downloaded Spark, set up the Spark shell, and run some short code examples, the next step is to understand what is happening behind your sample code. To do that, you should be familiar with some of the critical concepts of a Spark application. Some important terms are: Application: a user program built on Spark using its APIs. It consists of a driver program and executors on the ...

volatile

Difference Between Synchronized and Volatile in Java

Reading Time: 3 minutes

Even though synchronized and volatile both help avoid multi-threading issues, they are quite different from each other. Before looking at the difference between them, let’s understand what synchronized and volatile provide in Java. Synchronization in Java: Java is a multi-threaded language in which multiple threads can execute in parallel to complete program execution, so in this multi-threaded environment synchronization of Java ...
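To make the contrast in this excerpt concrete, here is a minimal sketch, not taken from the original post. The class name `SyncVsVolatileDemo` and all of its methods are hypothetical and chosen only for illustration. It assumes Java 9 or later (for `Thread.onSpinWait`). The `synchronized` methods make a compound update (`count++`) safe across threads, while the `volatile` flag only guarantees that a write made by one thread becomes visible to another.

```java
public class SyncVsVolatileDemo {
    // volatile: a write by one thread is visible to other threads,
    // but compound actions such as count++ would still not be atomic.
    private volatile boolean stopRequested = false;

    // Plain field, protected by this object's monitor via the synchronized methods.
    private int count = 0;

    public synchronized void increment() { count++; }
    public synchronized int getCount() { return count; }

    public void requestStop() { stopRequested = true; }
    public boolean isStopRequested() { return stopRequested; }

    // Waits for a thread to finish, converting the checked exception.
    private static void joinQuietly(Thread t, long millis) {
        try {
            t.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    // Starts `threads` workers that each call increment() `perThread` times
    // and returns the final counter value.
    public static int runCounter(int threads, int perThread) {
        SyncVsVolatileDemo demo = new SyncVsVolatileDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    demo.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            joinQuietly(t, 0); // 0 means wait without a timeout
        }
        return demo.getCount();
    }

    // Starts a worker that spins until it observes the volatile flag,
    // then reports whether the worker terminated within five seconds.
    public static boolean runStopFlag() {
        SyncVsVolatileDemo demo = new SyncVsVolatileDemo();
        Thread worker = new Thread(() -> {
            while (!demo.isStopRequested()) {
                Thread.onSpinWait();
            }
        });
        worker.start();
        demo.requestStop();
        joinQuietly(worker, 5000);
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("count = " + runCounter(4, 10_000));
        System.out.println("worker stopped = " + runStopFlag());
    }
}
```

Because every increment goes through a synchronized method, four threads doing 10,000 increments each always end at 40,000. If `increment()` were not synchronized, marking `count` as volatile would not be enough, since `count++` is a read followed by a write and updates could be lost.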