Apache Airflow

Use cases of Apache Airflow

Reading Time: 5 minutes Apache Airflow’s versatility allows you to set up any type of workflow. Apache Airflow can run ad hoc workloads not tied to any interval or schedule. However, it is most suitable for pipelines that change slowly, are tied to a specific time interval, or are pre-scheduled. Adobe is a software company known for multimedia and creativity products such as Acrobat Reader and Photoshop. Continue Reading

Triggers in Apache Airflow

Reading Time: 3 minutes Trigger rules are one of the easiest concepts to understand in Airflow. Let’s understand trigger rules in Apache Airflow. Why trigger rules? By default, Airflow waits for all of a task’s parent/upstream tasks to complete successfully before it runs that task. However, this is just Airflow’s default behavior, and you can control it using the trigger_rule argument of a task. Basically, a trigger_rule defines Continue Reading

Apache Airflow: Understanding Operators

Reading Time: 5 minutes An Operator is the building block of an Airflow DAG. It determines what will be executed when the DAG runs. Operators can be thought of as templates or blueprints that contain the logic for a predefined task, which we can define declaratively inside an Airflow DAG. When an operator is instantiated with its required parameters, it becomes a task. An Operator defines one Continue Reading

Apache Airflow: Automate Email Alerts for Task Status

Reading Time: 4 minutes In this blog, we will learn how to send email alerts to the user about task status using Apache Airflow. Prerequisite: Airflow SMTP configuration. Step 1: Generate a Google application password. Visit this link and log in with your email ID and password. When you have successfully logged in, you will see the window below. You have to choose “Mail” as the app and select Continue Reading

Core Concepts of Apache Airflow

Reading Time: 4 minutes In this blog, we will go over the basic core concepts you must understand if you want to use Apache Airflow. In this article, you will learn: what Airflow is, an architecture overview, DAGs, tasks, operators, DAG runs, and execution dates. Airflow was started in October 2014, developed by Maxime Beauchemin at Airbnb. It is a platform for programmatically authoring, scheduling, and monitoring workflows. It Continue Reading

Dynamic DAGs in Apache Airflow

Reading Time: 4 minutes Airflow dynamic DAGs can save you a ton of time. As you know, Apache Airflow is written in Python, and DAGs are created via Python scripts. That makes it very flexible and powerful (even if sometimes complex). By leveraging Python, you can create DAGs dynamically based on variables, connections, a typical pattern, etc. This very nice way of generating DAGs comes at the price of higher Continue Reading

Apache Airflow: Environment Variables Best Practices

Reading Time: 3 minutes A bit about Airflow variables (context): What is Airflow? Apache Airflow is a workflow management tool. Airflow makes use of DAGs (Directed Acyclic Graphs) to do so. What are Airflow variables? Variables are key-value pairs, where the key is the variable name and the value is the value assigned to that variable. Where are Airflow variables stored? Variables are stored inside the Airflow metadata Continue Reading

Apache Airflow: Scaling Using Celery Executor

Reading Time: 4 minutes Apache Airflow is a platform to programmatically author, schedule, and monitor workflows. If you are new to Airflow, please go through my introductory blog first. One of Airflow’s biggest strengths is its ability to scale. In this blog, we will find out how we can scale Airflow using executors. Overview: In Airflow, even the local executor is extremely powerful, as you can start scaling Airflow and Continue Reading
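A minimal sketch of the airflow.cfg entries involved in switching to the Celery executor; the broker and result-backend URLs are illustrative placeholders, assuming Redis and PostgreSQL are already running:

```ini
[core]
executor = CeleryExecutor

[celery]
; Queue the scheduler and workers use to exchange task messages
broker_url = redis://localhost:6379/0
; Where workers record task state and results
result_backend = db+postgresql://airflow:airflow@localhost/airflow
; Number of task processes each worker runs in parallel
worker_concurrency = 16
```

Each setting can alternatively be supplied as an environment variable (e.g. AIRFLOW__CORE__EXECUTOR).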

Apache Airflow: DAG Structure and Data Pipeline

Reading Time: 6 minutes What is a DAG in Apache Airflow? In this blog, we are going to see the basic structure of a DAG in Apache Airflow, and we will also configure our first data pipeline. A DAG in Apache Airflow stands for Directed Acyclic Graph, which means it is a graph with nodes, directed edges, and no cycles. An Apache Airflow DAG is a data pipeline Continue Reading

Apache Airflow: Writing your first pipeline

Reading Time: 3 minutes Before jumping into the code, you need to understand what an Airflow DAG is all about. It is important, so stay with me. What is an Airflow DAG? DAG stands for Directed Acyclic Graph. In simple terms, it is a graph with nodes, directed edges, and no cycles. Basically, this is a DAG: We will learn step by step how to write your first DAG. Steps to Continue Reading

Apache Airflow: Write your first DAG in Apache Airflow

Reading Time: 3 minutes In this article, we’ll see how to write a basic “Hello World” DAG in Apache Airflow. We will go through all the files that we have to create in Apache Airflow to successfully write and execute our first DAG. Create a Python file: Firstly, we will create a Python file inside the “airflow/dags” directory. Since we are creating a basic Hello World script, we will Continue Reading

How to Set Up an Airflow Environment Using Docker

Reading Time: 3 minutes In this blog, we will learn how to set up an Airflow environment using Docker. Why do we need Docker? Apache Airflow is an open-source project that grows at an overwhelming pace. As we can see in the Airflow GitHub repository, there are 632 contributors, 98 releases, and more than 5,000 commits, and the last commit was 4 hours ago. That means Airflow gets new commits every day Continue Reading

A Quick insight on Apache Airflow

Reading Time: 4 minutes In this blog, we are going to understand what Apache Airflow is, the workflow of Airflow, the uses of Airflow, and how we can install Airflow. So let’s get started. What is Apache Airflow? Airflow is a workflow platform that allows you to define, execute, and monitor workflows. A workflow can be defined as any series of steps you take to accomplish a given goal; consequently Continue Reading