Apache Airflow: Installation Guide and Basic Commands

Installation of Airflow

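The recommended way to install Apache Airflow is inside a virtual environment. Airflow requires a supported version of Python 3 and pip (the package installer for Python). The newest Python release is not always supported, so check the Airflow documentation for the versions your Airflow release accepts.

Below are the steps to install it on your system.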

To set up a virtual environment for Apache Airflow :

virtualenv apache_airflow

To activate the virtual environment, navigate to the “bin” folder inside the apache_airflow folder and source the activate script:

cd apache_airflow/bin
source activate

Next, set the Airflow home path, the directory where Airflow keeps its configuration file, logs, and default database:

export AIRFLOW_HOME=~/airflow

To install apache-airflow:

pip install apache-airflow
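An unpinned install can pull in incompatible dependency versions, so the Airflow project publishes a constraints file for each release and Python version. The sketch below builds that constraints URL and prints the resulting pip command. The version number 2.8.1 is only an assumption for illustration. Substitute the release you actually want, and note that a constraints file exists only for Python versions that release supports.

```shell
# Assumed Airflow release, for illustration only; change it to the one you need.
AIRFLOW_VERSION="2.8.1"

# Major.minor version of the active Python interpreter, for example 3.10.
PYTHON_VERSION="$(python3 -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')"

# Constraints files live in the apache/airflow repository, one per release and Python version.
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"

# Print the command instead of running it, so it can be reviewed first.
echo "pip install \"apache-airflow==${AIRFLOW_VERSION}\" --constraint \"${CONSTRAINT_URL}\""
```

Copy the printed command into the activated virtual environment to perform the pinned install.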

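Airflow stores its state in a metadata database, which must be initialized before first use (on Airflow 2.7 and later, this command is deprecated in favor of airflow db migrate):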

airflow db init

The last step is to start the Airflow webserver, here on port 8081 instead of the default 8080:

airflow webserver -p 8081

To verify that Airflow is installed and running, open the webserver in a browser using that port number:

http://localhost:8081/
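The same check can be done from a terminal. The Airflow 2 webserver exposes a /health endpoint that reports the status of the metadata database and the scheduler as JSON. Below is a minimal sketch, assuming curl is installed and the webserver was started on port 8081 as above. The function name check_airflow_health is made up for this example.

```shell
AIRFLOW_URL="http://localhost:8081"

# Hypothetical helper: succeeds only if the webserver answers on /health.
check_airflow_health() {
  # -f fails on HTTP errors, -sS stays quiet but still shows errors,
  # --max-time bounds how long we wait for a reply.
  curl -fsS --max-time 5 "$1/health"
}

if check_airflow_health "$AIRFLOW_URL"; then
  echo
  echo "webserver is responding at $AIRFLOW_URL"
else
  echo "webserver is not reachable at $AIRFLOW_URL" >&2
fi
```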

Creating a User in Apache Airflow

To sign in to the Airflow dashboard, we first need to create a user. The following steps create one with the Airflow command-line interface.

To create a user with Admin privileges in the Airflow database, pass the email (-e), first name (-f), last name (-l), password (-p), role (-r), and username (-u):

airflow users create -e admin@example.org -f John -l Doe -p admin -r Admin -u admin

Now that we have created an Admin user, log in to the dashboard using those credentials. Once logged in, we see the example data pipelines (DAGs) that ship with Airflow.

When we log in for the first time, the landing page shows a warning that says “The scheduler does not appear to be running”. To start the Airflow scheduler, open a second terminal, activate the virtual environment and set AIRFLOW_HOME there as well, run the following command, and then reload the landing page:

airflow scheduler

Access Control in Airflow

When we create a user in Airflow, we also have to define what role that user will be assigned. Airflow contains a set of predefined roles by default: Admin, User, Op, Viewer, and Public. Only an Admin user has control over configuring and altering permissions for other roles.

Admin

An Admin user will have all possible permissions including granting and revoking permissions from other users.

Public

A Public (anonymous) user does not have any permissions.

Viewer

A Viewer user has limited, read-only permissions.

User

A User has all Viewer permissions plus additional permissions for working with DAGs and their runs.

Op

An Op user has all User permissions plus additional operational permissions, such as managing connections, pools, and variables.
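To see how a role is assigned in practice, here is a minimal sketch that creates a user with the Viewer role, using the long form of the flags shown earlier. The helper name create_viewer_user and the account details (jane, Jane Roe, jane@example.org) are hypothetical. The --use-random-password flag asks Airflow to generate a password rather than taking one on the command line.

```shell
# Hypothetical helper: create a read-only (Viewer) account.
# Arguments: username, first name, last name, email.
create_viewer_user() {
  airflow users create \
    --username "$1" \
    --firstname "$2" \
    --lastname "$3" \
    --email "$4" \
    --role Viewer \
    --use-random-password
}

# Only call the CLI if it is available in the current shell.
if command -v airflow >/dev/null 2>&1; then
  create_viewer_user jane Jane Roe jane@example.org
  # List all users to confirm the new account and its role.
  airflow users list
else
  echo "airflow CLI not found; activate the virtual environment first" >&2
fi
```

Swapping Viewer for User or Op in the --role argument assigns one of the other predefined roles.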

Basic Commands for Apache Airflow

List all available DAGs, including the examples that ship with Airflow:

airflow dags list

Check what tasks a DAG contains:

airflow tasks list example_xcom_args

Trigger a run of a data pipeline with a specific execution date:

airflow dags trigger -e 2022-02-02 example_xcom_args
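With Airflow's default configuration, DAGs start out paused, and a triggered run of a paused DAG is not executed until the DAG is unpaused. The sketch below strings the steps together for the example DAG used above: unpause it, trigger it, then list its runs to check the state. It assumes the scheduler is running and the example DAGs are loaded.

```shell
DAG_ID="example_xcom_args"

# Only call the CLI if it is available in the current shell.
if command -v airflow >/dev/null 2>&1; then
  # Unpause the DAG so the scheduler is allowed to run it.
  airflow dags unpause "$DAG_ID"
  # Queue a run for the given execution date.
  airflow dags trigger -e 2022-02-02 "$DAG_ID"
  # Show the DAG's runs and their current state.
  airflow dags list-runs -d "$DAG_ID"
else
  echo "airflow CLI not found; activate the virtual environment first" >&2
fi
```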

Conclusion

In this blog, we saw how to install Airflow locally using the command-line interface. We also saw how to create the first user for the Airflow instance and which roles a user can have. Lastly, we went through some basic Airflow commands.

Stay tuned for more blogs on: https://blog.knoldus.com/
