Installation of Airflow
The preferred way to install Apache Airflow is inside a virtual environment, which keeps its dependencies separate from other Python projects. Airflow requires a supported version of Python 3 and pip (the package installer for Python).
Below are the steps to install it on your system.
To set up a virtual environment for Apache Airflow :
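One way to do this is sketched below. It assumes Python 3's built-in venv module and a folder named apache_airflow, chosen to match the folder used in the activation step; a different tool such as virtualenv would work as well.

```shell
# Assumption: Python 3 with the standard venv module is available as python3
python3 -m venv apache_airflow
# The activation script used in the next step should now exist
ls apache_airflow/bin/activate
```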
To activate the virtual environment navigate to the “bin” folder inside the apache_airflow folder and activate it using the following command :
cd apache_airflow/bin
source activate
Next, we have to set the airflow home path :
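Airflow reads its home directory from the AIRFLOW_HOME environment variable. A minimal sketch, assuming we keep Airflow's usual default location of ~/airflow (any writable directory should work):

```shell
# Assumption: ~/airflow is used as the Airflow home; change the path if needed
export AIRFLOW_HOME="$HOME/airflow"
echo "Airflow home is set to: $AIRFLOW_HOME"
```

The variable only lasts for the current shell session, so it has to be exported again in every new terminal, or added to the shell profile.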
To install apache-airflow:
pip install apache-airflow
For Airflow to function properly we need to initialize a database:
airflow db init
The last step is to start the webserver for airflow:
airflow webserver -p 8081
To verify that Airflow is installed successfully, open localhost in a browser on the port passed to the webserver:
http://localhost:8081
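The same check can be made from a terminal. The sketch below assumes curl is installed and that the webserver from the previous step is listening on port 8081; it queries the webserver's /health endpoint and reports whether it answered.

```shell
# Assumption: the webserver was started with "-p 8081" as in the step above
AIRFLOW_URL="http://localhost:8081"

# --fail makes curl return a non-zero status on HTTP errors,
# --max-time stops it from waiting forever if nothing is listening
if curl --silent --fail --max-time 5 "${AIRFLOW_URL}/health"; then
    echo
    echo "Airflow webserver is reachable at ${AIRFLOW_URL}"
else
    echo "Airflow webserver is not reachable at ${AIRFLOW_URL}"
fi
```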
Creating a User in Apache Airflow
To sign in to the Airflow dashboard we need to create a User. Go through the following steps to create a user using the Airflow command-line interface.
To create a user with Admin privileges in the Airflow database:
airflow users create -e firstname.lastname@example.org -f John -l Doe -p admin -r Admin -u admin
Now that we have created an Admin user, log in to the dashboard using these credentials. Once logged in, the Airflow dashboard lists all the data pipelines (DAGs) that are available by default.
When we log in for the first time, the landing page shows a warning that says “The scheduler does not appear to be running”. To start the Airflow scheduler, execute the following command in a separate terminal and reload the landing page:
airflow scheduler
Access Control in Airflow
When we create a user in Airflow, we also have to define what role that user will be assigned. Airflow contains a set of predefined roles by default: Admin, User, Op, Viewer, and Public. Only an Admin user has control over configuring and altering permissions for other roles.
An Admin user will have all possible permissions including granting and revoking permissions from other users.
A Public user does not have any permissions.
A Viewer user has restricted viewing permissions.
A User has Viewer permissions plus some extra User permissions.
An Op user has User permissions and extra Op permissions.
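As a sketch of how a role is assigned in practice, the snippet below reuses the airflow users create flags from the previous section with -r Viewer instead of -r Admin. The name, e-mail address, username and password are hypothetical placeholders, and the guard around the commands is only there so the snippet fails gracefully when the virtual environment is not active.

```shell
# Hypothetical read-only account; every value below is a placeholder
NEW_ROLE="Viewer"
NEW_USER="viewer"

if command -v airflow >/dev/null 2>&1; then
    # Same flags as the Admin example, with a different role
    airflow users create -e jane.roe@example.org -f Jane -l Roe \
        -p change-me -r "$NEW_ROLE" -u "$NEW_USER"
    # List all accounts together with their roles
    airflow users list
else
    echo "airflow CLI not found; activate the virtual environment first"
fi
```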
Basic Commands for Apache Airflow
List all the DAGs that Airflow knows about, including the examples it ships with by default:
airflow dags list
Check what tasks a DAG contains:
airflow tasks list example_xcom_args
Execute a data pipeline with a defined execution date:
airflow dags trigger -e 2022-02-02 example_xcom_args
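A triggered run only executes once the DAG is unpaused and the scheduler is running. The sketch below assumes the default configuration, in which DAGs start out paused, and uses the example_xcom_args DAG from the commands above; it unpauses the DAG and then lists its runs so we can see their state.

```shell
DAG_ID="example_xcom_args"

if command -v airflow >/dev/null 2>&1; then
    # Unpause the DAG so the scheduler is allowed to execute its runs
    airflow dags unpause "$DAG_ID"
    # Show the runs recorded for this DAG and the state of each one
    airflow dags list-runs -d "$DAG_ID"
else
    echo "airflow CLI not found; activate the virtual environment first"
fi
```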
In this blog, we saw how to install Airflow locally on your system using the command-line interface. We also saw how to create the first user for the Airflow instance and what roles a user can have. Lastly, we went through some basic Airflow commands.
Stay tuned for more blogs on: https://blog.knoldus.com/