In this blog we will learn how to set up an Airflow environment using Docker.
Why do we need Docker?
Apache Airflow is an open-source project that grows at an overwhelming pace. Looking at the Airflow GitHub repository, there are 632 contributors, 98 releases, more than 5000 commits, and the last commit was 4 hours ago. That means Airflow gets new commits every day and releases constantly, so managing and maintaining different versions of Airflow is already a challenge.
On top of that, Airflow is built to integrate with all kinds of databases, systems, and cloud environments, which brings challenges of its own:
- Managing and maintaining all of the dependency changes is really difficult.
- It takes a lot of time to set up and configure an Airflow environment.
- It is hard to share identical development and production environments with all developers.
Moreover, if you miss one installation step, you may have to clear everything and start over. All of these challenges together give us the motivation to use Docker.
In simple words, Docker is a software containerization platform: you can build your application, package it along with its dependencies into a container, and then ship that container to run on other machines.
Okay, but what is Containerization anyway?
Containerization, also called container-based virtualization or application containerization, is an OS-level virtualization method for deploying and running distributed applications without launching an entire VM for each application. Instead, many isolated systems, called containers, run on a single control host and access a single kernel.
Benefits of using Docker:
- Docker frees us from the task of managing, maintaining, and deploying all of the Airflow dependencies.
- Easy to share and deploy different versions and environments.
- Keep track of versions through GitHub tags and releases.
- Ease of deployment from testing to production environment.
The setup itself boils down to three steps:
- Install the prerequisites
- Run the service
- Check http://localhost:8080
To deploy Airflow on Docker Compose, you should first fetch docker-compose.yaml:
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.2.3/docker-compose.yaml'
This downloads docker-compose.yaml into the current directory.
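For orientation, here is a heavily trimmed sketch of what that file defines. The service names below come from the official 2.2.3 compose file; almost all other details (environment variables, volumes, health checks, the `x-airflow-common` anchor) are omitted here:

```yaml
# Abbreviated sketch of the fetched docker-compose.yaml -- not the full file.
version: '3'
services:
  postgres:            # metadata database
    image: postgres:13
  redis:               # message broker for the Celery executor
    image: redis:latest
  airflow-webserver:   # serves the UI on port 8080
    ports:
      - 8080:8080
  airflow-scheduler:   # schedules and triggers DAG runs
  airflow-worker:      # Celery worker that executes the tasks
  airflow-init:        # one-off service: DB migrations and first user
```

In other words, the single file wires up the database, the broker, and every Airflow component, which is exactly the dependency management Docker is taking off our hands.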
Setting the right Airflow user
On Linux, the quick-start needs to know your host user id and needs to have the group id set to 0. Otherwise the files created in dags, logs and plugins will be owned by the root user. You have to make sure to configure them for docker-compose:
mkdir -p ./dags ./logs ./plugins
echo -e "AIRFLOW_UID=$(id -u)" > .env
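As a quick sanity check (my own addition, not part of the official quick-start), you can confirm that the mounted folders and the .env file were created correctly:

```shell
# Create the mounted folders and the .env file, then verify them.
mkdir -p ./dags ./logs ./plugins
echo "AIRFLOW_UID=$(id -u)" > .env

ls -d ./dags ./logs ./plugins   # the three folders Airflow will mount
grep '^AIRFLOW_UID=' .env       # the host user id Airflow will run as
```

If the grep prints a line like AIRFLOW_UID=1000, the containers will create files under your own user instead of root.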
Initialize the Database
On all operating systems, you need to run database migrations and create the first user account. To do so, run:
docker-compose up airflow-init
After initialization is complete, you should see a message like below.
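The screenshot is not reproduced here, but per the official quick-start the initialization ends with output along these lines (the version and container name may differ on your machine):

```
airflow-init_1       | Upgrades done
airflow-init_1       | Admin user airflow created
airflow-init_1       | 2.2.3
start_airflow-init_1 exited with code 0
```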
Now you can start all the services:
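The command for this step seems to have been lost with the screenshot; assuming the standard quick-start flow, starting everything is a single Docker Compose call:

```shell
# Start all services defined in docker-compose.yaml.
# Add -d to run them in the background and get your terminal back.
docker-compose up

# In another terminal, check that the containers are up and healthy.
docker-compose ps
```

The webserver can take a minute or two to pass its health check, so give it a moment before opening the browser.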
Now go to http://localhost:8080 and you will be presented with the Airflow login screen.
Log in with your credentials. If you are logging in for the first time, the defaults are:
Id: airflow
Password: airflow
If you see the Airflow UI when accessing localhost on port 8080, that means Airflow has been installed on your system.
That’s it, folks. I hope you liked the blog. Thanks.
To read more tech blogs, visit Knoldus Blogs.