Use cases of Apache Airflow

Apache Airflow’s versatility lets you set up almost any type of workflow. It can run ad hoc workloads that are not tied to any interval or schedule, but it is best suited to pipelines that change slowly, relate to a specific time interval, or run on a predefined schedule.

Adobe

Adobe is a software company famously known for multimedia and creativity products such as Acrobat Reader and Photoshop. The Adobe Experience Platform uses Apache Airflow’s plugin interface to write custom operators.

What was the problem?

Apache Airflow is highly extensible, and its plugin interface is used to meet a variety of use cases. It supports a variety of deployment models and has a very active community that helps scale innovation.

Big data platforms require many data pipelines connecting to backend services to enable complex workflows. These workflows need to be deployed, monitored, and run either on regular schedules or triggered by external events.

How did Apache Airflow help to solve this problem?

Adobe Experience Platform built an orchestration service to meet its users’ and customers’ requirements. It is architected around a guiding principle: leverage an off-the-shelf, open-source orchestration engine that is abstracted from other services through an API and extendable to any application through a pluggable framework.

The Adobe Experience Platform orchestration service leverages the Apache Airflow execution engine to schedule and execute its workflows. Apache Airflow is highly extensible, and with the Kubernetes Executor each task runs in its own pod, so execution scales with demand. It also has a rich web UI that provides various workflow-related insights.

What are the results after Apache Airflow?

Adobe Experience Platform uses Apache Airflow’s plugin interface to write custom operators that meet its use cases. With the Kubernetes Executor, it can scale to run thousands of concurrent workflows. The Adobe and Adobe Experience Platform teams can focus on business use cases because all scheduling, dependency management, and retry logic is offloaded to Apache Airflow.
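
For a rough illustration of what writing such an operator involves (the class name, parameter, and logging below are invented for this sketch, not Adobe’s actual code), a custom operator is just a Python class that subclasses BaseOperator and implements execute():

```python
from airflow.models.baseoperator import BaseOperator


class IngestDatasetOperator(BaseOperator):
    """Hypothetical operator that hands a dataset to a backend service."""

    def __init__(self, dataset_id: str, **kwargs):
        super().__init__(**kwargs)
        self.dataset_id = dataset_id

    def execute(self, context):
        # A real implementation would call the platform's backend API here.
        self.log.info("Ingesting dataset %s", self.dataset_id)
        return self.dataset_id
```

Classes like this can be shipped through Airflow’s plugin mechanism or simply as an importable Python package; either way, DAG authors use them like any built-in operator.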

Experity

Experity, Inc. provides health care software and solutions. The company designs and produces electronic medical record and practice management software for workflow management, electronic billing, accounts receivable reporting, and insurance billing for urgent care facilities.

Airflow can be an enterprise scheduling tool if used properly. Its ability to run “any command, on any node” is amazing. Handling complex, mixed-mode tasks was easy, and scaling out with Celery workers was huge. The open-source community is great, and they can diagnose and debug their own problems as well as contribute fixes back to the greater good.

What was the problem?

They had to deploy their complex flagship app to multiple nodes in multiple ways. This required tasks to communicate across Windows nodes and to coordinate timing perfectly. They did not want to buy an expensive enterprise scheduling tool and needed ultimate flexibility.

How did Apache Airflow help to solve this problem?

Flexible, multi-node, DAG-capable tooling was key, and Airflow was one of the few tools that fit the bill. Being open source and Python-based were large factors that upheld their core principles. At the time, Airflow was missing a Windows hook and operator, so they contributed the WinRM hook and operator back to the community. They also used metadata-driven DAG generators to keep maintenance costs down, as shown in the sketch below.
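
As a hedged sketch of those two ideas together (the connection IDs, hostnames, and the app_metadata list are illustrative, not Experity’s actual configuration), the community WinRM provider can be combined with a metadata-driven DAG generator:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.microsoft.winrm.hooks.winrm import WinRMHook
from airflow.providers.microsoft.winrm.operators.winrm import WinRMOperator

# Illustrative metadata; in practice it could live in a config file or database.
app_metadata = [
    {"dag_id": "deploy_node1", "conn": "winrm_node1", "cmd": "powershell -File C:\\deploy\\deploy.ps1"},
    {"dag_id": "deploy_node2", "conn": "winrm_node2", "cmd": "powershell -File C:\\deploy\\deploy.ps1"},
]

for meta in app_metadata:
    with DAG(
        dag_id=meta["dag_id"],
        start_date=datetime(2023, 1, 1),
        schedule=None,  # deployments are triggered, not interval-based
        catchup=False,
    ) as dag:
        # Run a command on a remote Windows node over WinRM.
        WinRMOperator(
            task_id="run_remote_command",
            winrm_hook=WinRMHook(ssh_conn_id=meta["conn"]),
            command=meta["cmd"],
        )
    # Expose each generated DAG at module level so the scheduler discovers it.
    globals()[meta["dag_id"]] = dag
```

Registering each generated DAG in globals() is the conventional way to let the scheduler discover DAGs created in a loop, so adding a node becomes a one-line metadata change.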

What are the results after Apache Airflow?

A flexible deployment framework allows them to be as nimble as possible. The reliability is something they have grown to trust, as long as the tool is used correctly. The scalability has also allowed them to decrease the time it takes to operate on their fleet of servers.

Plarium

Plarium is a gaming web platform. It provides over 20 games, including Vikings: War of Clans, the Stormfall franchise, and Raid: Shadow Legends. Creating a cross-platform gaming platform requires sophisticated workflow orchestration for solving tasks related to game development.

Apache Airflow helps efficiently tackle crucial game dev tasks, such as working with churn or sorting bank offers.

What was the problem?

Their Research & Development department carries out various experiments, and in all of them they need to create workflow orchestration for game-dev tasks. They had to orchestrate processes manually, entirely from scratch, every time. This led to difficulties with dependencies and monitoring when building complex workflows. They needed a centralised approach where they could monitor all logs, the number of retries, and task execution times. The most important missing capabilities were backfilling historical data and restarting failed tasks.
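
For context on those missing pieces, here is a minimal sketch of how Airflow expresses retries and backfill declaratively (the DAG ID, schedule, and task below are invented for illustration):

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

# Illustrative DAG; names and the schedule are assumptions, not Plarium's code.
with DAG(
    dag_id="churn_report",
    start_date=datetime(2023, 1, 1),  # a past start date plus catchup=True backfills history
    schedule="@daily",                # Airflow 2.4+ keyword (schedule_interval in older versions)
    catchup=True,
    default_args={
        "retries": 3,                        # retries handled centrally by Airflow
        "retry_delay": timedelta(minutes=5),
    },
) as dag:
    PythonOperator(
        task_id="compute_churn",
        python_callable=lambda: print("computing churn..."),
    )
```

Failed runs can also be cleared and re-run from the web UI, and the airflow dags backfill CLI command replays a historical date range on demand.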

How did Apache Airflow help to solve this problem?

Apache Airflow offers lots of convenient built-in solutions, including integrations. The DAG model helps avoid errors and follow general patterns when building workflows. In addition, the platform has a large community that provides plenty of sensors and operators, covering 90% of their cases and saving them loads of time.
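
For example, a built-in sensor can replace hand-rolled polling (the file path, DAG ID, and tasks below are invented for illustration):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.sensors.filesystem import FileSensor

# Illustrative DAG: wait for an upstream export, then process it.
with DAG(
    dag_id="bank_offers",
    start_date=datetime(2023, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    wait_for_export = FileSensor(
        task_id="wait_for_export",
        filepath="/data/exports/offers.csv",  # assumed path
        poke_interval=60,                     # check every minute
    )
    sort_offers = PythonOperator(
        task_id="sort_offers",
        python_callable=lambda: print("sorting offers..."),
    )
    wait_for_export >> sort_offers
```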

What are the results after Apache Airflow?

Apache Airflow managed to simplify the process of building complex workflows. Many procedures that are so important for game development, such as working with the churn rate, processing messages to the support team, and sorting bank offers, now run efficiently, and all issues are resolved centrally.

Dish

DISH was founded on adventure and an unshakeable desire to win. It’s what drove the company to launch satellites into space when people said it couldn’t, to take on the world’s largest industrial corporation when people said it shouldn’t, and to connect millions of Americans to the TV they love when the cable companies wouldn’t.

Airflow is batteries-included, with a great ecosystem and community that come together to address just about any batch data pipeline need.

What was the problem?

Managing lengthy crontabs was becoming increasingly complex. Scheduling was an issue: timing had to be planned carefully around resource constraints and usage patterns, and custom code was needed for retry logic and to verify the success of previous jobs or steps before running the next. Furthermore, although time to results is important, they relied increasingly on buffers between processing stages, where work effectively sat idle waiting for the next stage, in an effort to avoid even more custom code and logic.

How did Apache Airflow help to solve this problem?

Relying on community-built and existing hooks and operators for the majority of the cloud services they use has allowed them to focus on business outcomes rather than operations.
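
As a sketch of what that looks like (the bucket, schema, and connection IDs below are assumptions; the blog does not name Dish’s actual providers), a community transfer operator replaces what would otherwise be custom load code:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.transfers.s3_to_redshift import S3ToRedshiftOperator

# Illustrative pipeline: copy hourly event files from S3 into Redshift.
with DAG(
    dag_id="load_events",
    start_date=datetime(2023, 1, 1),
    schedule="@hourly",
    catchup=False,
) as dag:
    S3ToRedshiftOperator(
        task_id="s3_to_redshift",
        s3_bucket="example-raw-events",     # assumed bucket
        s3_key="events/{{ ds }}/",          # templated per run date
        schema="analytics",
        table="events",
        copy_options=["FORMAT AS PARQUET"],
        redshift_conn_id="redshift_default",
        aws_conn_id="aws_default",
    )
```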

What are the results after Apache Airflow?

Airflow helps manage many of these pain points. They were able to reduce end-to-end delivery time of data products by making their processing flows event-driven, cutting out on average over two hours of waiting between stages. They can also arrive at and iterate on products more quickly because far less custom, roll-your-own code is needed. The code base has become smaller and simpler, it is easier to follow, and to a large extent the DAGs serve as sufficient documentation for new contributors to understand what is going on.
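
The blog does not say which mechanism made the flows event-driven; one way to achieve it in Airflow 2.4+ is data-aware scheduling with Datasets, where a consumer DAG runs as soon as a producer updates a dataset instead of waiting on a timed buffer. A minimal sketch with illustrative URIs and DAG IDs:

```python
from datetime import datetime

from airflow import DAG
from airflow.datasets import Dataset
from airflow.operators.python import PythonOperator

# Illustrative dataset URI; any string URI identifying the data works.
raw_events = Dataset("s3://example-bucket/raw/events.parquet")

with DAG(
    dag_id="produce_events",
    start_date=datetime(2023, 1, 1),
    schedule="@hourly",
    catchup=False,
) as producer:
    PythonOperator(
        task_id="land_events",
        python_callable=lambda: print("landing events..."),
        outlets=[raw_events],  # marks the dataset as updated on success
    )

with DAG(
    dag_id="process_events",
    start_date=datetime(2023, 1, 1),
    schedule=[raw_events],  # runs as soon as the upstream dataset updates
    catchup=False,
) as consumer:
    PythonOperator(
        task_id="transform",
        python_callable=lambda: print("transforming..."),
    )
```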

Conclusion

Apache Airflow has been a leading workflow management tool since its introduction. It combines ease of use, high-level functionality, and much more under a single platform.

Other use cases exist as well; for more, please refer to the following:

https://airflow.apache.org/use-cases/

For more blogs, please check:

https://blog.knoldus.com/