Airflow and Spark Orchestration on Kubernetes

Scaling Airflow sounds appealing when we want to grow and run as many tasks as possible in parallel. Here I present an Airflow setup for scaling out, along with some Spark ETL jobs run from Airflow. The source code for this article is here.

There are several ways to scale Airflow workers. One of the best is to use a Celery queue (the Celery Executor). Another is the Kubernetes Executor, but Argo is a better solution if Kubernetes itself is going to be used for workflow management.
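As a rough illustration of how the choice between these two executors is expressed in configuration, here is a minimal sketch. It relies on Airflow's convention of reading settings from environment variables named `AIRFLOW__{SECTION}__{KEY}`. The helper `executor_env` and the example connection strings are hypothetical and are not taken from this article's source code.

```python
from typing import Dict, Optional

# The two executors discussed above; Airflow ships others as well.
SCALABLE_EXECUTORS = {"CeleryExecutor", "KubernetesExecutor"}


def executor_env(
    executor: str,
    broker_url: Optional[str] = None,
    result_backend: Optional[str] = None,
) -> Dict[str, str]:
    """Build the environment variables that select an Airflow executor.

    Hypothetical helper for illustration only.
    """
    if executor not in SCALABLE_EXECUTORS:
        raise ValueError(f"unsupported executor: {executor}")

    # [core] executor in airflow.cfg maps to AIRFLOW__CORE__EXECUTOR.
    env = {"AIRFLOW__CORE__EXECUTOR": executor}

    if executor == "CeleryExecutor":
        # Celery workers need a message broker and a result backend.
        if not broker_url or not result_backend:
            raise ValueError("CeleryExecutor needs broker_url and result_backend")
        env["AIRFLOW__CELERY__BROKER_URL"] = broker_url
        env["AIRFLOW__CELERY__RESULT_BACKEND"] = result_backend

    return env


if __name__ == "__main__":
    # Example values are assumptions; substitute your own services.
    for key, value in executor_env(
        "CeleryExecutor",
        broker_url="redis://redis:6379/0",
        result_backend="db+postgresql://airflow:airflow@postgres/airflow",
    ).items():
        print(f"{key}={value}")
```

The resulting variables would be set on the scheduler, webserver, and worker containers alike, so that every component agrees on which executor is in use.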

Here I am going to explain how to scale Spark-on-Kubernetes tasks in Airflow:
