Back when I first started at Astronomer, the world was just beginning to pay attention to Apache Airflow, the open-source workflow management system created by the data engineering team at Airbnb in 2015. While a myriad of tools, both proprietary and open-source, had come onto the scene in recent years, flexible, code-based orchestration remained a second-class citizen; drag-and-drop tools catering to the business user were too inflexible, and proprietary schedulers offered by companies like Informatica, Talend, and IBM were way overpriced and required deeper lock-in to their expensive web of services. People wanted to be able to do code-driven ETL, and they wanted to do it with open-source tools.
Thus, Airflow swept onto the scene with some momentum that has continued to snowball over the past few years.
While Airflow has come a long way, it began as all open-source projects do: young and buggy. Particularly in its early days, it needed some thought leadership to surface users' main pain points to the Apache PMC members and the larger community actively developing the project. Everyone we spoke to about Airflow was convinced it had major potential, but everyone also felt that there was much room for improvement (as one of our customers once said, the problem with being on the bleeding edge is that you have to bleed). My colleague and I thought that, if we could have some open discussion with Airflow power users about its direction and flaws, it might help establish a more solid sense of community around the project and push it in the right direction.
The podcast itself is a blast to produce, and we're actively working on it every day. We've done 10 episodes at the time of this write-up and have amassed over 14,000 views. I'm responsible for booking and moderating interviews, as well as all of the audio production and mixing. We've had some really awesome guests come on the show, including Airflow creator Maxime Beauchemin, Luigi creator and CTO of Better Mortgage Erik Bernhardsson, and CTO of Advanced Analytics at ING Bolke de Bruin.