
Orchestrating ELT with Prefect and dbt — a Flow of Flows (Part 1)

How to manage dependencies between data pipelines

Anna Geller
8 min read · Nov 16, 2021
Illustration of an ELT flow using Prefect and dbt

Workflow orchestration platforms have historically let you manage task dependencies within individual data pipelines. That is a good start, but what if you have dependencies between data pipelines?

Say you have some flows or directed acyclic graphs (DAGs) that ingest operational data from various sources into the staging area of your data warehouse. You then want to build some business logic downstream, but only after the previous pipelines are completed.
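To make the scenario concrete, here is a minimal sketch in plain Python. It is not Prefect's API and not the implementation from this demo: it only illustrates the idea that a downstream pipeline should start after all of its upstream pipelines have finished. The pipeline names (`ingest_crm`, `ingest_payments`, `business_logic`) and the `run_pipeline` helper are hypothetical placeholders, and the ordering relies on the standard library's `graphlib` (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline names, invented for illustration only.
# Each key is a pipeline; its value is the set of pipelines that must finish first.
PIPELINE_DEPENDENCIES = {
    "ingest_crm": set(),
    "ingest_payments": set(),
    "business_logic": {"ingest_crm", "ingest_payments"},
}


def run_pipeline(name: str) -> str:
    """Stand-in for triggering a real pipeline run and waiting until it finishes."""
    print(f"Running pipeline: {name}")
    return name


def run_in_dependency_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Run each pipeline only after all of its upstream pipelines have completed."""
    completed = []
    # static_order() yields every pipeline after all of its predecessors.
    for name in TopologicalSorter(dependencies).static_order():
        completed.append(run_pipeline(name))
    return completed


execution_order = run_in_dependency_order(PIPELINE_DEPENDENCIES)
```

In this sketch the two ingestion pipelines always run before `business_logic`. A real orchestrator would additionally need to handle failures, retries, and parallel runs, which this toy example leaves out.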

Many frameworks either try to avoid the issue or provide half-baked solutions, such as offering only a visual grouping of tasks without actually treating those as individual first-class objects. But the underlying problem is real, and it affects nearly every company doing analytics at scale.

Table of contents:
· Describing the problem
· The desired solution
· Demo: orchestrating EL pipelines and dbt transformations with Prefect
· Preparing the environment
· Extract and load flow
· dbt flow
· Dashboards flow
· A flow of flows: where the real orchestration happens
· Extending the flow of flows to new use cases
· Next steps

Describing the problem
