Orchestrating ELT with Prefect and dbt — a Flow of Flows (Part 1)
How to manage dependencies between data pipelines
Workflow orchestration platforms have historically let you manage task dependencies within individual data pipelines. That is a good start, but what if you have dependencies between data pipelines?
Say you have some flows or directed acyclic graphs (DAGs) that ingest operational data from various sources into the staging area of your data warehouse. You then want to build some business logic downstream, but only after those upstream pipelines have completed.
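To make the scenario concrete, here is a minimal sketch in plain Python (standard library only, Python 3.9+) of what "run the downstream pipeline only after the upstream ones finish" means. It is not Prefect code: the pipeline names, the `dependencies` mapping, and the `run_pipeline` placeholder are all hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical pipelines: each key maps to the set of pipelines it must wait for.
dependencies = {
    "ingest_orders": set(),
    "ingest_customers": set(),
    "business_logic": {"ingest_orders", "ingest_customers"},
}


def run_pipeline(name: str) -> str:
    # Placeholder: a real orchestrator would trigger an actual pipeline run here.
    return f"{name}: done"


def run_all(deps: dict) -> list:
    # static_order() yields every pipeline only after all of its dependencies.
    results = []
    for name in TopologicalSorter(deps).static_order():
        results.append(run_pipeline(name))
    return results


if __name__ == "__main__":
    for line in run_all(dependencies):
        print(line)
```

The two ingestion pipelines may run in either order, but `business_logic` always comes last. An orchestrator that treats pipelines as first-class objects would give you this ordering, plus scheduling, retries, and visibility, without hand-written glue like the above.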
Many frameworks either try to avoid the issue or provide half-baked solutions, such as offering only a visual grouping of tasks without treating those groups as first-class objects. But the underlying problem is real, and it affects nearly every company doing analytics at scale.
Table of contents:
· Describing the problem
· The desired solution
· Demo: orchestrating EL pipelines and dbt transformations with Prefect
· Preparing the environment
· Extract and load flow
· dbt flow
· Dashboards flow
· A flow of flows: where the real orchestration happens
· Extending the flow of flows to new use cases
· Next steps