The Data Stack Unbundling vs. Rebundling Debate Missed the Point
Do you have to choose?
Tools in the contemporary data landscape have become increasingly polarized. At one end of the spectrum are products highly specialized in one area, such as data ingestion, transformation, scheduling, cataloging, experiment tracking, or alerting. At the other end are tools that attempt to rebundle all the pieces of the data stack into a single product.
Tools that fall into those extreme categories tend to see the world in black and white and force you to make either-or decisions. Either you choose a single product to manage the entire end-to-end lifecycle of your dataflow, or you lose data lineage and observability. Either you switch to a single product and vendor, or you end up with data stack fragmentation and chaos.
Coordination plane instead of a control plane
Prefect 2.0 provides an alternative to either-or decisions: a coordination plane that can simultaneously orchestrate your dataflow and observe the state of the parts of your data stack that live outside of that orchestration. You don’t have to change how you work or rework your data stack just to gain orchestration, lineage, and observability. You can have the best of both worlds without compromises.
This article explains how that’s possible:
(Re)Introducing Prefect: The Global Coordination Plane
Today, our team is very pleased to announce the public release of Prefect 2.0 and Prefect Cloud 2.0, along with our…
The above post is not just an announcement of Prefect 2.0, which was launched yesterday, but one of the most balanced perspectives on the future of the data stack I’ve seen so far. Here is how the author describes the problem:
“It’s unreasonable to presume that a single orchestration plane will ever be able to control dataflow across an entire stack. […] Not only would this be an extraordinary waste of time, but this forced re-bundling of every data application back into the orchestrator would be profoundly un-modern.”
The proposed alternative is centered around observing the state of your data stack (regardless of which tools you end up using!) and leveraging the state observed by your coordination plane to drive orchestration:
“It turns out that the solution to this problem isn’t to redefine and force all dataflow to pass through the orchestrator, but rather to enable a more passive mode of collecting information, one that lets the software observe dataflow as it moves through the stack without controlling it.”
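To make the idea of "observing without controlling" more concrete, here is a minimal sketch in plain Python. It is not Prefect's API: the `CoordinationPlane` class, its `when` and `observe` methods, and the system and state names are all hypothetical, chosen only to illustrate how state reported by external tools could drive downstream work.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class CoordinationPlane:
    """Hypothetical illustration only, not Prefect's actual API."""

    # Latest state reported by each external system, e.g. {"ingestion": "COMPLETED"}
    observed: Dict[str, str] = field(default_factory=dict)
    # Maps (system, state) to the callbacks to run when that state is observed
    reactions: Dict[Tuple[str, str], List[Callable[[], None]]] = field(
        default_factory=dict
    )

    def when(self, system: str, state: str, callback: Callable[[], None]) -> None:
        """Register work to run once `system` is observed in `state`."""
        self.reactions.setdefault((system, state), []).append(callback)

    def observe(self, system: str, state: str) -> None:
        """Record a state reported by an external tool and run matching reactions.

        The plane never starts or stops the external tool; it only listens.
        """
        self.observed[system] = state
        for callback in self.reactions.get((system, state), []):
            callback()


plane = CoordinationPlane()
triggered: List[str] = []

# Assumed scenario: an ingestion tool runs outside the orchestrator,
# and a transformation step should start once ingestion completes.
plane.when("ingestion", "COMPLETED", lambda: triggered.append("transformation"))

plane.observe("ingestion", "RUNNING")    # no reaction registered for this state
plane.observe("ingestion", "COMPLETED")  # runs the registered reaction
```

In this sketch the ingestion tool keeps running wherever it already runs. It only reports its state, and the coordination plane reacts to what it sees.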
With that approach, you can coordinate any (not only modern) data stack. You can scale and adapt your data platform to unknown future needs without switching costs and compromises. You can finally coordinate work between teams without having to agree on a single monolithic orchestration plane that would only lead to conflicts, friction, and unhealthy tradeoffs where everyone loses. And you don’t even have to think about unbundling or rebundling. You don’t have to choose.