Published in Towards Data Science · 6 days ago
Does the Modern Data Stack Value the “Stack” Over “Data”?
We hear more about tools than about using them the right way
There have been many debates lately about unbundling or rebundling, which led to some arguments over who’s gonna win the market for any given piece of the data stack. But those debates hide an important fact: tools are just tools. …
Modern Data Stack · 4 min read
Jul 28
The Unbundling vs Rebundling of a Data Stack Debate Missed The Point
Do you have to choose?
Many tools in the contemporary data landscape started to become increasingly polarized. On the one hand, there are products highly specialized in one specific area, such as data ingestion, transformation, scheduling, cataloging, experiment tracking, alerting, etc. …
Data · 3 min read
Published in The Prefect Blog · Jul 25
Serverless Real-Time Data Pipelines on AWS with Prefect, ECS and GitHub Actions
A guide to fully automated serverless real-time data pipelines
Most data platforms these days are still operated using batch processing. Even though streaming technology matured, building automated and reliable real-time data pipelines is still difficult and often requires a team of engineers to operate the underlying platform. But it doesn’t have to be that way. We wrote about it…
Data Engineering · 15 min read
Published in The Prefect Blog · Jul 9
What is a Community Engineer in an Open-Source-Product Company
And why you may consider applying for this role at Prefect
There are many misconceptions about who a community engineer is. This blog post defines such a person as “someone who figures out how to help a community of people do what they want with what they have.” …
Data Engineering · 7 min read
Published in Towards Data Science · May 2
Workflow Orchestration vs. Data Orchestration — Are Those Different?
Let’s disambiguate the terms to understand workflow orchestration better, with a real-life analogy!
With the rise of the Modern Data Stack, many tools in the industry started positioning themselves as “data orchestrators” rather than “workflow orchestrators.” This article attempts to disambiguate the terms. …
Workflow Automation · 6 min read
Published in The Prefect Blog · Feb 28
Say Hi to Prefect Discourse — a New Forum for Data Engineers
Why we chose Discourse and how you can benefit from this new Prefect Community forum
We're excited to announce that we have launched Prefect Discourse! Discourse is a forum for online communities that provides an unmatched level of flexibility and customization. Going forward, we will use Discourse as a knowledge base that we can grow together as a community. It's a safe place to ask…
Python · 7 min read
Published in In Fitness And In Health · Feb 16 · Member-only
How to Look After Yourself When Treating SARS-CoV-2 at Home
Sharing tips that may help you recover faster when treating SARS-CoV-2 at home
This article is entirely off-topic. I usually write about data engineering, Python, serverless, and AWS. But I’ve recently got (and recovered from) SARS-CoV-2, and here are some tips that hopefully can help you mitigate the symptoms and recover faster in case you catch this terrible virus. Note: I’m no physician…
Health · 4 min read
Published in The Prefect Blog · Feb 15
How to Use Prefect and Monte Carlo to Achieve More Reliable Data Pipelines
Introducing Monte Carlo data lineage tasks in Prefect
As recently announced, Prefect has a brand-new integration with Monte Carlo, a leading platform that adds observability features to your data warehouse. This hands-on post will dive into what Monte Carlo is and how to use it to add even more observability to your Prefect flows. You’ll learn the…
Data Engineering · 8 min read
Published in The Prefect Blog · Jan 25
How to Make Your Data Pipelines More Dynamic Using Parameters in Prefect
How to pass runtime-specific parameter values to your data pipelines
Parametrization is one of the most critical features of any modern workflow orchestration solution. It allows you to dynamically overwrite parameter values for a given run without having to redeploy your workflow. Most orchestration frameworks provide rather limited functionality in that regard, such as only allowing to override global variables…
Data Engineering · 9 min read
Published in The Prefect Blog · Jan 10
Orchestrating ELT with Prefect, dbt Cloud, and Snowflake (Part 3)
How to use Prefect and dbt Cloud with a Snowflake data warehouse
This is the third post in a series of articles about orchestrating ELT data pipelines with Prefect. The first post dealt with organizing a project and orchestrating flows in a local environment. The second one discussed deploying the ELT project to Snowflake, AWS EKS, and building a CI/CD process. …
Data Engineering · 7 min read