Scheduled Data Pipelines in 5 Minutes with Prefect and GitHub Actions
The easiest way to get started with scheduled serverless workflows built in Python
Scheduling is a critical component of any data platform. Whether you are running nightly ETL jobs, Sunday-night maintenance scripts, or triggering high-velocity workflows every couple of minutes, some data workloads need to run at the right time. This holds true even as teams move toward (near) real-time data ingestion. Because of its importance, scheduling has become table stakes: countless enterprise tools and open-source frameworks let you run work on a schedule, but they are often difficult to use and maintain. Prefect takes a different approach: flows can be triggered from anywhere, so if you are not ready to fully migrate to scheduling your workflows with Prefect deployments, you don’t have to.
In this post, we’ll demonstrate the easiest way to run scheduled serverless workflows. Using the combination of Prefect and GitHub Actions, you’ll be able to schedule your first Python-based workflows running in the cloud in under 5 minutes.
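To make the idea concrete before we dive in, here is a minimal sketch of what the GitHub Actions side of such a setup could look like: a workflow file that runs a Python flow script on a cron schedule. The file path, script path, cron expression, and Python version are all placeholder assumptions for illustration, not the exact setup built later in this post.

```yaml
# .github/workflows/scheduled-flow.yml  (hypothetical filename)
name: scheduled-prefect-flow

on:
  schedule:
    - cron: "0 9 * * *"   # assumed example: every day at 09:00 UTC
  workflow_dispatch:       # also allow manual runs from the Actions tab

jobs:
  run-flow:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install prefect
      - run: python flows/my_flow.py   # hypothetical path to your flow script
```

Note that GitHub Actions cron schedules are defined in UTC, and the `workflow_dispatch` trigger is a convenient way to test the job without waiting for the schedule to fire.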
Getting started with scheduled workflows
This demo follows a top-down approach. First, we’ll build something (just five minutes, as promised!)…