
Scheduled Data Pipelines in 5 Minutes with Prefect and GitHub Actions

The easiest way to get started with scheduled serverless workflows built in Python

Anna Geller
5 min read · Sep 27, 2022
Prefect as a bot in space: scheduling, coordinating, and observing dataflow in the galaxy

Scheduling is a critical component of any data platform. Whether you are running nightly ETL jobs, Sunday-night maintenance scripts, or triggering high-velocity workflows every couple of minutes, some data workloads simply need to run at the right time. This holds true even when moving to (near) real-time data ingestion. Because of its importance, scheduling has become table stakes: there are countless enterprise tools and open-source frameworks for running work on a schedule, but they are often difficult to use and maintain. Prefect takes a different approach: flows can be triggered from anywhere, so if you are not ready to fully migrate to scheduling your workflows with Prefect deployments, you don't have to.

In this post, we’ll demonstrate the easiest way to run scheduled serverless workflows. Using the combination of Prefect and GitHub Actions, you’ll be able to schedule your first Python-based workflows running in the cloud in under 5 minutes.
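As a preview of where we are headed, a scheduled GitHub Actions workflow might look roughly like the sketch below. The file path, script name, and schedule are illustrative assumptions, not the article's exact setup; `PREFECT_API_KEY` and `PREFECT_API_URL` are the standard Prefect environment variables for authenticating with Prefect Cloud.

```yaml
# .github/workflows/prefect-flow.yml (illustrative path)
name: prefect-flow
on:
  schedule:
    - cron: "0 2 * * *"  # every night at 02:00 UTC
  workflow_dispatch: {}  # allow manual runs from the Actions tab

jobs:
  run-flow:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - run: pip install prefect
      - run: python flow.py  # illustrative script name
        env:
          PREFECT_API_KEY: ${{ secrets.PREFECT_API_KEY }}
          PREFECT_API_URL: ${{ secrets.PREFECT_API_URL }}
```

GitHub's hosted runners provide the serverless compute, and the cron expression provides the schedule, so no infrastructure of your own is required.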

Getting started with scheduled workflows

This demo follows a top-down approach. First, we’ll build something (just five minutes, as promised!)…


Written by Anna Geller

Data Engineering, AWS Cloud, Serverless & .py. Get my articles via email https://annageller.medium.com/subscribe YouTube: https://www.youtube.com/@anna__geller
