Orchestrating ELT on Kubernetes with Prefect, dbt & Snowflake (Part 2)
A flow of flows: a guide on how to deploy large-scale data pipelines to production
9 min read · Jan 4, 2022
This article is the second in a series of tutorials about orchestrating ELT data pipelines. The first post demonstrated how to organize and orchestrate a variety of flows written by different teams, and how to trigger them in the correct order using Prefect. This post builds on that foundation, covering more advanced use cases and showing how to deploy the entire project to a Kubernetes cluster on AWS.
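As a quick refresher on the flow-of-flows pattern from Part 1, here is a minimal sketch using Prefect 1.x's `create_flow_run` and `wait_for_flow_run` tasks. The flow and project names (`extract-load`, `dbt-transform`, `elt-demo`) are placeholders for illustration, not the actual names used in the series:

```python
from prefect import Flow
from prefect.tasks.prefect import create_flow_run, wait_for_flow_run

# Parent "flow of flows": triggers already-registered child flows
# and enforces their execution order. Names below are hypothetical.
with Flow("parent-elt-flow") as flow:
    el_run_id = create_flow_run(
        flow_name="extract-load", project_name="elt-demo"
    )
    el_done = wait_for_flow_run(el_run_id, raise_final_state=True)

    dbt_run_id = create_flow_run(
        flow_name="dbt-transform", project_name="elt-demo"
    )
    wait_for_flow_run(dbt_run_id, raise_final_state=True)

    # dbt transformations start only after extract & load succeeds
    dbt_run_id.set_upstream(el_done)
```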
Table of contents:
· Snowflake configuration
∘ Creating database credentials
∘ SQLAlchemy connection
∘ Using the connection to load raw data (Extract & Load)
∘ Turning the extract & load script into a Prefect flow
· dbt configuration
· Deploying your flows to a remote Kubernetes cluster on AWS EKS
∘ 1. Building a custom Docker image
∘ 2. Pushing the image to ECR
∘ 3. Creating a demo Kubernetes cluster on AWS EKS
∘ 4. Deploying a Prefect Kubernetes agent
∘ Changing the run configuration in your flows to KubernetesRun
∘ Cleaning up AWS resources that are no longer needed
· Building a repeatable CI/CD process
· Next steps