Incremental Processing using Netflix Maestro and Apache Iceberg
Data Engineering
Exploring Spark Catalog — Mastering Pyspark
Data cataloguing in Spark | by Petrica Leuca | Medium
Streaming from Apache Iceberg - QCon NY 2023
Building Low-Latency and Cost Effective Data Pipelines
Steven Wu @ Apple
red-data-tools/YouPlot: A command line tool that draw plots on the terminal.
Data processing with Spark: data catalog – own your data
Delivering High Quality Analytics at Netflix
Same Data, Sturdier Frame: Layering in Dimensional Data Modeling at Whatnot
Unit Testing for Data Engineers.
r/dataengineering - What did ETL look like before the "modern data stack" was a thing?
Resolving Late Arriving Dimensions
r/dataengineering - Which lakehouse table format do you expect your organization will be using by the end of 2023?
🫡🐳 pedramdb🫡🐳 on Twitter
Data Systems Tend Towards Production
Airbyte Monitoring with dbt and Metabase - Part I | Airbyte
Building a Data Engineering Project in 20 Minutes
r/dataengineering - Has anyone built a data warehouse primarily using Databricks?
The Contract-Powered Data Platform | Buz
The Breakdown: Databricks, Snowflake, and Open Source Positioning in the Data World
Yet another post on Data Contracts - Part 1
The missing piece of the modern data stack
Kicking the tires on dbt Metrics
The modern data experience (w/ Benn Stancil)
Engineers Shouldn’t Write ETL: A Guide to Building a High Functioning Data Science Department | Stitch Fix Technology – Multithreaded
Viewpoint | dbt Docs
Upgrading Data Warehouse Infrastructure at Airbnb
We the purple people
The end of Big Data
Ep 30: The Personal Data Warehouse (w/ Jordan Tigani of MotherDuck)
Microsoft, Google, and the original purple people