Exploring Spark Catalog — Mastering Pyspark
Data cataloguing in Spark | by Petrica Leuca | Medium
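The catalog links above cover the same PySpark surface area. As a quick illustration (not taken from either post), here is a minimal sketch of poking at the catalog from a local session; the table name is made up:

```python
from pyspark.sql import SparkSession

# Local session with the default in-memory catalog; a real deployment
# would typically point at a Hive metastore or an external catalog.
spark = SparkSession.builder.appName("catalog-tour").getOrCreate()

# Register a throwaway temp view so the catalog has something to list.
spark.range(3).createOrReplaceTempView("demo")

print(spark.catalog.currentDatabase())    # 'default'
print(spark.catalog.listDatabases())      # [Database(name='default', ...)]
print(spark.catalog.listTables())         # includes the 'demo' temp view
print(spark.catalog.listColumns("demo"))  # [Column(name='id', dataType='bigint', ...)]

spark.stop()
```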
Streaming from Apache Iceberg - QCon NY 2023
Building Low-Latency and Cost Effective Data Pipelines
Steven Wu @ Apple
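For context, the talk is about consuming Iceberg tables incrementally rather than via full batch scans. A hedged sketch of what that looks like with Spark Structured Streaming (table identifier, timestamp, and checkpoint path are illustrative; assumes the Iceberg Spark runtime is on the classpath):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iceberg-stream").getOrCreate()

# Incrementally read newly committed snapshots from an Iceberg table.
# "stream-from-timestamp" (epoch millis) picks the starting snapshot.
events = (
    spark.readStream
    .format("iceberg")
    .option("stream-from-timestamp", "1672531200000")  # illustrative
    .load("db.events")                                 # illustrative table
)

query = (
    events.writeStream
    .format("console")
    .option("checkpointLocation", "/tmp/ckpt/iceberg-demo")
    .trigger(processingTime="30 seconds")
    .start()
)
query.awaitTermination()
```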
red-data-tools/YouPlot: A command line tool that draws plots on the terminal.
Data processing with Spark: data catalog – own your data
Delivering High Quality Analytics at Netflix
Netflix is a data-driven entertainment company, where analytics are extensively used to make informed decisions on every aspect of the business.
Same Data, Sturdier Frame: Layering in Dimensional Data Modeling at Whatnot
Alice Leach, Lalita Yang, Stephen Bailey | Data Engineering
Unit Testing for Data Engineers.
I know you don't want to, but if you don't I will call your grandma.
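In that spirit, a minimal example of the kind of test the post is asking for: a pure transformation function exercised with pytest. The function and column names are invented for illustration:

```python
import pandas as pd


def dedupe_latest(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only the most recent row per user_id (illustrative transform)."""
    return (
        df.sort_values("updated_at")
          .drop_duplicates("user_id", keep="last")
          .reset_index(drop=True)
    )


def test_dedupe_latest_keeps_newest_row():
    df = pd.DataFrame({
        "user_id": [1, 1, 2],
        "updated_at": ["2023-01-01", "2023-02-01", "2023-01-15"],
    })
    out = dedupe_latest(df)
    assert len(out) == 2
    assert out.loc[out["user_id"] == 1, "updated_at"].item() == "2023-02-01"
```

Run with `pytest`; no grandma required.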
r/dataengineering - What did ETL look like before the "modern data stack" was a thing?
Resolving Late Arriving Dimensions
How to handle late-arriving dimensions.
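A common remedy is the Kimball-style "inferred member": when a fact references a dimension key that has not arrived yet, insert a placeholder row so the fact still joins, then overwrite it when the real attributes show up. A small pandas sketch (all names illustrative):

```python
import pandas as pd

# Dimension and fact tables; is_inferred marks placeholder members.
dim_customer = pd.DataFrame(
    {"customer_id": [1, 2], "name": ["Ada", "Grace"], "is_inferred": [False, False]}
)
facts = pd.DataFrame({"customer_id": [1, 3], "amount": [10.0, 25.0]})

# 1. Find fact keys with no matching dimension row (customer 3 is late).
missing = set(facts["customer_id"]) - set(dim_customer["customer_id"])

# 2. Insert inferred members with unknown attributes so facts still join.
inferred = pd.DataFrame(
    {"customer_id": sorted(missing), "name": "UNKNOWN", "is_inferred": True}
)
dim_customer = pd.concat([dim_customer, inferred], ignore_index=True)

# 3. When the real dimension row finally lands, overwrite the placeholder
#    (a type-1 update) and clear the inferred flag.
late = {"customer_id": 3, "name": "Edsger"}
mask = dim_customer["customer_id"] == late["customer_id"]
dim_customer.loc[mask, ["name", "is_inferred"]] = [late["name"], False]
```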
r/dataengineering - Which lakehouse table format do you expect your organization will be using by the end of 2023?
🫡🐳 pedramdb🫡🐳 on Twitter
“Does anyone here (not vendors) work with CDPs, either traditional or unbundled as part of a data team? What’s the experience been like? How much input did you have in the process?”
Data Systems Tend Towards Production
Data teams have substantially larger influence than a decade ago. The surface area of what can go wrong has grown just as fast.
Airbyte Monitoring with dbt and Metabase - Part I | Airbyte
How to implement an Airbyte Monitoring Dashboard with dbt and Metabase on a locally deployed instance to get an operational view and high-level overview.
Building a Data Engineering Project in 20 Minutes
You'll learn web-scraping real-estate listings, uploading them to S3, processing with Spark and Delta Lake, adding data science with Jupyter, ingesting into Druid, visualising with Superset, and orchestrating everything with Dagster.
r/dataengineering - Has anyone built a data warehouse primarily using Databricks?
The Contract-Powered Data Platform | Buz
The contract-powered data platform is a step towards improving data quality, reducing organizational friction, and automating the toil data teams face. Here's what it looks like and how it works.
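As a concrete (if simplified) illustration of what "contract-powered" can mean at the edge: validate every event against an agreed schema before it lands anywhere downstream. This sketch uses jsonschema; the event shape and field names are invented, not from Buz:

```python
from jsonschema import ValidationError, validate  # pip install jsonschema

# A hypothetical v1 contract for an order_created event.
ORDER_CREATED_V1 = {
    "type": "object",
    "required": ["order_id", "user_id", "amount_cents"],
    "properties": {
        "order_id": {"type": "string"},
        "user_id": {"type": "string"},
        "amount_cents": {"type": "integer", "minimum": 0},
    },
    "additionalProperties": False,
}


def enforce_contract(event: dict) -> bool:
    """Gate events at the boundary: reject anything violating the contract."""
    try:
        validate(event, ORDER_CREATED_V1)
        return True
    except ValidationError:
        return False


assert enforce_contract({"order_id": "o1", "user_id": "u1", "amount_cents": 999})
assert not enforce_contract({"order_id": "o1", "amount_cents": -5})  # missing user_id
```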
The Breakdown: Databricks, Snowflake, and Open Source Positioning in the Data World
This post will explore how Databricks and Snowflake are positioning against one another with a particular focus on using open source as a strategic tool.
Yet another post on Data Contracts - Part 1
Starting with some history
The missing piece of the modern data stack
Our cool new house needs one more plank in its foundation.
Kicking the tires on dbt Metrics
Daily Active YAML is up and to the right
The modern data experience (w/ Benn Stancil)
For most people, the modern data stack isn’t a collection of architectural diagrams; it’s an experience.
Engineers Shouldn’t Write ETL: A Guide to Building a High Functioning Data Science Department | Stitch Fix Technology – Multithreaded
“What is the relationship like between your team and the data scientists?” This is, without a doubt, the question I’m most frequently asked.
There is nothing more soul sucking than writing, maintaining, modifying, and supporting ETL to produce data that you yourself never get to use or consume.
Instead, give people end-to-end ownership of the work they produce (autonomy). In the case of data scientists, that means ownership of the ETL.
Mediocre engineers really excel at building enormously over complicated, awful-to-work-with messes they call “solutions”. Messes tend to necessitate specialization.
most technologies have evolved to a point where they can trivially scale to your needs.
Viewpoint | dbt Docs
In 2015-2016, a team of folks at RJMetrics had the opportunity to observe, and participate in, a significant evolution of the analytics ecosystem. The seeds of dbt were conceived in this environment, and the viewpoint below was written to reflect what we had learned and how we believed the world should be different. dbt is our attempt to address the workflow challenges we observed, and as such, this viewpoint is the most foundational statement of the dbt project's goals.
Upgrading Data Warehouse Infrastructure at Airbnb
This post describes Airbnb’s experience upgrading its data warehouse infrastructure to Spark and Iceberg.
We the purple people
The data world needs more purple people — generalists who can navigate both the business context and the modern data stack. Let's put aside skillset dichotomies, and learn to feel comfortable in the space between.
The end of Big Data
Databricks, Snowflake, and the end of an overhyped era.
Take real-time products, for example. Most businesses have little use for true real-time experiences. But, all else being equal, real-time data is better than latent data. We all have dashboards that update a little too slowly, or marketing emails we wish we could send a little sooner. While these annoyances don’t justify the effort currently required to build real-time pipelines, they do cause small headaches. But if someone came along and offered me a streaming Fivetran, or a reactive version of dbt, I’d take it. If the cost of a real-time architecture was low enough, regardless of the shoehorned use-cases, there’d be no reason to turn it down. And just as we came to rely on Snowflake after we chose it as a better Postgres, I’m certain we’d come to rely on streaming pipelines if they replaced our current batch ones. We’d start doing more real-time marketing outreach, or build customer success workflows around live customer behavior.

Over the next five years, I’d guess that real-time data tools follow this exact path: They’ll finally go mainstream, not because we all discover we need them, but because there will be no reason not to have them. And once we do, we’ll find ways to push them to their limits, just as we did with fast internet connections and powerful browsers.
Ep 30: The Personal Data Warehouse (w/ Jordan Tigani of MotherDuck)
Flipping the vision for "data apps" on its head: what if, instead of having data make round trips to a cloud data warehouse, we just bring the user's data to their machine?
Microsoft, Google, and the original purple people
And, of course, Pokémon.
A Thought of Stream...
...and a three-horse race