Tutorials

How to trigger a spark job from AWS Lambda
Wondering how to execute a Spark job on an AWS EMR cluster in response to a file upload event on S3? Then this post is for you. In this post we go over how to trigger Spark jobs on an AWS EMR cluster using AWS Lambda. The Lambda function executes in response to an S3 upload event. We go over this event-driven pattern with code snippets and set up a fully functioning pipeline.
·startdataengineering.com·
Course Information - Big Data Platforms, Autumn 2021
A free online course from the University of Helsinki, open to everyone, that teaches the basics of programming. The course covers the core ideas of modern programming and the tools used in programming, as well as the design of algorithms. No prior programming knowledge is required to take the course.
·big-data-platforms-21.mooc.fi·
The Apache Cassandra Beginner Tutorial
There are lots of data-storage options available today. You have to choose between managed or unmanaged, relational or NoSQL, write- or read-optimized, proprietary or open-source — and it doesn't end there. Once you begin your search, you will end up in the universe that is database marketing. All of the vendors
·freecodecamp.org·
NoSQL databases sample models: MongoDB, Neo4j, Swagger, Cassandra
Get the sample models for MongoDB, Neo4j, Cassandra, Swagger, Avro, Parquet, Glue, and more! After download, open the models using Hackolade, and learn through the examples how to leverage the modeling power of the software.
·hackolade.com·
How to Put a Database in Kubernetes - DZone Cloud
Learn the key steps of deploying databases and stateful workloads in Kubernetes and meet cloud-native technologies that can streamline Apache Cassandra for K8s.
·dzone.com·
Ultimate CI Pipeline for All of Your Python Projects
Everything you ever wanted for your Python project's continuous integration pipeline, up and running in a matter of minutes.
·towardsdatascience.com·
Starting your journey with Microsoft Azure Data Factory
In this article, we go through the Microsoft Azure Data Factory service, which can be used to ingest, copy, and transform data generated from various data sources.
·sqlshack.com·
Whats the difference between ETL & ELT?
This post goes over what the ETL and ELT data pipeline paradigms are. It addresses the inconsistency in naming conventions and explains what the terms really mean. It ends with a comparison of the two paradigms and shows how to use these concepts to build efficient and scalable data pipelines.
·startdataengineering.com·
Where to validate incoming data?
When you watch the blueprint I also use in my cookbook you see the different phases: Connect, Processing Framework, Store and Buffer. At…
·medium.com·
A Beginner Guide to Airflow
A step-by-step guide on how to start with Airflow: from your local set-up to creating simple tasks.
·medium.com·
How to improve at SQL as a data engineer
Are you disappointed with online SQL tutorials that aren't deep enough? Are you frustrated knowing that you are missing SQL skills, but can't quite put your finger on it? This post is for you. In this post, we go over a few topics that can take your SQL skills to the next level and help you be a better data engineer.
·startdataengineering.com·
6 Key Concepts, to Master Window Functions
In this post, we go over 6 key concepts to help you master window functions. Window functions are one of the most powerful features of SQL. They are very useful in analytics and for performing operations that cannot be done easily with the standard GROUP BY, subqueries, and filters. Despite this, window functions are not used frequently. If you have ever thought 'window functions are confusing', then this post is for you.
·startdataengineering.com·
What are Common Table Expressions(CTEs) and when to use them?
You have heard of Common Table Expressions (CTEs), but are not sure what they are and when to use them. What if you knew exactly what CTEs were and when to use them? In this post, we go over what CTEs are and how their performance compares against subqueries, derived tables, and temp tables, to help you decide when to use them.
·startdataengineering.com·
Designing a Data Project to Impress Hiring Managers
Frustrated that hiring managers are not reading your GitHub projects? Then this post is for you. In this post, we discuss a way to impress hiring managers by hosting a live dashboard with near real-time data. We will also go over coding best practices such as project structure, automated formatting, and testing to make your code professional. By the end of this post, you will have deployed a live dashboard that you can link to your resume and LinkedIn.
·startdataengineering.com·
Data Engineering Project: Stream Edition · Start Data Engineering
Data engineering project for beginners, stream edition. In this post we design and build a simple data streaming pipeline using Apache Kafka, Apache Flink and PostgreSQL DB. We will also review the design and understand some common issues to avoid while building distributed stream processing systems.
·startdataengineering.com·