Found 489 bookmarks
Newest
The Illustrated Transformer
Update: this post has become a book – see LLM-book.com, where Chapter 3 is an updated and expanded version covering the latest Transformer models and how they've evolved in the seven years since the original Transformer (e.g. Multi-Query Attention and RoPE positional embeddings). The previous post looked at attention, a ubiquitous method in modern deep learning that helped improve the performance of neural machine translation. This post looks at the Transformer – a model that uses attention to boost the speed with which these models can be trained. The Transformer outperforms the Google Neural Machine Translation model on specific tasks, but its biggest benefit is how well it lends itself to parallelization; Google Cloud in fact recommends it as the reference model for its Cloud TPU offering. The Transformer was proposed in the paper Attention Is All You Need; a TensorFlow implementation is available as part of the Tensor2Tensor package, and Harvard's NLP group created a guide annotating the paper with a PyTorch implementation. The post deliberately simplifies things and introduces the concepts one by one so they are accessible without in-depth knowledge of the subject matter; a 2025 update adds a free short course with animations. At a high level, the model is a black box: in a machine translation application it takes a sentence in one language and outputs its translation in another (a minimal attention sketch follows this entry).
·jalammar.github.io·
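For quick reference before clicking through: the core operation the post builds up to is scaled dot-product attention. Below is a minimal NumPy sketch of that operation – an illustrative summary under stated assumptions, not code from the post itself; the function names, shapes, and toy sizes are my own.

```python
# Minimal sketch of scaled dot-product attention (illustrative, not the post's reference code).
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (seq_len, d_k) matrices of queries, keys, and values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # similarity of every query to every key
    weights = softmax(scores, axis=-1)  # attention distribution over positions
    return weights @ V                  # weighted sum of the values

# Toy self-attention: 4 tokens with 8-dimensional embeddings (hypothetical sizes).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```

Stacking several of these heads, plus feed-forward layers and positional encodings, gives the encoder/decoder blocks the post walks through.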
Aggregation Theory
The disruption caused by the Internet in industry after industry has a common theoretical basis described by Aggregation Theory.
·stratechery.com·
What Would a Kubernetes 2.0 Look Like
As we approach the 10-year anniversary of the 1.0 release of Kubernetes, let's take stock of the project's successes and failures in the wild, and what would be on a wish list for a Kubernetes 2.0 release.
·matduggan.com·
Real-world engineering challenges: building Cursor
Cursor has grown 100x in load in just a year, sees 1M+ QPS on its data layer, and serves billions of code completions daily. A deep dive into how it's built, with cofounder Sualeh Asif.
·newsletter.pragmaticengineer.com·
OpenAI: Scaling PostgreSQL to the Next Level
At the PGConf.dev 2025 Global Developer Conference, Bohan Zhang from OpenAI shared OpenAI's best practices with PostgreSQL, offering a glimpse into the database usage of one of the most prominent AI companies.
·pixelstech.net·
Stuff I learned at Carta.
Today's my last day at Carta, where I got the chance to serve as CTO for the past two years. I've learned so much working there, and I wanted to end my chapter by collecting my thoughts on what I learned. (I am heading somewhere, and will share news in a week or two after firming up the communication plan with my new team there.) The most important things I learned at Carta were…
·lethain.com·
Tech hiring: is this an inflection point?
We might be seeing the end of remote interviews as we know them, and a return of in-person interviews, trial weeks and longer trial periods. Could hiring be returning to pre-pandemic norms?
·newsletter.pragmaticengineer.com·
The Reality of Tech Interviews in 2025
Interview processes are changing in a tech market that's both cooling AND heating up at the same time. A deep dive with Hello Interview founders Evan King and Stefan Mai.
·newsletter.pragmaticengineer.com·
Exploring Generative AI
Notes from my Thoughtworks colleagues on AI-assisted software delivery
·martinfowler.com·