Found 1995 bookmarks
How DeepSeek Rewrote the Transformer [MLA]
Thanks to KiwiCo for sponsoring today’s video! Go to https://www.kiwico.com/welchlabs and use code WELCHLABS for 50% off your first monthly club crate or for 20% off your first Panda Crate! MLA/DeepSeek Poster at 17:12 (Free shipping for a limited time with code DEEPSEEK): https://www.welchlabs.com/resources/mladeepseek-attention-poster-13x19 Limited edition MLA Poster and Signed Book: https://www.welchlabs.com/resources/deepseek-bundle-mla-poster-and-signed-book-limited-run Imaginary Numbers book is back in stock! https://www.welchlabs.com/resources/imaginary-numbers-book Special Thanks to Patrons https://www.patreon.com/c/welchlabs Juan Benet, Ross Hanson, Yan Babitski, AJ Englehardt, Alvin Khaled, Eduardo Barraza, Hitoshi Yamauchi, Jaewon Jung, Mrgoodlight, Shinichi Hayashi, Sid Sarasvati, Dominic Beaumont, Shannon Prater, Ubiquity Ventures, Matias Forti, Brian Henry, Tim Palade, Petar Vecutin, Nicolas baumann, Jason Singh, Robert Riley, vornska, Barry Silverman, Jake Ehrlich References DeepSeek-V2 paper: https://arxiv.org/pdf/2405.04434 DeepSeek-R1 paper: https://arxiv.org/abs/2501.12948 Great Article by Ege Erdil: https://epoch.ai/gradient-updates/how-has-deepseek-improved-the-transformer-architecture GPT-2 Visualization: https://github.com/TransformerLensOrg/TransformerLens Manim Animations: https://github.com/stephencwelch/manim_videos
Technical Notes
1. The DeepSeek-V2 paper claims a KV cache size reduction of 93.3%. They don’t publish their exact methodology, but as far as I can tell it’s something like this: start with the DeepSeek-V2 hyperparameters here: https://huggingface.co/deepseek-ai/DeepSeek-V2/blob/main/configuration_deepseek.py (num_hidden_layers=30, num_attention_heads=32, v_head_dim=128). If DeepSeek-V2 were implemented with traditional MHA, the KV cache size would be 2*32*128*30*2 = 491,520 B/token. With MLA and a per-layer cache of 576 entries per token, the total cache size is 576*30*2 = 34,560 B/token. The percent reduction in KV cache size is then (491,520 - 34,560)/491,520 ≈ 93.0%. The numbers I present in this video follow the same approach but use the DeepSeek-V3/R1 architecture: https://huggingface.co/deepseek-ai/DeepSeek-V3/blob/main/config.json (num_hidden_layers=61, num_attention_heads=128, v_head_dim=128). A traditional MHA cache would be 2*128*128*61*2 = 3,997,696 B/token; MLA reduces this to 576*61*2 = 70,272 B/token. For the DeepSeek-V3/R1 architecture, MLA therefore shrinks the KV cache by a factor of 3,997,696/70,272 ≈ 56.9X.
2. I claim a couple of times that MLA allows DeepSeek to generate tokens more than 6x faster than a vanilla transformer. The DeepSeek-V2 paper reports a slightly-less-than-6x throughput improvement with MLA, but since the V3/R1 architecture is heavier, we expect a larger lift; that is why I claim “more than 6x faster than a vanilla transformer.” In reality it’s probably significantly more than 6x for the V3/R1 architecture.
3. In all attention patterns and walkthroughs, we’re ignoring the beginning-of-sentence token. “The American flag is red, white, and” actually maps to 10 tokens if we include this starting token, and many attention patterns assign high values to it.
4. We’re ignoring bias terms in the matrix equations.
5. We’re ignoring positional embeddings. These are fascinating; see the DeepSeek papers and RoPE.
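The cache arithmetic in note 1 can be sketched in a few lines. This is a minimal sketch, not DeepSeek's published code: the helper names are mine, and it assumes 2 bytes per cached value (bf16/fp16) and a 576-entry MLA latent per layer, as in the notes above.

```python
BYTES_PER_VALUE = 2  # bf16/fp16

def mha_cache_bytes_per_token(layers, heads, head_dim):
    # Traditional MHA caches both K and V for every head in every layer.
    return 2 * heads * head_dim * layers * BYTES_PER_VALUE

def mla_cache_bytes_per_token(layers, latent_dim=576):
    # MLA caches one compressed latent vector per layer instead.
    return latent_dim * layers * BYTES_PER_VALUE

# DeepSeek-V2: 30 layers, 32 heads, head_dim 128
v2_mha = mha_cache_bytes_per_token(30, 32, 128)   # 491,520 B/token
v2_mla = mla_cache_bytes_per_token(30)            # 34,560 B/token
print(f"V2 reduction: {(v2_mha - v2_mla) / v2_mha:.1%}")

# DeepSeek-V3/R1: 61 layers, 128 heads, head_dim 128
v3_mha = mha_cache_bytes_per_token(61, 128, 128)  # 3,997,696 B/token
v3_mla = mla_cache_bytes_per_token(61)            # 70,272 B/token
print(f"V3/R1 compression factor: {v3_mha / v3_mla:.1f}x")
```

Running this reproduces the numbers above: roughly a 93.0% reduction for V2 and about a 56.9x smaller cache for V3/R1.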
·youtube.com·
SmolDocling - The SmolOCR Solution?
In this video I look at SmolDocling and how it compares to the other OCR solutions that are out there, both open and proprietary. Blog: https://huggingface.c...
·youtube.com·
How to Build an In-N-Out Agent with OpenAI Agents SDK
In this video, I take a deeper dive look at the OpenAI Agents SDK and how it can be used to build a fast food agent. Colab: https://dripl.ink/MZw2R For more tutorials on using LLMs and building agents, check out my Patreon Patreon: https://www.patreon.com/SamWitteveen Twitter: https://x.com/Sam_Witteveen 🕵️ Interested in building LLM Agents? Fill out the form below Building LLM Agents Form: https://drp.li/dIMes 👨‍💻Github: https://github.com/samwit/llm-tutorials ⏱️Time Stamps: 00:00 Intro 00:11 Creating an In-N-Out Agent (Colab Demo) 00:40 In-N-Out Burger Agent 04:35 Streaming runs 05:40 Adding Tools 08:20 Websearch Tool 09:45 Agents as Tools 12:21 Giving it a Chat Memory
·youtube.com·
Gemma 3: What You Need To Know - Gradient Flow
Gemma 3 represents Google’s approach to accessible AI, bridging the gap between cutting-edge research and practical application. While the Gemini family represents Google’s flagship, closed, and most powerful models, Gemma offers a lightweight, “open” counterpart designed for wider use and customization. Specifically, Gemma 3’s model weights are openly released, allowing developers to download, deploy, and…
·gradientflow.com·
Gemma 3 - The NEW Gemma Family Members Have Arrived!!!
In this video, I look at the release of the new Gemma 3 models, which come in four different flavors: a 1B, a 4B, a 12B, and the new Big 27B parameter model. Demo: https://huggingface.co/spaces/huggingface-projects/gemma-3-12b-it Blog: https://blog.google/technology/developers/gemma-3/?linkId=sam_witteveen Model Weights: https://huggingface.co/collections/google/gemma-3-release-67c6c6f89c4f76621268bb6d For more tutorials on using LLMs and building agents, check out my Patreon Patreon: https://www.patreon.com/SamWitteveen Twitter: https://x.com/Sam_Witteveen 🕵️ Interested in building LLM Agents? Fill out the form below Building LLM Agents Form: https://drp.li/dIMes 👨‍💻Github: https://github.com/samwit/llm-tutorials ⏱️Time Stamps:
·youtube.com·
DeepSeek-R1: Model Architecture
This article provides an in-depth exploration of the DeepSeek-R1 model architecture. Let’s trace DeepSeek-R1 model from input to the output…
·shaktiwadekar.medium.com·
Mistral OCR - Multimodal & Multilingual OCR
In this video, I look at the latest release from Mistral AI, which is their Mistral OCR model. I look at how it works and how it compares to other models, as well as how you can get started using it with code. Colab: https://dripl.ink/Sr4Uk Blog: https://mistral.ai/news/mistral-ocr For more tutorials on using LLMs and building agents, check out my Patreon Patreon: https://www.patreon.com/SamWitteveen Twitter: https://x.com/Sam_Witteveen 🕵️ Interested in building LLM Agents? Fill out the form below Building LLM Agents Form: https://drp.li/dIMes 👨‍💻Github: https://github.com/samwit/llm-tutorials ⏱️Time Stamps: 00:00 Intro 00:17 Other models 00:35 Mistral OCR Blog 05:45 Mistral OCR Demo 13:47 Mistral OCR Batch inference
·youtube.com·
Can’t afford “Deep Research”? Me either. We don’t have to thanks to Ai2
I'm sure OpenAI's implementation of "deep research" is great, but I can't afford that. Ai2’s ScholarQA tool is FREE and open source!! Allen AI’s Scholar QA: https://scholarqa.allen.ai/ Please Like and Subscribe to support the channel! @LearnMetaAnalysis Access state of the art LLMs all in one place with ChatLLM – My 3 month review of ChatLLM: https://youtu.be/_Z3nLKvTbGc Tutorials and how-to guides: Connect a LLM to your Zotero (or any other local folder): https://youtu.be/b2BSZfOtD_w Conventional meta-analysis: https://www.youtube.com/playlist?list=PLXa5cTEormkEbYpBIgikgE0y9QR7QIgzs Three-level meta-analysis: https://www.youtube.com/playlist?list=PLXa5cTEormkHwRmu_TJXa7fSb6-WBXXoJ Three-level meta-analysis with correlated and hierarchical effects and robust variance estimation: https://www.youtube.com/playlist?list=PLXa5cTEormkEGenfcnp9X5dQUhmm7f9Jp Want free point and click (no coding required) meta-analysis software? Check out Simple Meta-Analysis: https://learnmeta-analysis.com/pages/simple-meta-analysis-software Tired of manually extracting data for systematic review and meta-analysis? Check out AI-Assisted Data Extraction, a free package for R! https://youtu.be/HuWXbe7hgFc Free ebook on meta-analysis in R (no download required): https://noah-schroeder.github.io/reviewbook/ Visit our website at https://learnmeta-analysis.com/ 0:00 OpenAI’s Deep Research 0:36 ScholarQA 1:26 First Test 11:49 Second Test 21:15 Debrief
·youtube.com·
Hands on with Deep Research
Deep Research is the title of a new mode in several GenAI apps, including Google’s Gemini, OpenAI’s ChatGPT, and most recently, Perplexity. In this article, I will be focusing on the currently most hyped of these: OpenAI’s Deep Research. Although they weren’t first to release a product with this title (that was Google), they have […]
·leonfurze.com·
DeepSeek R1 + Sonnet
I’m a big fan of Claude Sonnet. I’m ashamed to admit it’s mostly vibes-based. It’s friendlier and writes code in a way that I like. The…
·medium.com·
Wolfram LLM Benchmarking Project
Results from Wolfram's ongoing tracking of LLM performance. The benchmark is based on a Wolfram Language code generation task.
·wolfram.com·
olmOCR - The Open OCR System
In this video, I look at olmOCR, the open OCR system from Allen AI. Colab: https://dripl.ink/HpaK4 Blog: https://olmocr.allenai.org/blog macOS ver: https://jonathansoma.com/words/olmocr-on-macos-with-lm-studio.html For more tutorials on using LLMs and building agents, check out my Patreon Patreon: https://www.patreon.com/SamWitteveen Twitter: https://x.com/Sam_Witteveen 🕵️ Interested in building LLM Agents? Fill out the form below Building LLM Agents Form: https://drp.li/dIMes 👨‍💻Github: https://github.com/samwit/llm-tutorials ⏱️Time Stamps: 00:00 Intro 00:31 Allen AI Blog 01:20 olmOCR Blog 02:08 olmOCR Hugging Face 04:52 olmOCR GitHub 05:41 Demo 05:59 Running olmOCR on macOS with LM Studio
·youtube.com·