AI/ML

2363 bookmarks
NEW Kimi K2 Thinking - Best Open Model?
In this video, I look at Kimi K2 Thinking from Moonshot AI, the most recent fully open reasoning model, which scores higher than GPT-5 and Anthropic's models on multiple benchmarks.
Blog: https://moonshotai.github.io/Kimi-K2/thinking.html
Model Weights: https://huggingface.co/moonshotai/Kimi-K2-Thinking
For more tutorials on using LLMs and building agents, check out my Patreon: https://www.patreon.com/SamWitteveen
Twitter: https://x.com/Sam_Witteveen
🕵️ Interested in building LLM Agents? Fill out the form below.
Building LLM Agents Form: https://drp.li/dIMes
👨‍💻 Github: https://github.com/samwit/llm-tutorials
·youtube.com·
Transformers & Diffusion LLMs: What's the connection?
Diffusion-based LLMs are a new paradigm for text generation; they progressively refine gibberish into a coherent response. But what's their connection to Transformers?
·youtube.com·
GitHub - samrolken/nokode

A web server with no application logic. Just an LLM with three tools.

The Shower Thought

One day we won't need code. LLMs will output video at 120fps, sample inputs in realtime, and just... be our computers. No apps, no code, just intent and execution.

That's science fiction.

But I got curious: with a few hours this weekend and today's level of tech, how far can we get?

The Hypothesis

I expected this to fail spectacularly.

Everyone's focused on AI that writes code. You know the usual suspects: Claude Code, Cursor, Copilot, all that. But that felt like missing the bigger picture. So I built something to test a different question: what if you skip code generation entirely? A web server with zero application code. No routes, no controllers, no business logic. Just an HTTP server that asks an LLM "what should I do?" for every request.

The goal: prove how far away we really are from that future.

The Target

Contact manager. Basic CRUD: forms, database, list views, persistence.

Why? Because most software is just CRUD dressed up differently. If this works at all, it would be something.

·github.com·
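The architecture the nokode README describes, with every request forwarded to an LLM instead of routed to application code, can be sketched as follows. This is a minimal illustration only, not the repo's actual implementation; the three tool names, the prompt shape, and the `callLLM` stub are assumptions for the sketch.

```typescript
// Sketch of a "zero application code" web server: each HTTP request is
// serialized into a prompt and handed to an LLM, whose reply becomes
// the response body. The LLM client is stubbed so the sketch runs offline.

interface IncomingRequest {
  method: string;
  path: string;
  body: string;
}

// Build the prompt the server would send for each incoming request.
// The tool names here (read_db, write_db, render_html) are hypothetical.
function buildPrompt(req: IncomingRequest): string {
  return [
    "You are the entire web application. Tools: read_db, write_db, render_html.",
    `Request: ${req.method} ${req.path}`,
    `Body: ${req.body || "(empty)"}`,
    "Decide what to do and return the HTTP response body.",
  ].join("\n");
}

// Stubbed LLM call; a real version would hit a model API here.
async function callLLM(prompt: string): Promise<string> {
  return `<!-- LLM would answer here; prompt was ${prompt.length} chars -->`;
}

// The whole "server": no routes, no controllers, just prompt -> response.
async function handle(req: IncomingRequest): Promise<string> {
  return callLLM(buildPrompt(req));
}
```

For the CRUD target above, a `POST /contacts` and a `GET /contacts` would both flow through the same `handle` function, with the LLM deciding whether to write or read.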
On MiniMax M2 and LLMs with Interleaved Thinking Steps
In addition to Kimi K2 (which I recently wrote about here) and GLM-4.6 (which will become an option on Cerebras in a few days, when I’ll play around with it), one of the more interesting open-source LLM releases out of China lately is MiniMax M2. This MoE model (230B parameters, 10B activated at any given time)…
·macstories.net·
Putting ChatGPT on the Couch
When I played doctor with the chatbot, the simulated patient confessed problems that are real—and that should worry all of us.
·newyorker.com·
google/langextract: A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization. - google/langextract
·github.com·
Code Mode: the better way to use MCP
It turns out we've all been using MCP wrong. Most agents today use MCP by exposing the "tools" directly to the LLM. We tried something different: Convert the MCP tools into a TypeScript API, and then ask an LLM to write code that calls that API. The results are striking.
·blog.cloudflare.com·
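The idea behind Code Mode, turning MCP tool definitions into a typed TypeScript API that the model writes code against instead of calling tools one at a time, can be illustrated with a toy generator. The `ToolDef` shape and the emitted declaration format below are assumptions for illustration, not Cloudflare's actual implementation.

```typescript
// Toy generator: turn an MCP-style tool definition into a TypeScript
// function declaration. Instead of exposing raw tool schemas to the LLM,
// the agent shows it this API and asks for code that calls it.

interface ToolDef {
  name: string;
  description: string;
  params: Record<string, string>; // param name -> TypeScript type
}

function toTypeScriptAPI(tool: ToolDef): string {
  const args = Object.entries(tool.params)
    .map(([name, type]) => `${name}: ${type}`)
    .join(", ");
  return [
    `/** ${tool.description} */`,
    `declare function ${tool.name}(${args}): Promise<unknown>;`,
  ].join("\n");
}
```

Fed a hypothetical tool like `{ name: "searchDocs", description: "Search the docs", params: { query: "string", limit: "number" } }`, the generator emits a documented `declare function searchDocs(query: string, limit: number)` signature; the LLM then writes ordinary TypeScript that chains such calls, rather than emitting one tool invocation per turn.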
Richard Sutton – Father of RL thinks LLMs are a dead end
Richard Sutton is the father of reinforcement learning, winner of the 2024 Turing Award, and author of The Bitter Lesson. And he thinks LLMs are a dead end.

After interviewing him, my steel man of Richard’s position is this: LLMs aren’t capable of learning on-the-job, so no matter how much we scale, we’ll need *some* new architecture to enable continual learning. And once we have it, we won’t need a special training phase — the agent will just learn on-the-fly, like all humans, and indeed, like all animals. This new paradigm will render our current approach with LLMs obsolete.

In our interview, I did my best to represent the view that LLMs might function as the foundation on which experiential learning can happen… Some sparks flew.

A big thanks to the Alberta Machine Intelligence Institute for inviting me up to Edmonton and for letting me use their studio and equipment. Enjoy!

EPISODE LINKS
* Transcript: https://www.dwarkesh.com/p/richard-sutton
* Apple Podcasts: https://podcasts.apple.com/us/podcast/richard-sutton-father-of-rl-thinks-llms-are-a-dead-end/id1516093381?i=1000728584744
* Spotify: https://open.spotify.com/episode/3zAXRCFrHPShU4MuuIx4V5?si=c9f4bf24fb4c43e3

SPONSORS
* Labelbox makes it possible to train AI agents in hyperrealistic RL environments. With an experienced team of applied researchers and a massive network of subject-matter experts, Labelbox ensures your training reflects important, real-world nuance. Turn your demo projects into working systems at https://labelbox.com/dwarkesh
* Gemini Deep Research is designed for thorough exploration of hard topics. For this episode, it helped me trace reinforcement learning from early policy gradients up to current-day methods, combining clear explanations with curated examples. Try it out yourself at https://gemini.google.com/
* Hudson River Trading doesn’t silo their teams. Instead, HRT researchers openly trade ideas and share strategy code in a mono-repo. This means you’re able to learn at incredible speed and your contributions have impact across the entire firm. Find open roles at https://hudsonrivertrading.com/dwarkesh

To sponsor a future episode, visit https://dwarkesh.com/advertise

TIMESTAMPS
00:00:00 – Are LLMs a dead end?
00:13:51 – Do humans do imitation learning?
00:23:57 – The Era of Experience
00:34:25 – Current architectures generalize poorly out of distribution
00:42:17 – Surprises in the AI field
00:47:28 – Will The Bitter Lesson still apply after AGI?
00:54:35 – Succession to AI
·youtube.com·
Why AI isn't replacing radiologists
Radiology combines digital images, clear benchmarks, and repeatable tasks. But demand for human radiologists is at an all-time high.
·worksinprogress.news·
harlan-zw/mdream: ☁️ Convert any site to clean markdown & llms.txt. Boost your site's AI discoverability or generate LLM context for a project you're working with.
☁️ Convert any site to clean markdown & llms.txt. Boost your site's AI discoverability or generate LLM context for a project you're working with. - harlan-zw/mdream
·github.com·