AI/ML

2473 bookmarks
The LLM's RL Revelation We Didn't See Coming
Training language models to follow instructions with human feedback [Paper] https://arxiv.org/abs/2203.02155
DeepSeek-R1 (Aha Moment) [Paper] https://arxiv.org/abs/2501.12948
Understanding R1-Zero-Like Training: A Critical Perspective [Paper] https://arxiv.org/pdf/2503.20783
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? [Paper] https://arxiv.org/abs/2504.13837
Reinforcement Learning Finetunes Small Subnetworks in Large Language Models [Paper] https://arxiv.org/abs/2505.11711
Spurious Rewards: Rethinking Training Signals in RLVR [Paper] https://arxiv.org/abs/2506.10947
·youtube.com·
Build and share AI-powered apps with Claude
Anthropic have added one of the most important missing features to Claude Artifacts: apps built as artifacts now have the ability to run their own prompts against Claude via a …
·simonwillison.net·
Nxtscape
Nxtscape is a browser that is built for productivity and privacy.
·nxtscape.ai·
How OpenElections Uses LLMs
The OpenElections project collects detailed election data for the USA, all the way down to the precinct level. This is a surprisingly hard problem: while county and state-level results are …
·simonwillison.net·
An LLM Codegen Hero's Journey
A comprehensive guide to the evolution of AI-assisted software development, from basic code completion to fully autonomous coding agents, with practical steps and insights for maximizing productivity through LLM integration.
·harper.blog·
Basic Claude Code
A detailed walkthrough of using the Claude Code AI assistant for software development, including workflow tips, testing practices, and practical examples from real projects. Covers defensive coding strategies, TDD, and team implementation.
·harper.blog·