AI Gets Better at Getting Better at Writing Code
Potemkin Understanding in Large Language Models
Susan Schneider & Mark Bailey, Superpsychism - PhilArchive
The Prompt Report
(In a recent video, they highlighted several techniques: few-shot prompting, decomposition, self-criticism, providing context, and ensembling prompts or models. Prompting remains vulnerable to prompt injection and misalignment; fine-tuning can be safer for specific, narrow tasks. Agentic capabilities are a new area.)
Gemini 2.5 report
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities.
Utility Engineering: Analyzing and Controlling Emergent Value...
Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using...
GPS as a Control Signal for Image Generation
Scaling Language-Free Visual Representation Learning
A novel approach to studying the role influence plays in team collective intelligence - Lisa R O’Bryan, Timothy Oxendahl, Simon Garnier, Santiago Segarra, Matthew Wettergreen, Ashutosh Sabharwal, Margaret E Beier, 2025
Multimodal Large Language Models: A Survey
Future of Work with AI Agents: Auditing Automation and...
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly...
A knockout blow for LLMs?
Self-Challenging Language Model Agents
The State of Multilingual LLM Safety Research: From Measuring the...
HardTests: Synthesizing High-Quality Test Cases for LLM Coding
Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents
Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!
Machine Culture
When AI Co-Scientists Fail: SPOT-a Benchmark for Automated...
ARC-AGI-2: A New Challenge for Frontier AI Reasoning Systems
MMaDA: Multimodal Large Diffusion Language Models
Vid2World: Crafting Video Diffusion Models to Interactive World Models
UniVG-R1: Reasoning Guided Universal Visual Grounding with...
Learning to Reason via Mixture-of-Thought for Logical Reasoning
Scalable Chain of Thoughts via Elastic Reasoning
Reward Reasoning Model
Think Only When You Need with Large Hybrid-Reasoning Models
When Thinking Fails: The Pitfalls of Reasoning for...