Capabilities of GPT-5 on Multimodal Medical Reasoning
Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning
The Medium is the Message: How Non-Clinical Information Shapes Clinical Decisions in LLMs | Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency
Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!
UniVG-R1: Reasoning Guided Universal Visual Grounding with...
Learning to Reason via Mixture-of-Thought for Logical Reasoning
Scalable Chain of Thoughts via Elastic Reasoning
Reward Reasoning Model
Think Only When You Need with Large Hybrid-Reasoning Models
When Thinking Fails: The Pitfalls of Reasoning for...
Competitive Programming with Large Reasoning Models
s1: Simple test-time scaling
On the Diagram of Thought
Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge
Combining Induction and Transduction for Abstract Reasoning
GSM-Symbolic: Understanding the Limitations of Mathematical...
Schrodinger's Memory: Large Language Models
One Thousand and One Pairs: A "novel" challenge for...
A Careful Examination of Large Language Model Performance on Grade School Arithmetic
Iterative Reasoning Preference Optimization
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap
The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs | OpenReview
WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models
Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks
Preventing Language Models From Hiding Their Reasoning
Large Language Models can Learn Rules
Large Language Models Cannot Self-Correct Reasoning Yet
The Jiminy Advisor: Moral Agreements among Stakeholders Based on Norms and Argumentation | Journal of Artificial Intelligence Research
Advances in apparent conceptual physics reasoning in GPT-4
Improving Factuality and Reasoning in Language Models through Multiagent Debate