Search Test Information Space

Found 31 bookmarks

Capabilities of GPT-5 on Multimodal Medical Reasoning

#Multimodal #Medical #Reasoning #GPT-5 #Paper #PDF

·arxiv.org·Aug 17, 2025

Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning

#Reasoning #Paper #PDF

·arxiv.org·Jul 24, 2025

The Medium is the Message: How Non-Clinical Information Shapes Clinical Decisions in LLMs | Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency

#Large Language Models #Medical #User Interfaces #Paper #PDF #Reasoning #Conversational AI #Chatbot

·dl.acm.org·Jul 8, 2025

Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!

#Criticism #Anthropomorphism #Large Language Models #Paper #PDF #Reasoning #Chain of Thought

·arxiv.org·May 29, 2025

UniVG-R1: Reasoning Guided Universal Visual Grounding with...

#Reasoning #Reinforcement Learning #Large Language Models #Multimodal #Paper #PDF

·arxiv.org·May 22, 2025

Learning to Reason via Mixture-of-Thought for Logical Reasoning

#Reasoning #Large Language Models #Paper #PDF

·arxiv.org·May 22, 2025

Scalable Chain of Thoughts via Elastic Reasoning

#Reasoning #Economics #Chain of Thought #Large Language Models #Paper #PDF

·arxiv.org·May 21, 2025

Reward Reasoning Model

#Reward #Reasoning #Large Language Models #Paper #PDF

·arxiv.org·May 21, 2025

Think Only When You Need with Large Hybrid-Reasoning Models

#Reasoning #Performance #Large Language Models #Paper #PDF

·arxiv.org·May 21, 2025

When Thinking Fails: The Pitfalls of Reasoning for...

#Chain of Thought #Criticism #Reasoning #Large Language Models #Paper #PDF

·arxiv.org·May 20, 2025

Competitive Programming with Large Reasoning Models

#Reasoning #OpenAI #Large Language Models #Paper #PDF #Coding

·arxiv.org·Feb 12, 2025

s1: Simple test-time scaling

#Large Language Models #Reasoning #Paper #PDF

·arxiv.org·Feb 6, 2025

On the Diagram of Thought

#Reasoning #Large Language Models #Diagrams #Chain of Thought #Paper #PDF

·arxiv.org·Feb 1, 2025

Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge

#Meta #Large Language Models #Reasoning #Evaluation #Planning #Paper #PDF

·arxiv.org·Feb 1, 2025

Combining Induction and Transduction for Abstract Reasoning

#Reasoning #Abstract #Paper #PDF

·cs.cornell.edu·Nov 3, 2024

GSM-Symbolic: Understanding the Limitations of Mathematical...

#Reasoning #Large Language Models #Paper #PDF

·arxiv.org·Oct 17, 2024

Schrodinger's Memory: Large Language Models

#Reasoning #Large Language Models #Paper #PDF #Memory

·arxiv.org·Sep 19, 2024

One Thousand and One Pairs: A "novel" challenge for...

#Large Language Models #Reasoning #Paper #PDF

·arxiv.org·Jun 29, 2024

A Careful Examination of Large Language Model Performance on Grade School Arithmetic

#Large Language Models #Mathematics #Reasoning #Benchmark #Paper #PDF

·arxiv.org·May 2, 2024

Iterative Reasoning Preference Optimization

#Reasoning #Preferences #Paper #PDF #Meta #Large Language Models #Algorithms #Chain of Thought

·arxiv.org·May 1, 2024

Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap

#Reasoning #Large Language Models #Paper #PDF

·arxiv.org·Mar 2, 2024

The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs | OpenReview

#Large Language Models #Reasoning #Paper #PDF #Cohere

·openreview.net·Dec 13, 2023

WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models

#Reasoning #Large Language Models #Meta #Paper #PDF

·arxiv.org·Dec 1, 2023

Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks

#GPT-4 #Abstract #Reasoning #PDF #Paper

·arxiv.org·Nov 17, 2023

Preventing Language Models From Hiding Their Reasoning

#Large Language Models #Chain of Thought #Reasoning #Steganography #Paper #PDF

·arxiv.org·Nov 14, 2023

Large Language Models can Learn Rules

#Large Language Models #Reasoning #Paper #PDF

·arxiv.org·Oct 14, 2023

Large Language Models Cannot Self-Correct Reasoning Yet

#Large Language Models #Reasoning #Paper #PDF

·arxiv.org·Oct 9, 2023

The Jiminy Advisor: Moral Agreements among Stakeholders Based on Norms and Argumentation | Journal of Artificial Intelligence Research

#Preferences #Reasoning #Stakeholders #Paper #PDF

·jair.org·Jul 12, 2023

Advances in apparent conceptual physics reasoning in GPT-4

#Physics #Reasoning #ChatGPT #GPT-4 #Paper #PDF

·arxiv.org·Jun 10, 2023

Improving Factuality and Reasoning in Language Models through Multiagent Debate

#Reasoning #Large Language Models #Machine Learning #Computer Vision #Paper #PDF

·arxiv.org·May 30, 2023
