Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge · #Meta #Large Language Models #Reasoning #Evaluation #Planning #Paper #PDF · arxiv.org · Feb 1, 2025
Iterative Reasoning Preference Optimization · #Reasoning #Preferences #Paper #PDF #Meta #Large Language Models #Algorithms #Chain of Thought · arxiv.org · May 1, 2024
WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models · #Reasoning #Large Language Models #Meta #Paper #PDF · arxiv.org · Dec 1, 2023