Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge · arxiv.org · Feb 1, 2025 · #Meta #Large Language Models #Reasoning #Evaluation #Planning #Paper #PDF
Agent-as-a-Judge: Evaluate Agents with Agents · arxiv.org · Dec 14, 2024 · #Agents #Evaluation #Meta #Paper #PDF