Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

#Large Language Models #Evaluation #Peer Review #Paper #PDF #Cohere

arxiv.org · Apr 30, 2024

Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation

#Large Language Models #RLHF #Evaluation #Paper #PDF #Cohere

arxiv.org · Oct 27, 2023
