Elo Uncovered: Robustness and Best Practices in Language Model Evaluation
#Comparison #Large Language Models #Cohere #Paper #PDF · arxiv.org · Dec 1, 2023