yet-another-applied-llm-benchmark
Nicholas Carlini introduced this personal LLM benchmark suite [back in February](https://nicholas.carlini.com/writing/2024/my-benchmark-for-large-language-models.html) as a collection of over 100 automated tests he runs against new LLM models to evaluate their performance against …