MIT researchers use large language models to flag problems in complex systems
Smaller, Safer, More Transparent: Advancing Responsible AI with Gemma
GPT-4o Long Output | OpenAI
[Own work] On Measuring Faithfulness or Self-consistency of Natural Language Explanations
Large Enough
Introducing Llama 3.1: Our most capable models to date
The Vision of Autonomic Computing: Can LLMs Make It a Reality?
Wolfram LLM Benchmarking Project
Prover-Verifier Games improve legibility of language model outputs | OpenAI
Microsoft CTO Kevin Scott thinks LLM “scaling laws” will hold despite criticism
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
GAVEL: Generating Games Via Evolution and Language Models
AI Chatbots Seem as Ethical as a New York Times Advice Columnist
One Thousand and One Pairs: A "novel" challenge for...
OpenAI Builds AI to Critique AI
Scalable MatMul-free Language Modeling
Empathic AI can’t get under the skin - Nature Machine Intelligence
Open Source LibreChat Offers More Than Just Extra LLMs
Human vs. Machine: Behavioral Differences Between Expert Humans...
LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks
Advancing personal health and wellness insights with AI
NATURAL PLAN: Benchmarking LLMs on Natural Language Planning
How Game Theory Can Make AI More Reliable
Scaling and evaluating sparse autoencoders
To Believe or Not to Believe Your LLM
LLMs achieve adult human performance on higher-order theory of mind tasks
Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools
1-bit LLMs Could Solve AI’s Energy Demands
Aya | Cohere For AI
Aya 23 - 8B is the newest.
China’s latest answer to OpenAI is ‘Chat Xi PT’