AI Giants Go Nuclear, A Tech Bromance Turns Turbulent, Mistral Sharpens the Edge, and more...
Suppressing Pink Elephants with Direct Principle Feedback
Direct Preference Optimization: Your Language Model is Secretly a Reward Model | DPO paper explained
The third New England RLHF Hackers Hackathon
Scientists Are Researching a Device That Can Induce Lucid Dreams on Demand
Yann LeCun on X
Human Feedback is not Gold Standard
OpenELM/OpenELM_Paper.pdf at paper · CarperAI/OpenELM · GitHub
SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation
Quicker feedback on new content
Addressing criticism, OpenAI will no longer use customer data to train its models by default
The Flan Collection: Advancing open source methods for instruction tuning
Charlie George on Twitter