Researchers Replicate OpenAI's Hot New AI Tool in 24 Hours (Incidentally, a bar or vinculum over a letter in Roman numerals is a multiplier of 1000.) · #Reasoning #Training #Testing #Large Language Models #Blog #Research #Questions and Answers · futurism.com · Feb 10, 2025
The Surprising Effectiveness of Test-Time Training for Abstract Reasoning · #Large Language Models #Training #Testing #Paper #PDF · arxiv.org · Dec 9, 2024
Red Teaming o1 Part 2/2 – Detecting Deception with Marius Hobbhahn of Apollo Research · #OpenAI #Testing #Large Language Models · youtube.com · Sep 19, 2024
Red Teaming o1 Part 1/2 – Automated Jailbreaking w/ Haize Labs' Leonard Tang, Aidan Ewart & Brian Huang · #OpenAI #Testing #Large Language Models · youtube.com · Sep 19, 2024
Scale AI to set the Pentagon’s path for testing and evaluating large language models · #Testing #Large Language Models #Defense #Datasets · defensescoop.com · Feb 21, 2024
Automated Testing for LLMOps · #Large Language Models #Testing #Automation #Courses #Beta · deeplearning.ai · Jan 24, 2024
MART: Improving LLM Safety with Multi-round Automatic Red-Teaming · #Testing #Large Language Models #Paper #PDF · arxiv.org · Nov 17, 2023