Toward understanding and preventing misalignment generalization | OpenAI · #Persona #Alignment #OpenAI #Large Language Models · openai.com · Jun 18, 2025
Deliberative alignment: reasoning enables safer language models | OpenAI · #Alignment #OpenAI · openai.com · Jan 8, 2025
OpenAI trained o1 and o3 to 'think' about its safety policy | TechCrunch · #Alignment #OpenAI · techcrunch.com · Dec 23, 2024
12 Days of OpenAI | OpenAI · #Alignment #Reasoning #Large Language Models #OpenAI · openai.com · Dec 20, 2024
Weak to strong generalization · #OpenAI #Alignment #Paper #PDF · cdn.openai.com · Dec 15, 2023
Now we know what OpenAI’s superalignment team has been up to · #OpenAI #Alignment · technologyreview.com · Dec 14, 2023
Weak-to-strong generalization · #OpenAI #Alignment #Proxy · openai.com · Dec 14, 2023
Superalignment Fast Grants · #OpenAI #Alignment #Funding · openai.com · Dec 14, 2023
What Sam Altman’s Firing Means for the Future of OpenAI · #OpenAI #Alignment · wired.com · Nov 19, 2023