Deliberative alignment: reasoning enables safer language models | OpenAI
OpenAI trained o1 and o3 to 'think' about its safety policy | TechCrunch
12 Days of OpenAI | OpenAI
Weak to strong generalization
Now we know what OpenAI’s superalignment team has been up to
Weak-to-strong generalization
Superalignment Fast Grants
What Sam Altman’s Firing Means for the Future of OpenAI