Weak-to-strong generalization · #OpenAI #Alignment #Proxy · openai.com · Dec 14, 2023
Researchers From Stanford And DeepMind Come Up With The Idea of Using Large Language Models (LLMs) as a Proxy Reward Function · #Preferences #Proxy #Large Language Models · marktechpost.com · Mar 9, 2023