Influence and cyber operations: an update, October 2024 · #OpenAI #Report #Cybersecurity #Paper #PDF #Deception #AI · openai.com · Oct 9, 2024
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training · #Deception #Large Language Models #Paper #PDF · arxiv.org · Jan 13, 2024
Technical Report: Large Language Models can Strategically Deceive their Users when Put Under Pressure · #Deception #Large Language Models #Paper #PDF · arxiv.org · Nov 15, 2023
Role-Play with Large Language Models · #Large Language Models #Dialogue #Deception #Self-Awareness #Paper #PDF #DeepMind · arxiv.org · Nov 13, 2023