Influence and cyber operations: an update (October 2024)
AI deception: A survey of examples, risks, and potential solutions
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Technical Report: Large Language Models can Strategically Deceive their Users when Put Under Pressure
Role-Play with Large Language Models