Influence and cyber operations: an update, October 2024
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Technical Report: Large Language Models can Strategically Deceive their Users when Put Under Pressure
Role-Play with Large Language Models