Search Test Information Space

Found 2 bookmarks


Constitutional Classifiers: Defending against Universal Jailbreaks...

#Anthropic #Classification #Safety #Large Language Models #Paper #PDF

arxiv.org · Feb 3, 2025


Core Views on AI Safety: When, Why, What, and How

#Anthropic #AI #Safety

anthropic.com · Mar 9, 2023
