Constitutional Classifiers: Defending against Universal Jailbreaks...
Dario Amodei — On DeepSeek and Export Controls
Introducing Citations on the Anthropic API \ Anthropic
The code whisperer: How Anthropic’s Claude is changing the game for software developers
Greenblatt, R. et al. (2024). Alignment faking in large language models.
How Claude uses AI to identify new threats
Introducing the Model Context Protocol \ Anthropic
Powering the next generation of AI development with AWS \ Anthropic
Well... Anthropic which started out with an AI ethics narrative and intention is now allying itself tightly with the US military & intel establishment,
Introducing the analysis tool in Claude.ai \ Anthropic
Introducing computer use, a new Claude 3.5 Sonnet, and Claude 3.5 Haiku \ Anthropic
Announcing our updated Responsible Scaling Policy \ Anthropic
Dario Amodei — Machines of Loving Grace
Anthropic Makes Play for Business Customers
Claude Android app \ Anthropic
Evaluate prompts in the developer console \ Anthropic
Collaborate with Claude on Projects \ Anthropic
Introducing Claude 3.5 Sonnet \ Anthropic
Anthropic claims its latest model is best-in-class | TechCrunch
Testing and mitigating elections-related risks \ Anthropic
Anthropic’s AI now lets you create bots to work for you
Simple probes can catch sleeper agents \ Anthropic
Anil, C., Durmus, E., Sharma, M., Benton, J., Kundu, S., Batson, J., ... & Duvenaud, D. (2024). Many-shot Jailbreaking.
Long contexts represent a new front in the struggle to control LLMs. We explored a family of attacks that are newly feasible due to longer context lengths, as well as candidate mitigations. We found that the effectiveness of attacks, and of in-context learning more generally, could be characterized by simple power laws. This provides a richer source of feedback for mitigating long-context attacks than the standard approach of measuring frequency of success.
Amazon spends $2.75 billion on AI startup Anthropic in its largest venture investment yet
Claude 3 Haiku: our fastest model yet \ Anthropic
Prompt library
No, Anthropic's Claude 3 is NOT sentient
Model card Claude 3
Introducing the next generation of Claude \ Anthropic
Crypto exchange FTX to sell shares in AI startup Anthropic