Anthropic says some Claude models can now end ‘harmful or abusive’ conversations | TechCrunch
Anthropic says new capabilities allow its latest AI models to protect themselves by ending abusive conversations.
Anthropic has announced new capabilities that will allow some of its newest, largest models to end conversations in what the company describes as “rare, extreme cases of persistently harmful or abusive user interactions.” Strikingly, Anthropic says it’s doing this not to protect the human user, but rather the AI model.