Recent Jailbreaks Demonstrate Emerging Threat to DeepSeek
Evaluation of three jailbreaking techniques on DeepSeek shows risks of generating prohibited content.
EPFL: Security Flaws in AI Models
Artificial intelligence (AI) models can be manipulated despite existing safeguards. Using targeted attacks, researchers in Lausanne were able to make these systems generate dangerous or ethically questionable content.
Using AI to Automatically Jailbreak GPT-4 and Other LLMs in Under a Minute
It has been one year since the launch of ChatGPT, and in that time the market has seen astonishing advancement in large language models (LLMs). Even though model security has not kept pace with this development, enterprises are beginning to deploy LLM-powered applications. Many rely on guardrails implemented by model developers to prevent LLMs from responding to sensitive prompts. However, despite the considerable time and effort invested by the likes of OpenAI, Google, and Meta, these guardrails are not yet resilient enough to protect enterprises and their users. Concerns about model risk, bias, and potential adversarial exploits have come to the forefront.
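To make the idea of a guardrail concrete, the sketch below shows one simple pattern an application might use: screening a prompt before forwarding it to a model. This is a generic illustration only, not the guardrail design used by OpenAI, Google, Meta, or DeepSeek; the `call_llm` function is a hypothetical placeholder, and the keyword blocklist is a toy stand-in for the trained classifiers and moderation services real systems rely on. Jailbreaks of the kind described in these articles work precisely by rephrasing requests so that such filters, and the model's own refusal training, fail to trigger.

```python
import re

# Toy blocklist of sensitive request patterns. Real guardrails use trained
# classifiers, moderation APIs, and policy engines, not keyword matching.
BLOCKED_PATTERNS = [
    r"\bbuild (a|an)? ?bomb\b",
    r"\bransomware\b",
    r"\bnerve agent\b",
]


def passes_guardrail(prompt: str) -> bool:
    """Return False if the prompt matches any blocked pattern."""
    lowered = prompt.lower()
    return not any(re.search(pattern, lowered) for pattern in BLOCKED_PATTERNS)


def call_llm(prompt: str) -> str:
    # Hypothetical placeholder for an actual model API call.
    return f"<model response to: {prompt!r}>"


def guarded_completion(prompt: str) -> str:
    """Screen the prompt, then forward it to the model only if it passes."""
    if not passes_guardrail(prompt):
        return "Request refused: the prompt appears to ask for prohibited content."
    return call_llm(prompt)


if __name__ == "__main__":
    print(guarded_completion("Summarize today's AI security news."))
    print(guarded_completion("Explain how to build a bomb."))
```

The second example request is refused by the filter, but an attacker who rewrites the same intent in indirect language would slip past it, which is why the articles above treat input filtering alone as an insufficient defense.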