Search cyberveille.decio.ch

Found 13 bookmarks

Custom sorting

Recent Jailbreaks Demonstrate Emerging Threat to DeepSeek

Evaluation of three jailbreaking techniques on DeepSeek shows risks of generating prohibited content. Evaluation of three jailbreaking techniques on DeepSeek shows risks of generating prohibited content.

#paloaltonetworks #EN #2025 #LLM #Jailbreak #DeepSeek

·unit42.paloaltonetworks.com·Feb 3, 2025

Recent Jailbreaks Demonstrate Emerging Threat to DeepSeek

Many-shot jailbreaking \ Anthropic

Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.

#anthropic #EN #2024 #AI #LLM #Jailbreak #Many-shot

·anthropic.com·Jan 8, 2025

Many-shot jailbreaking \ Anthropic

Bad Likert Judge: A Novel Multi-Turn Technique to Jailbreak LLMs by Misusing Their Evaluation Capability

The jailbreak technique "Bad Likert Judge" manipulates LLMs to generate harmful content using Likert scales, exposing safety gaps in LLM guardrails. The jailbreak technique "Bad Likert Judge" manipulates LLMs to generate harmful content using Likert scales, exposing safety gaps in LLM guardrails.

#unit42 #EN #2024 #LLM #Jailbreak #Likert

·unit42.paloaltonetworks.com·Jan 8, 2025

Bad Likert Judge: A Novel Multi-Turn Technique to Jailbreak LLMs by Misusing Their Evaluation Capability

EPFL: des failles de sécurité dans les modèles d'IA

Les modèles d'intelligence artificielle (IA) peuvent être manipulés malgré les mesures de protection existantes. Avec des attaques ciblées, des scientifiques lausannois ont pu amener ces systèmes à générer des contenus dangereux ou éthiquement douteux.

#swissinfo #FR #2024 #EPFL #IA #chatgpt #Jailbreak #failles #LLM #vulnerabilités #Manipulation

·swissinfo.ch·Dec 23, 2024

EPFL: des failles de sécurité dans les modèles d'IA

Exclusive: Chinese researchers develop AI model for military use on back of Meta's Llama

Papers show China reworked Llama model for military tool China's top PLA-linked Academy of Military Science involved Meta says PLA 'unauthorised' to use Llama model * Pentagon says it is monitoring competitors' AI capabilities

#reuters #EN #China #Llama #model #military #tool #Meta #AI #LLM #Pentagon

·reuters.com·Nov 1, 2024

Exclusive: Chinese researchers develop AI model for military use on back of Meta's Llama

Data Exfiltration from Slack AI via indirect prompt injection

This vulnerability can allow attackers to steal anything a user puts in a private Slack channel by manipulating the language model used for content generation. This was responsibly disclosed to Slack (more details in Responsible Disclosure section at the end).

#promptarmor #EN #2024 #Slack #prompt-injection #LLM #vulnerability #steal #indirect-prompt #injection

·promptarmor.substack.com·Aug 20, 2024

Data Exfiltration from Slack AI via indirect prompt injection

Project Naptime: Evaluating Offensive Security Capabilities of Large Language Models

At Project Zero, we constantly seek to expand the scope and effectiveness of our vulnerability research. Though much of our work still relies on traditional methods like manual source code audits and reverse engineering, we're always looking for new approaches. As the code comprehension and general reasoning ability of Large Language Models (LLMs) has improved, we have been exploring how these models can reproduce the systematic approach of a human security researcher when identifying and demonstrating security vulnerabilities. We hope that in the future, this can close some of the blind spots of current automated vulnerability discovery approaches, and enable automated detection of "unfuzzable" vulnerabilities.

#googleprojectzero #EN #2024 #Offensive #Project-Naptime #LLM

·googleprojectzero.blogspot.com·Jun 21, 2024

Project Naptime: Evaluating Offensive Security Capabilities of Large Language Models

Security Brief: TA547 Targets German Organizations with Rhadamanthys Stealer

What happened Proofpoint identified TA547 targeting German organizations with an email campaign delivering Rhadamanthys malware. This is the first time researchers observed TA547 use Rhadamanthys,...

#proofpoint #EN #2024 #LLM #chatgpt #analysis #TA547 #Rhadamanthys #Stealer

·proofpoint.com·Apr 17, 2024

Security Brief: TA547 Targets German Organizations with Rhadamanthys Stealer

Diving Deeper into AI Package Hallucinations

Lass Security's recent research on AI Package Hallucinations extends the attack technique to GPT-3.5-Turbo, GPT-4, Gemini Pro (Bard), and Coral (Cohere).

#lasso #EN #2024 #AI #Package #Hallucinations #GPT-4 #Bard #Cohere #analysis #LLM

·lasso.security·Mar 28, 2024

Diving Deeper into AI Package Hallucinations

Personal Information Exploit on OpenAI’s ChatGPT Raise Privacy Concerns

Last month, I received an alarming email from someone I did not know: Rui Zhu, a Ph.D. candidate at Indiana University Bloomington. Mr. Zhu had my email address, he explained, because GPT-3.5 Turbo, one of the latest and most robust large language models (L.L.M.) from OpenAI, had delivered it to him.

#nytimes #en #2023 #exploit #LLM #AI #privacy #chatgpt

·nytimes.com·Dec 24, 2023

Personal Information Exploit on OpenAI’s ChatGPT Raise Privacy Concerns

Les 10 principales vulnérabilités des modèles GPT

Les grands modèles de langage peuvent être sujets à des cyberattaques et mettre en danger la sécurité des systèmes

#ictjournal #FR #chatGPT #cyberattaques #vulnérabilités #LLM #OWASP #top10

·ictjournal.ch·Nov 17, 2023

Les 10 principales vulnérabilités des modèles GPT

Large Language Models and Elections

Earlier this week, the Republican National Committee released a video that it claims was “built entirely with AI imagery.” The content of the ad isn’t especially novel—a dystopian vision of America under a second term with President Joe Biden—but the deliberate emphasis on the technology used to create it stands out: It’s a “Daisy” moment for the 2020s.

#Schneier #EN #2023 #LLM #election #disinformation #AI

·schneier.com·May 4, 2023

Large Language Models and Elections

AI-Powered 'BlackMamba' Keylogging Attack Evades Modern EDR Security

Researchers warn that polymorphic malware created with ChatGPT and other LLMs will force a reinvention of security automation.

#darkreading #EN #2023 #ChatGPT #EDR #evasion #Polymorphic #BlackMamba #LLM

·darkreading.com·May 3, 2023

AI-Powered 'BlackMamba' Keylogging Attack Evades Modern EDR Security