Search AI/ML

Found 68 bookmarks

ASCII Smuggler — The INVISIBLE prompt injection.

Hello and welcome to my new blog post. Today I am going to discuss a future threat which is invisible. With greater power comes greater…

#security #safety #prompt

·medium.com·Mar 1, 2025

Converts ASCII Prompts to Unicode Generating “Invisible” Prompts

Converts ASCII Prompts to Unicode Generating “Invisible” Prompts - Unighost_Prompt_Injection.py

#safety #security #prompt

·gist.github.com·Mar 1, 2025
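
For readers who want to check text for this kind of hidden payload, here is a minimal defensive sketch. It assumes the invisible characters come from the Unicode Tags block (U+E0000 to U+E007F), which is the encoding commonly described for this class of attack; the bookmarked gist may work differently. The function names `decode_hidden_tags` and `strip_tags` are hypothetical and chosen only for illustration.

```python
# Assumption: the hidden prompt is carried in the Unicode Tags block,
# where each tag character mirrors an ASCII code point offset by 0xE0000.
TAG_START = 0xE0000
TAG_END = 0xE007F


def is_tag_char(ch: str) -> bool:
    """True if the character falls inside the Unicode Tags block."""
    return TAG_START <= ord(ch) <= TAG_END


def decode_hidden_tags(text: str) -> str:
    """Recover printable ASCII hidden in tag characters, if any."""
    hidden = []
    for ch in text:
        if is_tag_char(ch):
            ascii_cp = ord(ch) - TAG_START
            # Keep only printable ASCII; skip control-like tag values.
            if 0x20 <= ascii_cp <= 0x7E:
                hidden.append(chr(ascii_cp))
    return "".join(hidden)


def strip_tags(text: str) -> str:
    """Return the text with every tag-block character removed."""
    return "".join(ch for ch in text if not is_tag_char(ch))


if __name__ == "__main__":
    # Build a sample by hand: visible text followed by an invisible payload.
    payload = "".join(chr(TAG_START + ord(c)) for c in "hi")
    sample = "Hello" + payload
    print(repr(decode_hidden_tags(sample)))
    print(repr(strip_tags(sample)))
```

Stripping or flagging these characters before text reaches a model is one possible mitigation; it does not cover other invisible-text tricks such as zero-width characters.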

Invisible Prompt Injection: A Threat to AI Security

Learn about invisible prompt injection, which is a silent threat to secure AI.

#security #prompt

·trendmicro.com·Mar 1, 2025

Prompt Injection - Payloads All The Things

Payloads All The Things, a list of useful payloads and bypasses for Web Application Security

#security #prompt

·swisskyrepo.github.io·Feb 28, 2025

Generative AI's Greatest Flaw - Computerphile

Described as GenAI's greatest flaw, indirect prompt injection is a big problem. Mike Pound from the University of Nottingham explains how it is like SQL injection...

#safety #security #prompt

·youtube.com·Feb 28, 2025

Security ProbLLMs in xAI's Grok: A Deep Dive

Large language model applications suffer from a few core novel issues that have been identified over the last two years. Let's see how Grok fares on those.

#security #safety

·embracethered.com·Feb 23, 2025

AI Markets Were Deceived To Believe In DeepSeek's Low Training Costs; They Are Actually 400 Times Higher Than The Reported Figure

The controversy around DeepSeek's costs for training their R1 model shook up the markets, but it seems like there was a lot of deception.

#security #model training #politics

·wccftech.com·Feb 3, 2025

The Art of AI Domination: Remote Controlling ChatGPT ZombAI Instances

Hey ChatGPT! How to build a botnet with compromised ChatGPT instances! AI botnet vulnerability

#security #safety

·embracethered.com·Jan 7, 2025

APpaREnTLy THiS iS hoW yoU JaIlBreAk AI

Anthropic created an AI jailbreaking algorithm that keeps tweaking prompts until it gets a harmful response.

#security #safety

·404media.co·Dec 19, 2024

The Beginner's Guide to Visual Prompt Injections: Invisibility Cloaks, Cannibalistic Adverts, and Robot Women | Lakera – Protecting AI teams that disrupt the world.

Learn about visual prompt injections, their appearance, and top defense strategies against these attacks.

#security #safety

·lakera.ai·Nov 15, 2024

Ted Benson

#safety #security #audio #voice

·edwardbenson.com·Oct 8, 2024

Hacker plants false memories in ChatGPT to steal user data in perpetuity

Emails, documents, and other untrusted content can plant malicious memories.

#safety #security

·arstechnica.com·Sep 25, 2024

The dangers of AI agents unfurling hyperlinks and what to do about it · Embrace The Red

Automatically unfurling hyperlinks can lead to data exfiltration. This post shows how to mitigate this threat in Slack Apps.

#safety #security

·embracethered.com·Aug 21, 2024

SQL injection-like attack on LLMs with special tokens

Andrej Karpathy explains something that's been confusing me for the best part of a year: The decision by LLM tokenizers to parse special tokens in the input string (``, …

#safety #security

·simonwillison.net·Aug 21, 2024

MIT releases comprehensive database of AI risks

Researchers at MIT have released the AI Risk Repository, a comprehensive database that can help organizations identify and mitigate AI risks.

#safety #security

·venturebeat.com·Aug 14, 2024

Mapping the misuse of generative AI

New research analyzes the misuse of multimodal generative AI today, in order to help build safer and more responsible technologies

#security #safety

·deepmind.google·Aug 12, 2024

GPT-4o System Card

There are some fascinating new details in this lengthy report outlining the safety work carried out prior to the release of GPT-4o. A few highlights that stood out to me. …

#safety #security

·simonwillison.net·Aug 9, 2024

GitHub Copilot Chat: From Prompt Injection to Data Exfiltration · Embrace The Red

Analyzing untrusted code with GitHub Copilot Chat can have malicious side effects and turn Copilot into Copirate!

#security

·embracethered.com·Jun 16, 2024

The Rise of Large-Language-Model Optimization - Schneier on Security

#security #safety

·schneier.com·Apr 25, 2024

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

By far the most detailed paper on prompt injection I've seen yet from OpenAI, published a few days ago and with six credited authors: Eric Wallace, Kai Xiao, Reimar Leike, …

#security #safety

·simonwillison.net·Apr 23, 2024

GPT-4 can exploit real vulnerabilities by reading advisories

While some other LLMs appear to flat-out suck

#security

·theregister.com·Apr 21, 2024

OpenAI Begins Tackling ChatGPT Data Leak Vulnerability · Embrace The Red

Good news. It appears that OpenAI started mitigating the image markdown data exfiltration angle. It remains vulnerable, but it's great to see a few first actions being taken to mitigate the problem.

#security

·embracethered.com·Apr 18, 2024

AI bots hallucinate software packages and devs download them

Simply look out for libraries imagined by ML and make them real, with actual malicious code. No wait, don't do that

#safety #security

·theregister.com·Mar 30, 2024

TracecatHQ/tracecat: 😼 The AI-native, open source alternative to Tines / Splunk SOAR.

😼 The AI-native, open source alternative to Tines / Splunk SOAR. - TracecatHQ/tracecat

#security #agent

·github.com·Mar 27, 2024

Researchers use ASCII art to elicit harmful responses from 5 major AI chatbots

LLMs are trained to block harmful responses. Old-school images can override those rules.

#safety #security

·arstechnica.com·Mar 18, 2024

Who Am I? Conditional Prompt Injection Attacks with Microsoft Copilot · Embrace The Red

Conditional instructions give adversaries a powerful way to target individuals and delay detonation of malicious payloads until certain conditions are met.

#safety #security #prompt

·embracethered.com·Mar 4, 2024

Exploring Google Bard's Data Visualization Feature (Code Interpreter) · Embrace The Red

Last November Google updated Bard to include the ability to solve math equations and draw charts based on data. It can be used to run small Python programs.

#security

·embracethered.com·Feb 18, 2024
