Described as GenAI's greatest flaw, indirect prompt injection is a big problem. Mike Pound from the University of Nottingham explains how it is like SQL injection...
Large language model applications suffer from a few core novel issues that have been identified over the last two years. Let's see how Grok fares on those.
The Beginner's Guide to Visual Prompt Injections: Invisibility Cloaks, Cannibalistic Adverts, and Robot Women | Lakera
Learn about visual prompt injections, their appearance, and top defense strategies against these attacks.
SQL injection-like attack on LLMs with special tokens
Andrej Karpathy explains something that's been confusing me for the best part of a year: The decision by LLM tokenizers to parse special tokens in the input string (``, …
There are some fascinating new details in this lengthy report outlining the safety work carried out prior to the release of GPT-4o. A few highlights that stood out to me. …
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
By far the most detailed paper on prompt injection I've seen yet from OpenAI, published a few days ago and with six credited authors: Eric Wallace, Kai Xiao, Reimar Leike, …
OpenAI Begins Tackling ChatGPT Data Leak Vulnerability · Embrace The Red
Good news. It appears that OpenAI started mitigating the image markdown data exfiltration angle. It remains vulnerable, but it's great to see a few first actions being taken to mitigate the problem.
Who Am I? Conditional Prompt Injection Attacks with Microsoft Copilot · Embrace The Red
Conditional instructions give adversaries a powerful way to target individuals and delay detonation of malicious payloads until certain conditions are met.
Exploring Google Bard's Data Visualization Feature (Code Interpreter) · Embrace The Red
Last November Google updated Bard to include the ability to solve math equations and draw charts based on data. It can be used to run small Python programs.