Search AI/ML

Found 82 bookmarks

The Art of AI Domination: Remote Controlling ChatGPT ZombAI Instances

Hey ChatGPT! How to build a botnet with compromised ChatGPT instances! AI botnet vulnerability

#security #safety

·embracethered.com·Jan 7, 2025

APpaREnTLy THiS iS hoW yoU JaIlBreAk AI

Anthropic created an AI jailbreaking algorithm that keeps tweaking prompts until it gets a harmful response.

#security #safety

·404media.co·Dec 19, 2024

Don’t Throw the Baby Out With the Generative AI Bullshit Bathwater

If I had wanted to write a column about presidential pardons, I’d find ChatGPT’s assistance a far better starting point than I’d have gotten through any general web search. But to quote Reagan: “Trust, but verify.”

#ethics #politics #safety

·daringfireball.net·Dec 7, 2024

Revealed: bias found in AI system used to detect UK benefits fraud

Exclusive: Age, disability, marital status and nationality influence decisions to investigate claims, prompting fears of ‘hurt first, fix later’ approach

#ethics #safety

·theguardian.com·Dec 7, 2024

The phony comforts of AI skepticism

It’s fun to say that artificial intelligence is fake and sucks — but evidence is mounting that it’s real and dangerous

#safety #ethics

·platformer.news·Dec 7, 2024

AI Hallucinations: Why Large Language Models Make Things Up (And How to Fix It) - kapa.ai - Instant AI answers to technical questions

Kapa.ai turns your knowledge base into a reliable and production-ready LLM-powered AI assistant that answers technical questions instantly. Trusted by 100+ startups and enterprises incl. OpenAI, Docker, Mapbox, Mixpanel and NextJS.

#model training #safety

·kapa.ai·Dec 6, 2024

Hugging Face CEO has concerns about Chinese open source AI models | TechCrunch

HuggingFace's CEO warns that open source Chinese AI models risk spreading censorship worldwide.

#safety

·techcrunch.com·Dec 4, 2024

Study of ChatGPT citations makes dismal reading for publishers | TechCrunch

As more publishers cut content licensing deals with ChatGPT-maker OpenAI, a study put out this week by the Tow Center for Digital Journalism -- looking at

#safety

·techcrunch.com·Dec 1, 2024

Bluesky, AI, and the battle for consent on the open

#safety

·werd.io·Nov 27, 2024

Misinformation expert cites non-existent sources in Minnesota deep fake case • Minnesota Reformer

A leading misinformation expert is being accused of citing non-existent sources to defend Minnesota’s new law banning election misinformation. Professor Jeff Hancock, founding director of the Stanford Social Media Lab, is “well-known for his research on how people use deception with technology,” according to his Stanford biography. At the behest of Minnesota Attorney General Keith […]

#safety #politics

·minnesotareformer.com·Nov 23, 2024

AI-Powered Content Audits for Local News

How to responsibly use AI to help with understanding your coverage

#text sanitization #text #safety

·generative-ai-newsroom.com·Nov 19, 2024

The Beginner's Guide to Visual Prompt Injections: Invisibility Cloaks, Cannibalistic Adverts, and Robot Women | Lakera – Protecting AI teams that disrupt the world.

Learn about visual prompt injections, their appearance, and top defense strategies against these attacks.

#security #safety

·lakera.ai·Nov 15, 2024

LLMs don’t do formal reasoning - and that is a HUGE problem

Important new study from Apple

#safety #model training #apple

·garymarcus.substack.com·Oct 12, 2024

Let’s not make the same mistakes with AI that we made with social media

Social media’s unregulated evolution over the past decade holds a lot of lessons that apply directly to AI companies and technologies.

#safety

·technologyreview.com·Oct 9, 2024

Ted Benson

#safety #security #audio #voice

·edwardbenson.com·Oct 8, 2024

Hacker plants false memories in ChatGPT to steal user data in perpetuity

Emails, documents, and other untrusted content can plant malicious memories.

#safety #security

·arstechnica.com·Sep 25, 2024

No one’s ready for this

The photograph is now meaningless as evidence. We are not prepared.

#image #photo #safety

·theverge.com·Aug 23, 2024

The dangers of AI agents unfurling hyperlinks and what to do about it · Embrace The Red

Automatically unfurling hyperlinks can lead to data exfiltration. This post shows how to mitigate this threat in Slack Apps

#safety #security

·embracethered.com·Aug 21, 2024

SQL injection-like attack on LLMs with special tokens

Andrej Karpathy explains something that's been confusing me for the best part of a year: The decision by LLM tokenizers to parse special tokens in the input string (``, …

#safety #security

·simonwillison.net·Aug 21, 2024

Deception abilities emerged in large language models | Proceedings of the National Academy of Sciences

Large language models (LLMs) are currently at the forefront of intertwining AI systems with human communication and everyday life. Thus, aligning t...

#safety

·pnas.org·Aug 17, 2024

MIT releases comprehensive database of AI risks

Researchers at MIT have released the AI Risk Repository, a comprehensive database that can help organizations identify and mitigate AI risks.

#safety #security

·venturebeat.com·Aug 14, 2024

Mapping the misuse of generative AI

New research analyzes the misuse of multimodal generative AI today, in order to help build safer and more responsible technologies

#security #safety

·deepmind.google·Aug 12, 2024

GPT-4o System Card

There are some fascinating new details in this lengthy report outlining the safety work carried out prior to the release of GPT-4o. A few highlights that stood out to me. …

#safety #security

·simonwillison.net·Aug 9, 2024

Succor borne every minute

Earnest chats with objects are not so unusual. Mark “The Bird” Fidrych, the famed Detroit Tiger, used to stand on the pitching mound whispering to the baseball. Forky, the highly animate utensil from Toy Story 4, once posed deep questions about friendship to a ceramic mug. And many of us have made repeated queries of the Magic 8 Ball despite its limited set of randomly generated answers.

#politics #culture #safety

·ftc.gov·Jun 17, 2024

The Rise of Large-Language-Model Optimization - Schneier on Security

#security #safety

·schneier.com·Apr 25, 2024

GitHub - haizelabs/llama3-jailbreak: A trivial programmatic Llama 3 jailbreak. Sorry Zuck!

A trivial programmatic Llama 3 jailbreak. Sorry Zuck! - haizelabs/llama3-jailbreak

#safety

·github.com·Apr 24, 2024

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

By far the most detailed paper on prompt injection I've seen yet from OpenAI, published a few days ago and with six credited authors: Eric Wallace, Kai Xiao, Reimar Leike, …

#security #safety

·simonwillison.net·Apr 23, 2024
