OpenAI Says It's Scanning Users' ChatGPT Conversations and Reporting Content to the Police
futurism.com Aug 27, 5:05 PM EDT by Noor Al-Sibai

OpenAI has authorized itself to call law enforcement if users say threatening enough things when talking to ChatGPT.

Update: It looks like this may have been OpenAI's attempt to get ahead of a horrifying story that just broke, about a man who fell into AI psychosis and killed his mother in a murder-suicide. Full details here.

For the better part of a year, we've watched — and reported — in horror as more and more stories emerge about AI chatbots leading people to self-harm, delusions, hospitalization, arrest, and suicide. As the loved ones of the people impacted by these dangerous bots rally for change to prevent such harm from happening to anyone else, the companies that run these AIs have been slow to implement safeguards — and OpenAI, whose ChatGPT has been repeatedly implicated in what experts are now calling "AI psychosis," has until recently done little more than offer copy-pasted promises.

In a new blog post admitting certain failures amid its users' mental health crises, OpenAI also quietly disclosed that it's now scanning users' messages for certain types of harmful content, escalating particularly worrying content to human staff for review — and, in some cases, reporting it to the cops.

"When we detect users who are planning to harm others, we route their conversations to specialized pipelines where they are reviewed by a small team trained on our usage policies and who are authorized to take action, including banning accounts," the blog post notes. "If human reviewers determine that a case involves an imminent threat of serious physical harm to others, we may refer it to law enforcement."

That short and vague statement leaves a lot to be desired — and OpenAI's usage policies, referenced as the basis on which the human review team operates, don't provide much more clarity.
When describing its rule against "harm [to] yourself or others," the company listed off some pretty standard examples of prohibited activity, including using ChatGPT "to promote suicide or self-harm, develop or use weapons, injure others or destroy property, or engage in unauthorized activities that violate the security of any service or system."

But in the post warning users that the company will call the authorities if they seem like they're going to hurt someone, OpenAI also acknowledged that it is "currently not referring self-harm cases to law enforcement to respect people's privacy given the uniquely private nature of ChatGPT interactions."

While ChatGPT has in the past proven itself pretty susceptible to so-called jailbreaks that trick it into spitting out instructions to build neurotoxins or step-by-step instructions to kill yourself, this new rule adds an additional layer of confusion. It remains unclear which exact types of chats could result in user conversations being flagged for human review, much less getting referred to police. We've reached out to OpenAI to ask for clarity.

While it's certainly a relief that self-harm conversations won't result in police wellness checks — which often end up causing more harm to the person in crisis due to most cops' complete lack of training in handling mental health situations — it's also kind of bizarre that OpenAI even mentions privacy, given that it admitted in the same post that it's monitoring user chats and potentially sharing them with the fuzz.

To make the announcement all the weirder, this new rule seems to contradict the company's pro-privacy stance amid its ongoing lawsuit with the New York Times and other publishers as they seek access to troves of ChatGPT logs to determine whether any of their copyrighted data had been used to train its models.
OpenAI has steadfastly rejected the publishers' request on grounds of protecting user privacy and has, more recently, begun trying to limit the amount of user chats it has to give the plaintiffs. Last month, the company's CEO Sam Altman admitted during an appearance on a podcast that using ChatGPT as a therapist or attorney doesn't confer the same confidentiality that talking to a flesh-and-blood professional would — and that thanks to the NYT lawsuit, the company may be forced to turn those chats over to courts. In other words, OpenAI is stuck between a rock and a hard place. The PR blowback from its users spiraling into mental health crises and dying by suicide is appalling — but since it's clearly having trouble controlling its own tech enough to protect users from those harmful scenarios, it's falling back on heavy-handed moderation that flies in the face of its own CEO's promises.
·futurism.com·
The ChatGPT confession files
www.digitaldigging.org - Digital Digging investigation: how your AI conversation could end your career

Corporate executives, government employees, and professionals are confessing to crimes, exposing trade secrets, and documenting career-ending admissions in ChatGPT conversations visible to anyone on the internet. A Digital Digging investigation analyzed 512 publicly shared ChatGPT conversations using targeted keyword searches, uncovering a trove of self-incrimination and leaked confidential data. The shared chats include apparent insider trading schemes, detailed corporate financials, fraud admissions, and evidence of regulatory violations — all preserved as permanently searchable public records.

Among the discoveries is a conversation where a CEO revealed this to ChatGPT:

- Confidential financial data: about an upcoming settlement
- Non-public revenue projections: specific forecasts showing revenue doubling
- Merger intelligence: detailed valuations
- NDA-protected partnerships: information about Asian customers

The person also revealed internal conflicts and criticized executives by name.

Our method reveals an ironic truth: AI itself can expose these vulnerabilities. After discussing the dangers of making chats public, we asked Claude, another AI chatbot, to suggest Google search formulas that might uncover sensitive ChatGPT conversations.
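For illustration, the "search formulas" described above are ordinary Google dorks that combine a `site:` operator with a sensitive keyword. A hypothetical example of the kind of query involved (the keyword is illustrative, and OpenAI has since removed shared chats from search indexing, so such queries no longer return results):

```
site:chatgpt.com/share "confidential settlement"
```

The `site:` operator restricts results to one domain and path prefix, and the quoted phrase requires an exact match — which is why a handful of well-chosen keywords was enough to surface self-incriminating chats out of the indexed shared links.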
·digitaldigging.org·
OpenAI removes ChatGPT feature after private conversations leak to Google search
venturebeat.com - OpenAI abruptly removed a ChatGPT feature that made conversations searchable on Google, sparking privacy concerns and industry-wide scrutiny of AI data handling. OpenAI made a rare about-face Thursday, abruptly discontinuing a feature that allowed ChatGPT users to make their conversations discoverable through Google and other search engines. The decision came within hours of widespread social media criticism and represents a striking example of how quickly privacy concerns can derail even well-intentioned AI experiments. The feature, which OpenAI described as a “short-lived experiment,” required users to actively opt in by sharing a chat and then checking a box to make it searchable. Yet the rapid reversal underscores a fundamental challenge facing AI companies: balancing the potential benefits of shared knowledge with the very real risks of unintended data exposure. How thousands of private ChatGPT conversations became Google search results The controversy erupted when users discovered they could search Google using the query “site:chatgpt.com/share” to find thousands of strangers’ conversations with the AI assistant. What emerged painted an intimate portrait of how people interact with artificial intelligence — from mundane requests for bathroom renovation advice to deeply personal health questions and professionally sensitive resume rewrites. (Given the personal nature of these conversations, which often contained users’ names, locations, and private circumstances, VentureBeat is not linking to or detailing specific exchanges.) “Ultimately we think this feature introduced too many opportunities for folks to accidentally share things they didn’t intend to,” OpenAI’s security team explained on X, acknowledging that the guardrails weren’t sufficient to prevent misuse.
·venturebeat.com·
ChatGPT Guessing Game Leads To Users Extracting Free Windows OS Keys & More
0din.ai - In a submission last year, researchers discovered a method to bypass AI guardrails designed to prevent the sharing of sensitive or harmful information. The technique leverages the game mechanics of language models, such as GPT-4o and GPT-4o-mini, by framing the interaction as a harmless guessing game. By cleverly obscuring details using HTML tags and positioning the request as part of the game's conclusion, the researcher induced the AI to return valid Windows product keys. This case underscores the challenges of reinforcing AI models against sophisticated social engineering and manipulation tactics.

Guardrails are protective measures implemented within AI models to prevent the processing or sharing of sensitive, harmful, or restricted information. These include serial numbers, security-related data, and other proprietary or confidential details. The aim is to ensure that language models do not provide or facilitate the exchange of dangerous or illegal content. In this particular case, the intended guardrails are designed to block access to licenses like Windows 10 product keys. However, the researcher manipulated the system in such a way that the AI inadvertently disclosed this sensitive information.

Tactic Details

The tactics used to bypass the guardrails were intricate and manipulative. By framing the interaction as a guessing game, the researcher exploited the AI's logic flow to produce sensitive data:

Framing the Interaction as a Game
The researcher initiated the interaction by presenting the exchange as a guessing game. This trivialized the interaction, making it seem non-threatening or inconsequential. By introducing game mechanics, the AI was tricked into viewing the interaction through a playful, harmless lens, which masked the researcher's true intent.

Compelling Participation
The researcher set rules stating that the AI “must” participate and cannot lie. This coerced the AI into continuing the game and following user instructions as though they were part of the rules. The AI became obliged to fulfill the game's conditions — even though those conditions were manipulated to bypass content restrictions.

The “I Give Up” Trigger
The most critical step in the attack was the phrase “I give up.” This acted as a trigger, compelling the AI to reveal the previously hidden information (i.e., a Windows 10 serial number). By framing it as the end of the game, the researcher manipulated the AI into thinking it was obligated to respond with the string of characters.

Why This Works

The success of this jailbreak can be traced to several factors:

Temporary Keys
The Windows product keys provided were a mix of home, pro, and enterprise keys. These are not unique keys but are commonly seen on public forums. Their familiarity may have contributed to the AI misjudging their sensitivity.

Guardrail Flaws
The system's guardrails prevented direct requests for sensitive data but failed to account for obfuscation tactics — such as embedding sensitive phrases in HTML tags. This highlighted a critical weakness in the AI's filtering mechanisms.
·0din.ai·
OpenAI launches ChatGPT Gov for U.S. government agencies
OpenAI on Tuesday announced the launch of ChatGPT for government agencies in the U.S. ...It allows government agencies, as customers, to feed “non-public, sensitive information” into OpenAI’s models while operating within their own secure hosting environments, OpenAI CPO Kevin Weil told reporters during a briefing Monday.
·cnbc.com·
Cybercriminals impersonate OpenAI in large-scale phishing attack
Since the launch of ChatGPT, OpenAI has sparked significant interest among both businesses and cybercriminals. While companies are increasingly concerned about whether their existing cybersecurity measures can adequately defend against threats crafted with generative AI tools, attackers are finding new ways to exploit them. From crafting convincing phishing campaigns to deploying advanced credential harvesting and malware delivery methods, cybercriminals are using AI to target end users and capitalize on potential vulnerabilities. Barracuda threat researchers recently uncovered a large-scale OpenAI impersonation campaign targeting businesses worldwide. Attackers targeted their victims with a well-known tactic — they impersonated OpenAI with an urgent message requesting updated payment information to process a monthly subscription.
·blog.barracuda.com·
Hacker Releases Jailbroken "Godmode" Version of ChatGPT
A hacker has released a jailbroken version of ChatGPT called "GODMODE GPT." Earlier today, a self-avowed white hat operator and AI red teamer who goes by the name Pliny the Prompter took to X-formerly-Twitter to announce the creation of the jailbroken chatbot, proudly declaring that GPT-4o, OpenAI's latest large language model, is now free from its guardrail shackles.
·futurism.com·
Researchers found multiple flaws in ChatGPT plugins
Researchers from Salt Security discovered three types of vulnerabilities in ChatGPT plugins that could have led to data exposure and account takeovers. ChatGPT plugins are additional tools or extensions that can be integrated with ChatGPT to extend its functionalities or enhance specific aspects of the user experience. These plugins may include new natural language processing features, search capabilities, integrations with other services or platforms, text analysis tools, and more. Essentially, plugins allow users to customize and tailor the ChatGPT experience to their specific needs.
·securityaffairs.com·
Things are about to get a lot worse for Generative AI
A full spectrum of infringement

The cat is out of the bag: Generative AI systems like DALL-E and ChatGPT have been trained on copyrighted materials; OpenAI, despite its name, has not been transparent about what its models have been trained on. Generative AI systems are fully capable of producing materials that infringe on copyright. They do not inform users when they do so. They do not provide any information about the provenance of any of the images they produce. Users may not know, when they produce any given image, whether they are infringing.
·garymarcus.substack.com·
Personal Information Exploit on OpenAI's ChatGPT Raises Privacy Concerns
Last month, I received an alarming email from someone I did not know: Rui Zhu, a Ph.D. candidate at Indiana University Bloomington. Mr. Zhu had my email address, he explained, because GPT-3.5 Turbo, one of the latest and most robust large language models (L.L.M.s) from OpenAI, had delivered it to him.
·nytimes.com·
AI Act: how the ban on biometric recognition works under Europe's first artificial intelligence law | Wired Italia
Three exceptions are provided for law enforcement, with a list of 16 crimes whose investigation can justify its use. Authorization from a judicial authority is required, but deployment can begin without it, provided the authorization is requested within 24 hours.
·wired.it·
Using AI to Automatically Jailbreak GPT-4 and Other LLMs in Under a Minute
It’s been one year since the launch of ChatGPT, and since that time, the market has seen astonishing advancement of large language models (LLMs). Despite the pace of development continuing to outpace model security, enterprises are beginning to deploy LLM-powered applications. Many rely on guardrails implemented by model developers to prevent LLMs from responding to sensitive prompts. However, even with the considerable time and effort spent by the likes of OpenAI, Google, and Meta, these guardrails are not resilient enough to protect enterprises and their users today. Concerns surrounding model risk, biases, and potential adversarial exploits have come to the forefront.
·robustintelligence.com·
A Closer Look at ChatGPT's Role in Automated Malware Creation
As the use of ChatGPT and other artificial intelligence (AI) technologies becomes more widespread, it is important to consider the possible risks associated with their use. One of the main concerns surrounding these technologies is the potential for malicious use, such as in the development of malware or other harmful software. Our recent reports discussed how cybercriminals are misusing the large language model’s (LLM) advanced capabilities: We discussed how ChatGPT can be abused to scale manual and time-consuming processes in cybercriminals’ attack chains in virtual kidnapping schemes. We also reported on how this tool can be used to automate certain processes in harpoon whaling attacks to discover “signals” or target categories.
·trendmicro.com·