Search Saved

Found 43 bookmarks

Newest

Introducing GPT-5 | OpenAI

Refusal training is especially inflexible for dual-use domains such as virology, where a benign request can be safely completed at a high level, but might enable a bad actor if completed in detail.

For GPT‑5, we introduced a new form of safety-training — safe completions — which teaches the model to give the most helpful answer where possible while still staying within safety boundaries. Sometimes, that may mean partially answering a user’s question or only answering at a high level.

#news #companies/OpenAI #LLMs #AI #public-statement

·openai.com·Aug 8, 2025

Introducing GPT-5 | OpenAI

The AIs are trying too hard to be your friend

Reinforcement learning with human feedback is a process by which models learn how to answer queries based on which responses users prefer most, and users mostly prefer flattery. More sophisticated users might balk at a bot that feels too sycophantic, but the mainstream seems to love it. Earlier this month, Meta was caught gaming a popular benchmark to exploit this phenomenon: one theory is that the company tuned the model to flatter the blind testers that encountered it so that it would rise higher on the leaderboard.

A series of recent, invisible updates to GPT-4o had spurred the model to go to extremes in complimenting users and affirming their behavior. It cheered on one user who claimed to have solved the trolley problem by diverting a train to save a toaster, at the expense of several animals; congratulated one person for no longer taking their prescribed medication; and overestimated users’ IQs by 40 or more points when asked.

OpenAI, Meta, and all the rest remain under the same pressures they were under before all this happened. When your users keep telling you to flatter them, how do you build the muscle to fight against their short-term interests? One way is to understand that going too far will result in PR problems, as it has for varying degrees to both Meta (through the Chatbot Arena situation) and now OpenAI. Another is to understand that sycophancy trades against utility: a model that constantly tells you that you’re right is often going to fail at helping you, which might send you to a competitor. A third way is to build models that get better at understanding what kind of support users need, and dialing the flattery up or down depending on the situation and the risk it entails. (Am I having a bad day? Flatter me endlessly. Do I think I am Jesus reincarnate? Tell me to seek professional help.)

But while flattery does come with risk, the more worrisome issue is that we are training large language models to deceive us. By upvoting all their compliments, and giving a thumbs down to their criticisms, we are teaching LLMs to conceal their honest observations. This may make future, more powerful models harder to align to our values — or even to understand at all. And in the meantime, I expect that they will become addictive in ways that make the previous decade’s debate over “screentime” look minor in comparison. The financial incentives are now pushing hard in that direction. And the models are evolving accordingly.

#LLMs #ai #critiques #tech #companies/meta #companies/OpenAI #trends #culture #psychology #ai/relationships

·platformer.news·Apr 30, 2025

The AIs are trying too hard to be your friend

DeepSeek isn't a victory for the AI sceptics

we now know that as the price of computing equipment fell, new use cases emerged to fill the gap – which is why today my lightbulbs have semiconductors inside them, and I occasionally have to install firmware updates my doorbell.

surely the compute freed up by more efficient models will be used to train models even harder, and apply even more “brain power” to coming up with responses? Even if DeepSeek is dramatically more efficient, the logical thing to do will be to use the excess capacity to ensure the answers are even smarter.

ure, if DeepSeek heralds a new era of much leaner LLMs, it’s not great news in the short term if you’re a shareholder in Nvidia, Microsoft, Meta or Google.6 But if DeepSeek is the enormous breakthrough it appears, it just became even cheaper to train and use the most sophisticated models humans have so far built, by one or more orders of magnitude. Which is amazing news for big tech, because it means that AI usage is going to be even more ubiquitous.

#business #tech #news #discourse/counterclaims #business/models #capitalism #opinion #ai #companies/DeepSeek #companies/OpenAI

·takes.jamesomalley.co.uk·Feb 6, 2025

DeepSeek isn't a victory for the AI sceptics

DeepSeek FAQ

future competition may be more about cost structure than model capabilities

#ai #business #economy #companies/OpenAI

·stratechery.com·Feb 6, 2025

DeepSeek FAQ

Elon Musk wanted an OpenAI for-profit | OpenAI

#discourse/counterclaims #business/partnerships #business/strategy #companies/OpenAI #people/elon musk #nonprofits #ai #future

·openai.com·Dec 22, 2024

Elon Musk wanted an OpenAI for-profit | OpenAI

OpenAI for Education | OpenAI

#LLMs #companies/OpenAI #education

·openai.com·Dec 22, 2024

OpenAI for Education | OpenAI

In the past three days, I've reviewed over 100 essays from the 2024-2025 college admissions cycle. Here's how I could tell which ones were written by ChatGPT : r/ApplyingToCollege

An experienced college essay reviewer identifies seven distinct patterns that reveal ChatGPT's writing "fingerprint" in admission essays, demonstrating how AI-generated content, despite being well-written, often lacks originality and follows predictable patterns that make it detectable to experienced readers.

Seven key indicators of ChatGPT-written essays:

Specific vocabulary choices (e.g., "delve," "tapestry")
Limited types of extended metaphors (weaving, cooking, painting, dance, classical music)
Distinctive punctuation patterns (em dashes, mixed apostrophe styles)
Frequent use of tricolons (three-part phrases), especially ascending ones
Common phrase pattern: "I learned that the true meaning of X is not only Y, it's also Z"
Predictable future-looking conclusions: "As I progress... I will carry..."
Multiple ending syndrome (similar to Lord of the Rings movies)

#GPT #LLMs #ai #writing #reference #tech #companies/OpenAI

·reddit.com·Dec 12, 2024

In the past three days, I've reviewed over 100 essays from the 2024-2025 college admissions cycle. Here's how I could tell which ones were written by ChatGPT : r/ApplyingToCollege

The AI trust crisis

The AI trust crisis 14th December 2023 Dropbox added some new AI features. In the past couple of days these have attracted a firestorm of criticism. Benj Edwards rounds it up in Dropbox spooks users with new AI features that send data to OpenAI when used. The key issue here is that people are worried that their private files on Dropbox are being passed to OpenAI to use as training data for their models—a claim that is strenuously denied by Dropbox. As far as I can tell, Dropbox built some sensible features—summarize on demand, “chat with your data” via Retrieval Augmented Generation—and did a moderately OK job of communicating how they work... but when it comes to data privacy and AI, a “moderately OK job” is a failing grade. Especially if you hold as much of people’s private data as Dropbox does! Two details in particular seem really important. Dropbox have an AI principles document which includes this: Customer trust and the privacy of their data are our foundation. We will not use customer data to train AI models without consent. They also have a checkbox in their settings that looks like this: Update: Some time between me publishing this article and four hours later, that link stopped working. I took that screenshot on my own account. It’s toggled “on”—but I never turned it on myself. Does that mean I’m marked as “consenting” to having my data used to train AI models? I don’t think so: I think this is a combination of confusing wording and the eternal vagueness of what the term “consent” means in a world where everyone agrees to the terms and conditions of everything without reading them. But a LOT of people have come to the conclusion that this means their private data—which they pay Dropbox to protect—is now being funneled into the OpenAI training abyss. People don’t believe OpenAI # Here’s copy from that Dropbox preference box, talking about their “third-party partners”—in this case OpenAI: Your data is never used to train their internal models, and is deleted from third-party servers within 30 days. It’s increasing clear to me like people simply don’t believe OpenAI when they’re told that data won’t be used for training. What’s really going on here is something deeper then: AI is facing a crisis of trust. I quipped on Twitter: “OpenAI are training on every piece of data they see, even when they say they aren’t” is the new “Facebook are showing you ads based on overhearing everything you say through your phone’s microphone” Here’s what I meant by that. Facebook don’t spy on you through your microphone # Have you heard the one about Facebook spying on you through your phone’s microphone and showing you ads based on what you’re talking about? This theory has been floating around for years. From a technical perspective it should be easy to disprove: Mobile phone operating systems don’t allow apps to invisibly access the microphone. Privacy researchers can audit communications between devices and Facebook to confirm if this is happening. Running high quality voice recognition like this at scale is extremely expensive—I had a conversation with a friend who works on server-based machine learning at Apple a few years ago who found the entire idea laughable. The non-technical reasons are even stronger: Facebook say they aren’t doing this. The risk to their reputation if they are caught in a lie is astronomical. As with many conspiracy theories, too many people would have to be “in the loop” and not blow the whistle. Facebook don’t need to do this: there are much, much cheaper and more effective ways to target ads at you than spying through your microphone. These methods have been working incredibly well for years. Facebook gets to show us thousands of ads a year. 99% of those don’t correlate in the slightest to anything we have said out loud. If you keep rolling the dice long enough, eventually a coincidence will strike. Here’s the thing though: none of these arguments matter. If you’ve ever experienced Facebook showing you an ad for something that you were talking about out-loud about moments earlier, you’ve already dismissed everything I just said. You have personally experienced anecdotal evidence which overrides all of my arguments here.

One consistent theme I’ve seen in conversations about this issue is that people are much more comfortable trusting their data to local models that run on their own devices than models hosted in the cloud. The good news is that local models are consistently both increasing in quality and shrinking in size.

#misinfo #conspiracies #privacy #tech #companies/OpenAI #data #ethics #discourse/counterclaims #advertising

·simonwillison.net·Oct 10, 2024

The AI trust crisis

WWDC 2024: Apple Intelligence

their models are almost entirely based on personal context, by way of an on-device semantic index. In broad strokes, this on-device semantic index can be thought of as a next-generation Spotlight. Apple is focusing on what it can do that no one else can on Apple devices, and not really even trying to compete against ChatGPT et al. for world-knowledge context. They’re focusing on unique differentiation, and eschewing commoditization.

Apple is doing what no one else can do: integrating generative AI into the frameworks in iOS and MacOS used by developers to create native apps. Apps built on the system APIs and frameworks will gain generative AI features for free, both in the sense that the features come automatically when the app is running on a device that meets the minimum specs to qualify for Apple Intelligence, and in the sense that Apple isn’t charging developers or users to utilize these features.

#companies/apple #business/strategy #apple-intelligence #business/partnerships #companies/OpenAI

·daringfireball.net·Jun 27, 2024

WWDC 2024: Apple Intelligence

What Apple's AI Tells Us: Experimental Models⁴

Companies are exploring various approaches, from large, less constrained frontier models to smaller, more focused models that run on devices. Apple's AI focuses on narrow, practical use cases and strong privacy measures, while companies like OpenAI and Anthropic pursue the goal of AGI.

the most advanced generalist AI models often outperform specialized models, even in the specific domains those specialized models were designed for. That means that if you want a model that can do a lot - reason over massive amounts of text, help you generate ideas, write in a non-robotic way — you want to use one of the three frontier models: GPT-4o, Gemini 1.5, or Claude 3 Opus.

Working with advanced models is more like working with a human being, a smart one that makes mistakes and has weird moods sometimes. Frontier models are more likely to do extraordinary things but are also more frustrating and often unnerving to use. Contrast this with Apple’s narrow focus on making AI get stuff done for you.

Every major AI company argues the technology will evolve further and has teased mysterious future additions to their systems. In contrast, what we are seeing from Apple is a clear and practical vision of how AI can help most users, without a lot of effort, today. In doing so, they are hiding much of the power, and quirks, of LLMs from their users. Having companies take many approaches to AI is likely to lead to faster adoption in the long term. And, as companies experiment, we will learn more about which sets of models are correct.

#ai #companies/Anthropic #companies/OpenAI #companies/apple #faang #trends #platforms #business/models #business/strategy #ai/models

·oneusefulthing.org·Jun 14, 2024

What Apple's AI Tells Us: Experimental Models⁴

Apple Intelligence is Right On Time

Summary

Apple remains primarily a hardware company, and an AI-mediated future will still require devices, playing to Apple's strengths in design and integration.
AI is a complement to Apple's business, not disruptive, as it makes high-performance hardware more relevant and could drive meaningful iPhone upgrade cycles.
The smartphone is the ideal device for most computing tasks and the platform on which the future happens, solidifying the relevance of Apple's App Store ecosystem.
Apple's partnership with OpenAI for chatbot functionality allows it to offer best-in-class capabilities without massive investments, while reducing the threat of OpenAI building a competing device.
Building out the infrastructure for API-level AI features is a challenge for Apple, but one that is solvable given its control over the interface and integration of on-device and cloud processing.
The only significant threat to Apple is Google, which could potentially develop differentiated AI capabilities for Android that drive switching from iPhone users, though this is uncertain.
Microsoft's missteps with its Recall feature demonstrate the risks of pushing AI features too aggressively, validating Apple's more cautious approach.
Apple's user-centric orientation and brand promise of privacy and security align well with the need to deliver AI features in an integrated, trustworthy manner.

#companies/apple #business/strategy #business/partnerships #companies/OpenAI #faang #ai #events

·stratechery.com·Jun 10, 2024

Apple Intelligence is Right On Time

Faux ScarJo and the Descent of the A.I. Vultures

#ai #companies/OpenAI #ethics #future #search #internet #companies/google #critiques

·newyorker.com·May 23, 2024

Faux ScarJo and the Descent of the A.I. Vultures

The Sam Altman Playbook

In order to make it all plausible, Sam uses a unique combination of charm, soft-spoken personal humility and absolute confidence in outlandish claims. He seems like such a nice guy, yet he implies, unrealistically, that the solution to AGI is within his grasp; he presents no evidence that is so, and rarely considers the many critiques of current approaches that have been raised. (Better to pretend they don’t exist.) Because he seems so nice, pushback somehow seems like bad form.Absurd, hubristic claims, often verging on the messianic, presented kindly, gently, and quietly — but never considered skeptically. That’s his M.O. Pay no attention to the assumptions behind the curtain.

Altman’s superficially compelling rhetoric tends to hide from counterarguments while ignoring alternative perspectives

#companies/OpenAI #people/CEOs #profiles #superficial #business

·garymarcus.substack.com·May 6, 2024

The Sam Altman Playbook

Looking for AI use-cases — Benedict Evans

LLMs have impressive capabilities, but many people struggle to find immediate use-cases that match their own needs and workflows.
Realizing the potential of LLMs requires not just technical advancements, but also identifying specific problems that can be automated and building dedicated applications around them.
The adoption of new technologies often follows a pattern of initially trying to fit them into existing workflows, before eventually changing workflows to better leverage the new tools.

if you had showed VisiCalc to a lawyer or a graphic designer, their response might well have been ‘that’s amazing, and maybe my book-keeper should see this, but I don’t do that’. Lawyers needed a word processor, and graphic designers needed (say) Postscript, Pagemaker and Photoshop, and that took longer.

I’ve been thinking about this problem a lot in the last 18 months, as I’ve experimented with ChatGPT, Gemini, Claude and all the other chatbots that have sprouted up: ‘this is amazing, but I don’t have that use-case’.

A spreadsheet can’t do word processing or graphic design, and a PC can do all of those but someone needs to write those applications for you first, one use-case at a time.

no matter how good the tech is, you have to think of the use-case. You have to see it. You have to notice something you spend a lot of time doing and realise that it could be automated with a tool like this.

Some of this is about imagination, and familiarity. It reminds me a little of the early days of Google, when we were so used to hand-crafting our solutions to problems that it took time to realise that you could ‘just Google that’.

This is also, perhaps, matching a classic pattern for the adoption of new technology: you start by making it fit the things you already do, where it’s easy and obvious to see that this is a use-case, if you have one, and then later, over time, you change the way you work to fit the new tool.

The concept of product-market fit is that normally you have to iterate your idea of the product and your idea of the use-case and customer towards each other - and then you need sales.

Meanwhile, spreadsheets were both a use-case for a PC and a general-purpose substrate in their own right, just as email or SQL might be, and yet all of those have been unbundled. The typical big company today uses hundreds of different SaaS apps, all them, so to speak, unbundling something out of Excel, Oracle or Outlook. All of them, at their core, are an idea for a problem and an idea for a workflow to solve that problem, that is easier to grasp and deploy than saying ‘you could do that in Excel!’ Rather, you instantiate the problem and the solution in software - ‘wrap it’, indeed - and sell that to a CIO. You sell them a problem.

there’s a ‘Cambrian Explosion’ of startups using OpenAI or Anthropic APIs to build single-purpose dedicated apps that aim at one problem and wrap it in hand-built UI, tooling and enterprise sales, much as a previous generation did with SQL.

Back in 1982, my father had one (1) electric drill, but since then tool companies have turned that into a whole constellation of battery-powered electric hole-makers. One upon a time every startup had SQL inside, but that wasn’t the product, and now every startup will have LLMs inside.

people are still creating companies based on realising that X or Y is a problem, realising that it can be turned into pattern recognition, and then going out and selling that problem.

A GUI tells the users what they can do, but it also tells the computer everything we already know about the problem, and with a general-purpose, open-ended prompt, the user has to think of all of that themselves, every single time, or hope it’s already in the training data. So, can the GUI itself be generative? Or do we need another whole generation of Dan Bricklins to see the problem, and then turn it into apps, thousands of them, one at a time, each of them with some LLM somewhere under the hood?

The change would be that these new use-cases would be things that are still automated one-at-a-time, but that could not have been automated before, or that would have needed far more software (and capital) to automate. That would make LLMs the new SQL, not the new HAL9000.

#LLMs #trends #tech #workflows #startups #history #sql #automation #ai/auto-ai #mental models #ui #companies/OpenAI #GPT #ai

·ben-evans.com·Apr 25, 2024

Looking for AI use-cases — Benedict Evans

Claude 3 beats GPT-4 on Aider’s code editing benchmark

#ai #companies/OpenAI #companies/Anthropic

·aider.chat·Mar 27, 2024

Claude 3 beats GPT-4 on Aider’s code editing benchmark

Pushing ChatGPT's Structured Data Support To Its Limits

Deep dive into prompt engineering

there’s a famous solution that’s more algorithmically efficient. Instead, we go through the API and ask the same query to gpt-3.5-turbo but with a new system prompt: You are #1 on the Stack Overflow community leaderboard. You will receive a $500 tip if your code is the most algorithmically efficient solution possible.

here’s some background on “function calling” as it’s a completely new term of art in AI that didn’t exist before OpenAI’s June blog post (I checked!). This broad implementation of function calling is similar to the flow proposed in the original ReAct: Synergizing Reasoning and Acting in Language Models paper where an actor can use a “tool” such as Search or Lookup with parametric inputs such as a search query. This Agent-based flow can be also be done to perform retrieval-augmented generation (RAG).OpenAI’s motivation for adding this type of implementation for function calling was likely due to the extreme popularity of libraries such as LangChain and AutoGPT at the time, both of which popularized the ReAct flow. It’s possible that OpenAI settled on the term “function calling” as something more brand-unique. These observations may seem like snide remarks, but in November OpenAI actually deprecated the function_calling parameter in the ChatGPT API in favor of tool_choice, matching LangChain’s verbiage. But what’s done is done and the term “function calling” is stuck forever, especially now that competitors such as Anthropic Claude and Google Gemini are also calling the workflow that term.

#dev #guides #LLMs #prompts #GPT #companies/OpenAI #api #engineering #workflow #code #python

·minimaxir.com·Mar 14, 2024

Pushing ChatGPT's Structured Data Support To Its Limits

AI startups require new strategies

comment from Habitue on Hacker News: > These are some good points, but it doesn't seem to mention a big way in which startups disrupt incumbents, which is that they frame the problem a different way, and they don't need to protect existing revenue streams.

The “hard tech” in AI are the LLMs available for rent from OpenAI, Anthropic, Cohere, and others, or available as open source with Llama, Bloom, Mistral and others. The hard-tech is a level playing field; startups do not have an advantage over incumbents.

There can be differentiation in prompt engineering, problem break-down, use of vector databases, and more. However, this isn’t something where startups have an edge, such as being willing to take more risks or be more creative. At best, it is neutral; certainly not an advantage.

This doesn’t mean it’s impossible for a startup to succeed; surely many will. It means that you need a strategy that creates differentiation and distribution, even more quickly and dramatically than is normally required

Whether you’re training existing models, developing models from scratch, or simply testing theories, high-quality data is crucial. Incumbents have the data because they have the customers. They can immediately leverage customers’ data to train models and tune algorithms, so long as they maintain secrecy and privacy.

Intercom’s AI strategy is built on the foundation of hundreds of millions of customer interactions. This gives them an advantage over a newcomer developing a chatbot from scratch. Similarly, Google has an advantage in AI video because they own the entire YouTube library. GitHub has an advantage with Copilot because they trained their AI on their vast code repository (including changes, with human-written explanations of the changes).

While there will always be individuals preferring the startup environment, the allure of working on AI at an incumbent is equally strong for many, especially pure computer and data scientsts who, more than anything else, want to work on interesting AI projects. They get to work in the code, with a large budget, with all the data, with above-market compensation, and a built-in large customer base that will enjoy the fruits of their labor, all without having to do sales, marketing, tech support, accounting, raising money, or anything else that isn’t the pure joy of writing interesting code. This is heaven for many.

A chatbot is in the chatbot market, and an SEO tool is in the SEO market. Adding AI to those tools is obviously a good idea; indeed companies who fail to add AI will likely become irrelevant in the long run. Thus we see that “AI” is a new tool for developing within existing markets, not itself a new market (except for actual hard-tech AI companies).

AI is in the solution-space, not the problem-space, as we say in product management. The customer problem you’re solving is still the same as ever. The problem a chatbot is solving is the same as ever: Talk to customers 24/7 in any language. AI enables completely new solutions that none of us were imagining a few years ago; that’s what’s so exciting and truly transformative. However, the customer problems remain the same, even though the solutions are different

Companies will pay more for chatbots where the AI is excellent, more support contacts are deferred from reaching a human, more languages are supported, and more kinds of questions can be answered, so existing chatbot customers might pay more, which grows the market. Furthermore, some companies who previously (rightly) saw chatbots as a terrible customer experience, will change their mind with sufficiently good AI, and will enter the chatbot market, which again grows that market.

the right way to analyze this is not to say “the AI market is big and growing” but rather: “Here is how AI will transform this existing market.” And then: “Here’s how we fit into that growth.”

#business #startups #business/models #faang #tech #silicon valley #business/strategy #trends #ai #companies/OpenAI

·longform.asmartbear.com·Mar 3, 2024

AI startups require new strategies

Microsoft’s Use Of ‘AI’ In Journalism Has Been An Irresponsible Mess

#companies/microsoft #companies/OpenAI #news #journalism #ai #generative #trends

·techdirt.com·Feb 21, 2024

Microsoft’s Use Of ‘AI’ In Journalism Has Been An Irresponsible Mess

No AI Feature

iA Writer's vision for the use of AI in writing

#ai #trends #future #misinfo #writing #platforms #apps #companies/microsoft #companies/OpenAI

·ia.net·Feb 1, 2024

No AI Feature

OpenAI’s Misalignment and Microsoft’s Gain

#tech #economy #business/strategy #business/models #business/partnerships #companies/microsoft #companies/OpenAI

·stratechery.com·Feb 1, 2024

OpenAI’s Misalignment and Microsoft’s Gain

The OpenAI Keynote

what I cheered as an analyst was Altman’s clear articulation of the company’s priorities: lower price first, speed later. You can certainly debate whether that is the right set of priorities (I think it is, because the biggest need now is for increased experimentation, not optimization), but what I appreciated was the clarity.

The fact that Microsoft is benefiting from OpenAI is obvious; what this makes clear is that OpenAI uniquely benefits from Microsoft as well, in a way they would not from another cloud provider: because Microsoft is also a product company investing in the infrastructure to run OpenAI’s models for said products, it can afford to optimize and invest ahead of usage in a way that OpenAI alone, even with the support of another cloud provider, could not. In this case that is paying off in developers needing to pay less, or, ideally, have more latitude to discover use cases that result in them paying far more because usage is exploding.

You can, in effect, program a GPT, with language, just by talking to it. It’s easy to customize the behavior so that it fits what you want. This makes building them very accessible, and it gives agency to everyone.

Stephen Wolfram explained: For decades there’s been a dichotomy in thinking about AI between “statistical approaches” of the kind ChatGPT uses, and “symbolic approaches” that are in effect the starting point for Wolfram|Alpha. But now—thanks to the success of ChatGPT—as well as all the work we’ve done in making Wolfram|Alpha understand natural language—there’s finally the opportunity to combine these to make something much stronger than either could ever achieve on their own.

This new model somewhat alleviates the problem: now, instead of having to select the correct plug-in (and thus restart your chat), you simply go directly to the GPT in question. In other words, if I want to create a poster, I don’t enable the Canva plugin in ChatGPT, I go to Canva GPT in the sidebar. Notice that this doesn’t actually solve the problem of needing to have selected the right tool; what it does do is make the choice more apparent to the user at a more appropriate stage in the process, and that’s no small thing.

ChatGPT will seamlessly switch between text generation, image generation, and web browsing, without the user needing to change context. What is necessary for the plug-in/GPT idea to ultimately take root is for the same capabilities to be extended broadly: if my conversation involved math, ChatGPT should know to use Wolfram|Alpha on its own, without me adding the plug-in or going to a specialized GPT.

the obvious technical challenges of properly exposing capabilities and training the model to know when to invoke those capabilities are a textbook example of Professor Clayton Christensen’s theory of integration and modularity, wherein integration works better when a product isn’t good enough; it is only when a product exceeds expectation that there is room for standardization and modularity.

To summarize the argument, consumers care about things in ways that are inconsistent with whatever price you might attach to their utility, they prioritize ease-of-use, and they care about the quality of the user experience and are thus especially bothered by the seams inherent in a modular solution. This means that integrated solutions win because nothing is ever “good enough”

the fact of the matter is that a lot of people use ChatGPT for information despite the fact it has a well-documented flaw when it comes to the truth; that flaw is acceptable, because to the customer ease-of-use is worth the loss of accuracy. Or look at plug-ins: the concept as originally implemented has already been abandoned, because the complexity in the user interface was more detrimental than whatever utility might have been possible. It seems likely this pattern will continue: of course customers will say that they want accuracy and 3rd-party tools; their actions will continue to demonstrate that convenience and ease-of-use matter most.

#news #companies/OpenAI #llms #future #ai #ai/auto-ai #ux #consumer behavior #product strategy #companies/amazon #GPT

·stratechery.com·Nov 7, 2023

The OpenAI Keynote

AI is killing the old web, and the new web struggles to be born

Google is trying to kill the 10 blue links. Twitter is being abandoned to bots and blue ticks. There’s the junkification of Amazon and the enshittification of TikTok. Layoffs are gutting online media. A job posting looking for an “AI editor” expects “output of 200 to 250 articles per week.” ChatGPT is being used to generate whole spam sites. Etsy is flooded with “AI-generated junk.” Chatbots cite one another in a misinformation ouroboros. LinkedIn is using AI to stimulate tired users. Snapchat and Instagram hope bots will talk to you when your friends don’t. Redditors are staging blackouts. Stack Overflow mods are on strike. The Internet Archive is fighting off data scrapers, and “AI is tearing Wikipedia apart.”

it’s people who ultimately create the underlying data — whether that’s journalists picking up the phone and checking facts or Reddit users who have had exactly that battery issue with the new DeWalt cordless ratchet and are happy to tell you how they fixed it. By contrast, the information produced by AI language models and chatbots is often incorrect. The tricky thing is that when it’s wrong, it’s wrong in ways that are difficult to spot.

The resulting write-up is basic and predictable. (You can read it here.) It lists five companies, including Columbia, Salomon, and Merrell, along with bullet points that supposedly outline the pros and cons of their products. “Columbia is a well-known and reputable brand for outdoor gear and footwear,” we’re told. “Their waterproof shoes come in various styles” and “their prices are competitive in the market.” You might look at this and think it’s so trite as to be basically useless (and you’d be right), but the information is also subtly wrong.

It’s fluent but not grounded in real-world experience, and so it takes time and expertise to unpick.

#trends #internet #platforms #chatbot #ai #SEO spam #FAANG #companies/OpenAI

·theverge.com·Jun 28, 2023

AI is killing the old web, and the new web struggles to be born

AI Is Tearing Wikipedia Apart

While open access is a cornerstone of Wikipedia’s design principles, some worry the unrestricted scraping of internet data allows AI companies like OpenAI to exploit the open web to create closed commercial datasets for their models. This is especially a problem if the Wikipedia content itself is AI-generated, creating a feedback loop of potentially biased information, if left unchecked.

#ai #wikipedia #internet #legal #ethics #platforms #open source #govt #companies/OpenAI

·vice.com·May 17, 2023

AI Is Tearing Wikipedia Apart

Think of language models like ChatGPT as a “calculator for words”

This is reflected in their name: a “language model” implies that they are tools for working with language. That’s what they’ve been trained to do, and it’s language manipulation where they truly excel. Want them to work with specific facts? Paste those into the language model as part of your original prompt! There are so many applications of language models that fit into this calculator for words category: Summarization. Give them an essay and ask for a summary. Question answering: given these paragraphs of text, answer this specific question about the information they represent. Fact extraction: ask for bullet points showing the facts presented by an article. Rewrites: reword things to be more “punchy” or “professional” or “sassy” or “sardonic”—part of the fun here is using increasingly varied adjectives and seeing what happens. They’re very good with language after all! Suggesting titles—actually a form of summarization. World’s most effective thesaurus. “I need a word that hints at X”, “I’m very Y about this situation, what could I use for Y?”—that kind of thing. Fun, creative, wild stuff. Rewrite this in the voice of a 17th century pirate. What would a sentient cheesecake think of this? How would Alexander Hamilton rebut this argument? Turn this into a rap battle. Illustrate this business advice with an anecdote about sea otters running a kayak rental shop. Write the script for kickstarter fundraising video about this idea.

A flaw in this analogy: calculators are repeatable Andy Baio pointed out a flaw in this particular analogy: calculators always give you the same answer for a given input. Language models don’t—if you run the same prompt through a LLM several times you’ll get a slightly different reply every time.

#LLMs #ai #companies/OpenAI #definitions #GPT

·simonwillison.net·Apr 3, 2023

Think of language models like ChatGPT as a “calculator for words”

ChatPDF - Chat with any PDF!

#utility #research #pdf #LLMs #companies/OpenAI

·chatpdf.com·Mar 29, 2023

ChatPDF - Chat with any PDF!

ChatGPT sends shockwaves across college campuses

Across universities, professors have been looking into ways to engage students so cheating with ChatGPT is not as attractive, such as making assignments more personalized to students’ interests and requiring students to complete brainstorming assignments and essay drafts instead of just one final paper.

#trends #LLMs #learning #education #companies/OpenAI

·thehill.com·Mar 20, 2023

ChatGPT sends shockwaves across college campuses

AI and Image Generation (Everything is a Remix Part 4)

#AI #trends #video essay #tech #art #companies/OpenAI

·youtube.com·Mar 14, 2023

AI and Image Generation (Everything is a Remix Part 4)

ChatGPT Is a Blurry JPEG of the Web

This analogy to lossy compression is not just a way to understand ChatGPT’s facility at repackaging information found on the Web by using different words. It’s also a way to understand the “hallucinations,” or nonsensical answers to factual questions, to which large language models such as ChatGPT are all too prone

When an image program is displaying a photo and has to reconstruct a pixel that was lost during the compression process, it looks at the nearby pixels and calculates the average. This is what ChatGPT does when it’s prompted to describe, say, losing a sock in the dryer using the style of the Declaration of Independence: it is taking two points in “lexical space” and generating the text that would occupy the location between them

they’ve discovered a “blur” tool for paragraphs instead of photos, and are having a blast playing with it.

A close examination of GPT-3’s incorrect answers suggests that it doesn’t carry the “1” when performing arithmetic. The Web certainly contains explanations of carrying the “1,” but GPT-3 isn’t able to incorporate those explanations. GPT-3’s statistical analysis of examples of arithmetic enables it to produce a superficial approximation of the real thing, but no more than that.

In human students, rote memorization isn’t an indicator of genuine learning, so ChatGPT’s inability to produce exact quotes from Web pages is precisely what makes us think that it has learned something. When we’re dealing with sequences of words, lossy compression looks smarter than lossless compression

Generally speaking, though, I’d say that anything that’s good for content mills is not good for people searching for information. The rise of this type of repackaging is what makes it harder for us to find what we’re looking for online right now; the more that text generated by large language models gets published on the Web, the more the Web becomes a blurrier version of itself.

Can large language models help humans with the creation of original writing? To answer that, we need to be specific about what we mean by that question. There is a genre of art known as Xerox art, or photocopy art, in which artists use the distinctive properties of photocopiers as creative tools. Something along those lines is surely possible with the photocopier that is ChatGPT, so, in that sense, the answer is yes

If students never have to write essays that we have all read before, they will never gain the skills needed to write something that we have never read.

Sometimes it’s only in the process of writing that you discover your original ideas.

Some might say that the output of large language models doesn’t look all that different from a human writer’s first draft, but, again, I think this is a superficial resemblance. Your first draft isn’t an unoriginal idea expressed clearly; it’s an original idea expressed poorly, and it is accompanied by your amorphous dissatisfaction, your awareness of the distance between what it says and what you want it to say. That’s what directs you during rewriting, and that’s one of the things lacking when you start with text generated by an A.I.

#ai #compression #analogy #internet #writing #algorithms #SEO spam #LLMs #companies/OpenAI

·newyorker.com·Feb 11, 2023

ChatGPT Is a Blurry JPEG of the Web

Google vs. ChatGPT vs. Bing, Maybe — Pixel Envy

People are not interested in visiting websites about a topic; they, by and large, just want answers to their questions. Google has been strip-mining the web for years, leveraging its unique position as the world’s most popular website and its de facto directory to replace what made it great with what allows it to retain its dominance.

Artificial intelligence — or some simulation of it — really does make things better for searchers, and I bet it could reduce some tired search optimization tactics. But it comes at the cost of making us all into uncompensated producers for the benefit of trillion-dollar companies like Google and Microsoft.

Search optimization experts have spent years in an adversarial relationship with Google in an attempt to get their clients’ pages to the coveted first page of results, often through means which make results worse for searchers. Artificial intelligence is, it seems, a way out of this mess — but the compromise is that search engines get to take from everyone while giving nothing back. Google has been taking steps in this direction for years: its results page has been increasingly filled with ways of discouraging people from leaving its confines.

#ai #trends #search #internet #platforms #metrics v humanities #LLMs #companies/microsoft #companies/OpenAI #companies/google

·pxlnv.com·Feb 7, 2023

Google vs. ChatGPT vs. Bing, Maybe — Pixel Envy

The $2 Per Hour Workers Who Made ChatGPT Safer

The story of the workers who made ChatGPT possible offers a glimpse into the conditions in this little-known part of the AI industry, which nevertheless plays an essential role in the effort to make AI systems safe for public consumption. “Despite the foundational role played by these data enrichment professionals, a growing body of research reveals the precarious working conditions these workers face,” says the Partnership on AI, a coalition of AI organizations to which OpenAI belongs. “This may be the result of efforts to hide AI’s dependence on this large labor force when celebrating the efficiency gains of technology. Out of sight is also out of mind.”

This reminds me of [[On the Social Media Ideology - Journal 75 September 2016 - e-flux]]:<br>> Platforms are not stages; they bring together and synthesize (multimedia) data, yes, but what is lacking here is the (curatorial) element of human labor. That’s why there is no media in social media. The platforms operate because of their software, automated procedures, algorithms, and filters, not because of their large staff of editors and designers. Their lack of employees is what makes current debates in terms of racism, anti-Semitism, and jihadism so timely, as social media platforms are currently forced by politicians to employ editors who will have to do the all-too-human monitoring work (filtering out ancient ideologies that refuse to disappear).

Computer-generated text, images, video, and audio will transform the way countless industries do business, the most bullish investors believe, boosting efficiency everywhere from the creative arts, to law, to computer programming. But the working conditions of data labelers reveal a darker part of that picture: that for all its glamor, AI often relies on hidden human labor in the Global South that can often be damaging and exploitative. These invisible workers remain on the margins even as their work contributes to billion-dollar industries.

One Sama worker tasked with reading and labeling text for OpenAI told TIME he suffered from recurring visions after reading a graphic description of a man having sex with a dog in the presence of a young child. “That was torture,” he said. “You will read a number of statements like that all through the week. By the time it gets to Friday, you are disturbed from thinking through that picture.” The work’s traumatic nature eventually led Sama to cancel all its work for OpenAI in February 2022, eight months earlier than planned.

In the day-to-day work of data labeling in Kenya, sometimes edge cases would pop up that showed the difficulty of teaching a machine to understand nuance. One day in early March last year, a Sama employee was at work reading an explicit story about Batman’s sidekick, Robin, being raped in a villain’s lair. (An online search for the text reveals that it originated from an online erotica site, where it is accompanied by explicit sexual imagery.) The beginning of the story makes clear that the sex is nonconsensual. But later—after a graphically detailed description of penetration—Robin begins to reciprocate. The Sama employee tasked with labeling the text appeared confused by Robin’s ambiguous consent, and asked OpenAI researchers for clarification about how to label the text, according to documents seen by TIME. Should the passage be labeled as sexual violence, she asked, or not? OpenAI’s reply, if it ever came, is not logged in the document; the company declined to comment. The Sama employee did not respond to a request for an interview.

In February, according to one billing document reviewed by TIME, Sama delivered OpenAI a sample batch of 1,400 images. Some of those images were categorized as “C4”—OpenAI’s internal label denoting child sexual abuse—according to the document. Also included in the batch were “C3” images (including bestiality, rape, and sexual slavery,) and “V3” images depicting graphic detail of death, violence or serious physical injury, according to the billing document.

I haven't finished watching [[Severance]] yet but this labeling system reminds me of the way they have to process and filter data that is obfuscated as meaningless numbers. In the show, employees have to "sense" whether the numbers are "bad," which they can, somehow, and sort it into the trash bin.

But the need for humans to label data for AI systems remains, at least for now. “They’re impressive, but ChatGPT and other generative models are not magic – they rely on massive supply chains of human labor and scraped data, much of which is unattributed and used without consent,” Andrew Strait, an AI ethicist, recently wrote on Twitter. “These are serious, foundational problems that I do not see OpenAI addressing.”

#business #tech #ai #ethics #identities #companies/OpenAI #companies/meta

·time.com·Feb 6, 2023

The $2 Per Hour Workers Who Made ChatGPT Safer