important:1 #companies/OpenAI

In the past three days, I've reviewed over 100 essays from the 2024-2025 college admissions cycle. Here's how I could tell which ones were written by ChatGPT : r/ApplyingToCollege

An experienced college essay reviewer identifies seven distinct patterns that reveal ChatGPT's writing "fingerprint" in admission essays, demonstrating how AI-generated content, despite being well-written, often lacks originality and follows predictable patterns that make it detectable to experienced readers.

Seven key indicators of ChatGPT-written essays:

Specific vocabulary choices (e.g., "delve," "tapestry")
Limited types of extended metaphors (weaving, cooking, painting, dance, classical music)
Distinctive punctuation patterns (em dashes, mixed apostrophe styles)
Frequent use of tricolons (three-part phrases), especially ascending ones
Common phrase pattern: "I learned that the true meaning of X is not only Y, it's also Z"
Predictable future-looking conclusions: "As I progress... I will carry..."
Multiple ending syndrome (similar to Lord of the Rings movies)

#GPT #LLMs #ai #writing #reference #tech #companies/OpenAI

·reddit.com·Dec 12, 2024

In the past three days, I've reviewed over 100 essays from the 2024-2025 college admissions cycle. Here's how I could tell which ones were written by ChatGPT : r/ApplyingToCollege

The AI trust crisis

The AI trust crisis 14th December 2023 Dropbox added some new AI features. In the past couple of days these have attracted a firestorm of criticism. Benj Edwards rounds it up in Dropbox spooks users with new AI features that send data to OpenAI when used. The key issue here is that people are worried that their private files on Dropbox are being passed to OpenAI to use as training data for their models—a claim that is strenuously denied by Dropbox. As far as I can tell, Dropbox built some sensible features—summarize on demand, “chat with your data” via Retrieval Augmented Generation—and did a moderately OK job of communicating how they work... but when it comes to data privacy and AI, a “moderately OK job” is a failing grade. Especially if you hold as much of people’s private data as Dropbox does! Two details in particular seem really important. Dropbox have an AI principles document which includes this: Customer trust and the privacy of their data are our foundation. We will not use customer data to train AI models without consent. They also have a checkbox in their settings that looks like this: Update: Some time between me publishing this article and four hours later, that link stopped working. I took that screenshot on my own account. It’s toggled “on”—but I never turned it on myself. Does that mean I’m marked as “consenting” to having my data used to train AI models? I don’t think so: I think this is a combination of confusing wording and the eternal vagueness of what the term “consent” means in a world where everyone agrees to the terms and conditions of everything without reading them. But a LOT of people have come to the conclusion that this means their private data—which they pay Dropbox to protect—is now being funneled into the OpenAI training abyss. People don’t believe OpenAI # Here’s copy from that Dropbox preference box, talking about their “third-party partners”—in this case OpenAI: Your data is never used to train their internal models, and is deleted from third-party servers within 30 days. It’s increasing clear to me like people simply don’t believe OpenAI when they’re told that data won’t be used for training. What’s really going on here is something deeper then: AI is facing a crisis of trust. I quipped on Twitter: “OpenAI are training on every piece of data they see, even when they say they aren’t” is the new “Facebook are showing you ads based on overhearing everything you say through your phone’s microphone” Here’s what I meant by that. Facebook don’t spy on you through your microphone # Have you heard the one about Facebook spying on you through your phone’s microphone and showing you ads based on what you’re talking about? This theory has been floating around for years. From a technical perspective it should be easy to disprove: Mobile phone operating systems don’t allow apps to invisibly access the microphone. Privacy researchers can audit communications between devices and Facebook to confirm if this is happening. Running high quality voice recognition like this at scale is extremely expensive—I had a conversation with a friend who works on server-based machine learning at Apple a few years ago who found the entire idea laughable. The non-technical reasons are even stronger: Facebook say they aren’t doing this. The risk to their reputation if they are caught in a lie is astronomical. As with many conspiracy theories, too many people would have to be “in the loop” and not blow the whistle. Facebook don’t need to do this: there are much, much cheaper and more effective ways to target ads at you than spying through your microphone. These methods have been working incredibly well for years. Facebook gets to show us thousands of ads a year. 99% of those don’t correlate in the slightest to anything we have said out loud. If you keep rolling the dice long enough, eventually a coincidence will strike. Here’s the thing though: none of these arguments matter. If you’ve ever experienced Facebook showing you an ad for something that you were talking about out-loud about moments earlier, you’ve already dismissed everything I just said. You have personally experienced anecdotal evidence which overrides all of my arguments here.

One consistent theme I’ve seen in conversations about this issue is that people are much more comfortable trusting their data to local models that run on their own devices than models hosted in the cloud. The good news is that local models are consistently both increasing in quality and shrinking in size.

#misinfo #conspiracies #privacy #tech #companies/OpenAI #data #ethics #discourse/counterclaims #advertising

·simonwillison.net·Oct 10, 2024

The AI trust crisis

Looking for AI use-cases — Benedict Evans

LLMs have impressive capabilities, but many people struggle to find immediate use-cases that match their own needs and workflows.
Realizing the potential of LLMs requires not just technical advancements, but also identifying specific problems that can be automated and building dedicated applications around them.
The adoption of new technologies often follows a pattern of initially trying to fit them into existing workflows, before eventually changing workflows to better leverage the new tools.

if you had showed VisiCalc to a lawyer or a graphic designer, their response might well have been ‘that’s amazing, and maybe my book-keeper should see this, but I don’t do that’. Lawyers needed a word processor, and graphic designers needed (say) Postscript, Pagemaker and Photoshop, and that took longer.

I’ve been thinking about this problem a lot in the last 18 months, as I’ve experimented with ChatGPT, Gemini, Claude and all the other chatbots that have sprouted up: ‘this is amazing, but I don’t have that use-case’.

A spreadsheet can’t do word processing or graphic design, and a PC can do all of those but someone needs to write those applications for you first, one use-case at a time.

no matter how good the tech is, you have to think of the use-case. You have to see it. You have to notice something you spend a lot of time doing and realise that it could be automated with a tool like this.

Some of this is about imagination, and familiarity. It reminds me a little of the early days of Google, when we were so used to hand-crafting our solutions to problems that it took time to realise that you could ‘just Google that’.

This is also, perhaps, matching a classic pattern for the adoption of new technology: you start by making it fit the things you already do, where it’s easy and obvious to see that this is a use-case, if you have one, and then later, over time, you change the way you work to fit the new tool.

The concept of product-market fit is that normally you have to iterate your idea of the product and your idea of the use-case and customer towards each other - and then you need sales.

Meanwhile, spreadsheets were both a use-case for a PC and a general-purpose substrate in their own right, just as email or SQL might be, and yet all of those have been unbundled. The typical big company today uses hundreds of different SaaS apps, all them, so to speak, unbundling something out of Excel, Oracle or Outlook. All of them, at their core, are an idea for a problem and an idea for a workflow to solve that problem, that is easier to grasp and deploy than saying ‘you could do that in Excel!’ Rather, you instantiate the problem and the solution in software - ‘wrap it’, indeed - and sell that to a CIO. You sell them a problem.

there’s a ‘Cambrian Explosion’ of startups using OpenAI or Anthropic APIs to build single-purpose dedicated apps that aim at one problem and wrap it in hand-built UI, tooling and enterprise sales, much as a previous generation did with SQL.

Back in 1982, my father had one (1) electric drill, but since then tool companies have turned that into a whole constellation of battery-powered electric hole-makers. One upon a time every startup had SQL inside, but that wasn’t the product, and now every startup will have LLMs inside.

people are still creating companies based on realising that X or Y is a problem, realising that it can be turned into pattern recognition, and then going out and selling that problem.

A GUI tells the users what they can do, but it also tells the computer everything we already know about the problem, and with a general-purpose, open-ended prompt, the user has to think of all of that themselves, every single time, or hope it’s already in the training data. So, can the GUI itself be generative? Or do we need another whole generation of Dan Bricklins to see the problem, and then turn it into apps, thousands of them, one at a time, each of them with some LLM somewhere under the hood?

The change would be that these new use-cases would be things that are still automated one-at-a-time, but that could not have been automated before, or that would have needed far more software (and capital) to automate. That would make LLMs the new SQL, not the new HAL9000.

#LLMs #trends #tech #workflows #startups #history #sql #automation #ai/auto-ai #mental models #ui #companies/OpenAI #GPT #ai

·ben-evans.com·Apr 25, 2024

Looking for AI use-cases — Benedict Evans

AI startups require new strategies

comment from Habitue on Hacker News: > These are some good points, but it doesn't seem to mention a big way in which startups disrupt incumbents, which is that they frame the problem a different way, and they don't need to protect existing revenue streams.

The “hard tech” in AI are the LLMs available for rent from OpenAI, Anthropic, Cohere, and others, or available as open source with Llama, Bloom, Mistral and others. The hard-tech is a level playing field; startups do not have an advantage over incumbents.

There can be differentiation in prompt engineering, problem break-down, use of vector databases, and more. However, this isn’t something where startups have an edge, such as being willing to take more risks or be more creative. At best, it is neutral; certainly not an advantage.

This doesn’t mean it’s impossible for a startup to succeed; surely many will. It means that you need a strategy that creates differentiation and distribution, even more quickly and dramatically than is normally required

Whether you’re training existing models, developing models from scratch, or simply testing theories, high-quality data is crucial. Incumbents have the data because they have the customers. They can immediately leverage customers’ data to train models and tune algorithms, so long as they maintain secrecy and privacy.

Intercom’s AI strategy is built on the foundation of hundreds of millions of customer interactions. This gives them an advantage over a newcomer developing a chatbot from scratch. Similarly, Google has an advantage in AI video because they own the entire YouTube library. GitHub has an advantage with Copilot because they trained their AI on their vast code repository (including changes, with human-written explanations of the changes).

While there will always be individuals preferring the startup environment, the allure of working on AI at an incumbent is equally strong for many, especially pure computer and data scientsts who, more than anything else, want to work on interesting AI projects. They get to work in the code, with a large budget, with all the data, with above-market compensation, and a built-in large customer base that will enjoy the fruits of their labor, all without having to do sales, marketing, tech support, accounting, raising money, or anything else that isn’t the pure joy of writing interesting code. This is heaven for many.

A chatbot is in the chatbot market, and an SEO tool is in the SEO market. Adding AI to those tools is obviously a good idea; indeed companies who fail to add AI will likely become irrelevant in the long run. Thus we see that “AI” is a new tool for developing within existing markets, not itself a new market (except for actual hard-tech AI companies).

AI is in the solution-space, not the problem-space, as we say in product management. The customer problem you’re solving is still the same as ever. The problem a chatbot is solving is the same as ever: Talk to customers 24/7 in any language. AI enables completely new solutions that none of us were imagining a few years ago; that’s what’s so exciting and truly transformative. However, the customer problems remain the same, even though the solutions are different

Companies will pay more for chatbots where the AI is excellent, more support contacts are deferred from reaching a human, more languages are supported, and more kinds of questions can be answered, so existing chatbot customers might pay more, which grows the market. Furthermore, some companies who previously (rightly) saw chatbots as a terrible customer experience, will change their mind with sufficiently good AI, and will enter the chatbot market, which again grows that market.

the right way to analyze this is not to say “the AI market is big and growing” but rather: “Here is how AI will transform this existing market.” And then: “Here’s how we fit into that growth.”

#business #startups #business/models #faang #tech #silicon valley #business/strategy #trends #ai #companies/OpenAI

·longform.asmartbear.com·Mar 3, 2024

AI startups require new strategies

No AI Feature

iA Writer's vision for the use of AI in writing

#ai #trends #future #misinfo #writing #platforms #apps #companies/microsoft #companies/OpenAI

·ia.net·Feb 1, 2024

No AI Feature

AI is killing the old web, and the new web struggles to be born

Google is trying to kill the 10 blue links. Twitter is being abandoned to bots and blue ticks. There’s the junkification of Amazon and the enshittification of TikTok. Layoffs are gutting online media. A job posting looking for an “AI editor” expects “output of 200 to 250 articles per week.” ChatGPT is being used to generate whole spam sites. Etsy is flooded with “AI-generated junk.” Chatbots cite one another in a misinformation ouroboros. LinkedIn is using AI to stimulate tired users. Snapchat and Instagram hope bots will talk to you when your friends don’t. Redditors are staging blackouts. Stack Overflow mods are on strike. The Internet Archive is fighting off data scrapers, and “AI is tearing Wikipedia apart.”

it’s people who ultimately create the underlying data — whether that’s journalists picking up the phone and checking facts or Reddit users who have had exactly that battery issue with the new DeWalt cordless ratchet and are happy to tell you how they fixed it. By contrast, the information produced by AI language models and chatbots is often incorrect. The tricky thing is that when it’s wrong, it’s wrong in ways that are difficult to spot.

The resulting write-up is basic and predictable. (You can read it here.) It lists five companies, including Columbia, Salomon, and Merrell, along with bullet points that supposedly outline the pros and cons of their products. “Columbia is a well-known and reputable brand for outdoor gear and footwear,” we’re told. “Their waterproof shoes come in various styles” and “their prices are competitive in the market.” You might look at this and think it’s so trite as to be basically useless (and you’d be right), but the information is also subtly wrong.

It’s fluent but not grounded in real-world experience, and so it takes time and expertise to unpick.

#trends #internet #platforms #chatbot #ai #SEO spam #FAANG #companies/OpenAI

·theverge.com·Jun 28, 2023

AI is killing the old web, and the new web struggles to be born

AI Is Tearing Wikipedia Apart

While open access is a cornerstone of Wikipedia’s design principles, some worry the unrestricted scraping of internet data allows AI companies like OpenAI to exploit the open web to create closed commercial datasets for their models. This is especially a problem if the Wikipedia content itself is AI-generated, creating a feedback loop of potentially biased information, if left unchecked.

#ai #wikipedia #internet #legal #ethics #platforms #open source #govt #companies/OpenAI

·vice.com·May 17, 2023

AI Is Tearing Wikipedia Apart

Think of language models like ChatGPT as a “calculator for words”

This is reflected in their name: a “language model” implies that they are tools for working with language. That’s what they’ve been trained to do, and it’s language manipulation where they truly excel. Want them to work with specific facts? Paste those into the language model as part of your original prompt! There are so many applications of language models that fit into this calculator for words category: Summarization. Give them an essay and ask for a summary. Question answering: given these paragraphs of text, answer this specific question about the information they represent. Fact extraction: ask for bullet points showing the facts presented by an article. Rewrites: reword things to be more “punchy” or “professional” or “sassy” or “sardonic”—part of the fun here is using increasingly varied adjectives and seeing what happens. They’re very good with language after all! Suggesting titles—actually a form of summarization. World’s most effective thesaurus. “I need a word that hints at X”, “I’m very Y about this situation, what could I use for Y?”—that kind of thing. Fun, creative, wild stuff. Rewrite this in the voice of a 17th century pirate. What would a sentient cheesecake think of this? How would Alexander Hamilton rebut this argument? Turn this into a rap battle. Illustrate this business advice with an anecdote about sea otters running a kayak rental shop. Write the script for kickstarter fundraising video about this idea.

A flaw in this analogy: calculators are repeatable Andy Baio pointed out a flaw in this particular analogy: calculators always give you the same answer for a given input. Language models don’t—if you run the same prompt through a LLM several times you’ll get a slightly different reply every time.

#LLMs #ai #companies/OpenAI #definitions #GPT

·simonwillison.net·Apr 3, 2023

Think of language models like ChatGPT as a “calculator for words”

ChatGPT Is a Blurry JPEG of the Web

This analogy to lossy compression is not just a way to understand ChatGPT’s facility at repackaging information found on the Web by using different words. It’s also a way to understand the “hallucinations,” or nonsensical answers to factual questions, to which large language models such as ChatGPT are all too prone

When an image program is displaying a photo and has to reconstruct a pixel that was lost during the compression process, it looks at the nearby pixels and calculates the average. This is what ChatGPT does when it’s prompted to describe, say, losing a sock in the dryer using the style of the Declaration of Independence: it is taking two points in “lexical space” and generating the text that would occupy the location between them

they’ve discovered a “blur” tool for paragraphs instead of photos, and are having a blast playing with it.

A close examination of GPT-3’s incorrect answers suggests that it doesn’t carry the “1” when performing arithmetic. The Web certainly contains explanations of carrying the “1,” but GPT-3 isn’t able to incorporate those explanations. GPT-3’s statistical analysis of examples of arithmetic enables it to produce a superficial approximation of the real thing, but no more than that.

In human students, rote memorization isn’t an indicator of genuine learning, so ChatGPT’s inability to produce exact quotes from Web pages is precisely what makes us think that it has learned something. When we’re dealing with sequences of words, lossy compression looks smarter than lossless compression

Generally speaking, though, I’d say that anything that’s good for content mills is not good for people searching for information. The rise of this type of repackaging is what makes it harder for us to find what we’re looking for online right now; the more that text generated by large language models gets published on the Web, the more the Web becomes a blurrier version of itself.

Can large language models help humans with the creation of original writing? To answer that, we need to be specific about what we mean by that question. There is a genre of art known as Xerox art, or photocopy art, in which artists use the distinctive properties of photocopiers as creative tools. Something along those lines is surely possible with the photocopier that is ChatGPT, so, in that sense, the answer is yes

If students never have to write essays that we have all read before, they will never gain the skills needed to write something that we have never read.

Sometimes it’s only in the process of writing that you discover your original ideas.

Some might say that the output of large language models doesn’t look all that different from a human writer’s first draft, but, again, I think this is a superficial resemblance. Your first draft isn’t an unoriginal idea expressed clearly; it’s an original idea expressed poorly, and it is accompanied by your amorphous dissatisfaction, your awareness of the distance between what it says and what you want it to say. That’s what directs you during rewriting, and that’s one of the things lacking when you start with text generated by an A.I.

#ai #compression #analogy #internet #writing #algorithms #SEO spam #LLMs #companies/OpenAI

·newyorker.com·Feb 11, 2023

ChatGPT Is a Blurry JPEG of the Web

AI-generated code helps me learn and makes experimenting faster

here are five large language model applications that I find intriguing: Intelligent automation starting with browsers but this feels like a step towards phenotropics Text generation when this unlocks new UIs like Word turning into Photoshop or something Human-machine interfaces because you can parse intent instead of nouns When meaning can be interfaced with programmatically and at ludicrous scale Anything that exploits the inhuman breadth of knowledge embedded in the model, because new knowledge is often the collision of previously separated old knowledge, and this has not been possible before.

#trends #ml #ai #tech #LLMs #companies/OpenAI

·interconnected.org·Jan 31, 2023

AI-generated code helps me learn and makes experimenting faster

G3nerative

Web3 has largely been technology looking for problems to solve while generative AI has been about almost too many solutions created by technology which is evolving on a seemingly daily basis. As a result, web3 has thus far been evangelists trying to convince us to re-solve old problems with their new technology

#tech #trends #LLMs #companies/OpenAI

·500ish.com·Dec 11, 2022

G3nerative

Discuss HN: Software Careers Post ChatGPT+ | Hacker News

ChatGPT feels like the current aim assist debates in a lot of FPSses to me. It'll make you better at the shooting part of the game, perfect even. But, won't necessarily make you that much of a better player, because aiming is only one aspect of what makes someone good at FPSes. However, if someone is generally good enough or very good at the "not aiming" portion of the games, then having aim assist would drastically increase their overall skill.

#discourse #trends #dev #history #career #LLMs #companies/OpenAI

·news.ycombinator.com·Dec 5, 2022

Discuss HN: Software Careers Post ChatGPT+ | Hacker News

Metaphor Search

#utility #search #ai #web3 #generator #research #companies/OpenAI

·metaphor.systems·Nov 22, 2022

Metaphor Search

Stable Diffusion is a really big deal

#generative #art #ai #companies/OpenAI #rendering #stable diffusion

·simonwillison.net·Sep 11, 2022

Stable Diffusion is a really big deal