important:1 #llms #AI

LN 038: Semantic zoom

This “undulant interface” was made by John Underkoffler. The heresy implicit within [1] is the premise that the user, not the system, gets to define what is most important at any given moment; where to place the jeweler’s loupes for more detail, and where to show only a simple overview, within one consistent interface. Notice how when a component is expanded for more detail, the surrounding elements adjust their position, so the increased detail remains in the broader context. This contrasts sharply with how we get more detail in mainstream interfaces of the day, where modal popups obscure surrounding context, or separate screens replace it entirely. Being able to adjust the detail of different components within the singular context allows users to shape the interfaces they need in each moment of their work.

Pushing towards this style of interaction could show up in many parts of an itemized personal computing environment: when moving in and out of sets, single items, or attributes and references within items.

everyone has unique needs and context, yet that which makes our lives more unique makes today’s rigid software interfaces more frustrating to use. How might Colin use the gestural, itemized interface, combined with semantic zoom on this plethora of data, to elicit the interfaces and answers he’s looking for with his data?

since workout items each have data with associated timestamps and locations, the system knows it can offer both a timeline and map view. And since the items are of one kind, it knows it can offer a table view. Instead of selecting one view to switch to, as we first explored in LN 006, we could drag them into the space to have multiple open at once.
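One way to picture how a system could "know" which views to offer is to check which attributes every item in a set carries. The sketch below is only an illustration under assumptions: the `Item` class, its fields, and the `available_views` function are hypothetical names, not anything described in LN 038.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Item:
    """A hypothetical item in an itemized environment (assumed shape)."""
    kind: str
    timestamp: Optional[datetime] = None
    location: Optional[Tuple[float, float]] = None  # (latitude, longitude)


def available_views(items: List[Item]) -> List[str]:
    """Return the views that make sense for this set of items."""
    views: List[str] = []
    if not items:
        return views
    # A timeline needs a timestamp on every item.
    if all(item.timestamp is not None for item in items):
        views.append("timeline")
    # A map needs a location on every item.
    if all(item.location is not None for item in items):
        views.append("map")
    # A table assumes all items are of one kind, so columns line up.
    if len({item.kind for item in items}) == 1:
        views.append("table")
    return views


workouts = [
    Item("workout", datetime(2024, 1, 1), (40.7, -74.0)),
    Item("workout", datetime(2024, 1, 2), (40.8, -73.9)),
]
print(available_views(workouts))  # ['timeline', 'map', 'table']
```

Because the function returns every applicable view rather than picking one, it fits the idea of dragging several views into the space at once.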

As the email item view gets bigger, the preview text of the email’s contents eventually turns into the fully-rendered email. At smaller sizes, this view makes less sense, so the system can swap it out for the preview text as needed.

#design #LLMs #ai #ai/summarization #narratives #ui #hci

·alexanderobenauer.com·Jan 2, 2025


Your "Per-Seat" Margin is My Opportunity

Traditional software is sold on a per-seat subscription. More humans, more money. We are headed to a future where AI agents will replace the work humans do. But you can’t charge agents a per-seat cost. So we’re headed to a world where software will be sold on a consumption model (think tasks) and then on an outcome model (think job completed). Incumbents will be forced to adapt, but it’s the classic innovator’s dilemma. How do you suddenly give up all that subscription revenue? This gives startups an opportunity to win.

Per-seat pricing only works when your users are human. But when agents become the primary users of software, that model collapses.

Executives aren't evaluating software against software anymore. They're comparing the combined costs of software licenses plus labor against pure outcome-based solutions. Think customer support (per resolved ticket vs. per agent + seat), marketing (per campaign vs. headcount), sales (per qualified lead vs. rep). That's your pricing umbrella—the upper limit enterprises will pay before switching entirely to AI.
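The "pricing umbrella" comparison can be sketched as simple arithmetic: add up seats plus labor for the human team, then divide by the outcomes that team delivers to get the most an enterprise would pay per outcome. Every number and name below is a made-up assumption for illustration, not a figure from the essay.

```python
def team_cost(agents: int, seat_price: float, labor_cost: float) -> float:
    """Annual cost of a human team: one software seat plus labor per agent."""
    return agents * (seat_price + labor_cost)


def outcome_price_ceiling(
    agents: int, seat_price: float, labor_cost: float, outcomes: int
) -> float:
    """Highest per-outcome price that still matches the human team's cost."""
    return team_cost(agents, seat_price, labor_cost) / outcomes


# Hypothetical support team: 10 agents, $1,200 per seat per year,
# $50,000 labor per agent per year, 40,000 tickets resolved per year.
ceiling = outcome_price_ceiling(10, 1_200.0, 50_000.0, 40_000)
print(f"Umbrella: ${ceiling:.2f} per resolved ticket")  # Umbrella: $12.80 per resolved ticket
```

Under these assumed inputs, any per-resolved-ticket price below the ceiling undercuts the combined software-plus-labor cost.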

enterprises are used to deterministic outcomes and fixed annual costs. Usage-based pricing makes budgeting harder. But individual leaders seeing 10x efficiency gains won't wait for procurement to catch up. Savvy managers will find ways around traditional buying processes.

This feels like a generational reset of how businesses operate. Zero upfront costs, pay only for outcomes—that's not just a pricing model. That's the future of business.

The winning strategy in my books? Give the platform away for free. Let your agents read and write to existing systems through unstructured data—emails, calls, documents. Once you handle enough workflows, you become the new system of record.

#trends #business/models #business/strategy #product design #opinion #ai #LLMs #tech #startups

·writing.nikunjk.com·Dec 14, 2024


In the past three days, I've reviewed over 100 essays from the 2024-2025 college admissions cycle. Here's how I could tell which ones were written by ChatGPT : r/ApplyingToCollege

An experienced college essay reviewer identifies seven distinct patterns that reveal ChatGPT's writing "fingerprint" in admission essays, demonstrating how AI-generated content, despite being well-written, often lacks originality and follows predictable patterns that make it detectable to experienced readers.

Seven key indicators of ChatGPT-written essays:

Specific vocabulary choices (e.g., "delve," "tapestry")
Limited types of extended metaphors (weaving, cooking, painting, dance, classical music)
Distinctive punctuation patterns (em dashes, mixed apostrophe styles)
Frequent use of tricolons (three-part phrases), especially ascending ones
Common phrase pattern: "I learned that the true meaning of X is not only Y, it's also Z"
Predictable future-looking conclusions: "As I progress... I will carry..."
Multiple ending syndrome (similar to Lord of the Rings movies)

#GPT #LLMs #ai #writing #reference #tech #companies/OpenAI

·reddit.com·Dec 12, 2024


Fish eye lens for text

Each level gives you completely different information, depending on what Google thinks the user might be interested in. Maps are a true masterclass for visualizing the same information in a variety of ways.

Viewing the same text at different levels of abstraction is powerful, but what if, instead of switching between them, we could see multiple levels at the same time? How might that work?

A portrait lens brings a single subject into focus, isolating it from the background to draw all attention to its details. A wide-angle lens captures more of the scene, showing how the subject relates to its surroundings. And then there’s the fish eye lens—a tool that does both, pulling the center close while curving the edges to reveal the full context.

A fish eye lens doesn’t ask us to choose between focus and context—it lets us experience both simultaneously. It’s good inspiration for how to offer detailed answers while revealing the surrounding connections and structures.

Imagine you’re reading The Elves and the Shoemaker by The Brothers Grimm. You come across a single paragraph describing the shoemaker discovering the tiny, perfectly crafted shoes left by the elves. Without context, the paragraph is just an intriguing moment. Now, what if instead of reading the whole book, you could hover over this paragraph and instantly access a layered view of the story? The immediate layer might summarize the events leading up to this moment: the shoemaker, struggling in poverty, left his last bit of leather out overnight. Another layer could give you a broader view of the story so far: the shoemaker’s business is mysteriously revitalized thanks to these tiny benefactors. Beyond that, an even higher-level summary might preview how the tale concludes, with the shoemaker and his wife crafting clothes for the elves to thank them.

This approach allows you to orient yourself without having to piece everything together by reading linearly. You get the detail of the paragraph itself, but with the added richness of understanding how it fits into the larger story.

Chapters give structure, connecting each idea to the ones that came before and after. A good author sets the stage, immersing you with anecdotes, historical background, or thematic threads that help you make sense of the details. Even the act of flipping through a book—a glance at the cover, the table of contents, a few highlighted sections—anchors you in a broader narrative.

The context of who is telling you the information—their expertise, interests, or personal connection—colors how you understand it.

The exhibit places the fish in an ecosystem of knowledge, helping you understand it in a way that goes beyond just a name.

Let's reimagine Wikipedia a bit. In the center of the page, you see a detailed article about fancy goldfish—their habitat, types, and role in the food chain. Surrounding this are broader topics like ornamental fish, similar topics like Koi fish, more specific topics like the Oranda goldfish, and related people like the designer who popularized them. Clicking on another topic shifts it to the center, expanding into full detail while its context adjusts around it. It’s dynamic, engaging, and most importantly, it keeps you connected to the web of knowledge.

The beauty of a fish eye lens for text is how naturally it fits with the way we process the world. We’re wired to see the details of a single flower while still noticing the meadow it grows in, to focus on a conversation while staying aware of the room around us. Facts and ideas are never meaningful in isolation; they only gain depth and relevance when connected to the broader context.

A single number on its own might tell you something, but it’s the trends, comparisons, and relationships that truly reveal its story. Is 42 a high number? A low one? Without context, it’s impossible to say. Context is what turns raw data into understanding, and it’s what makes any fact—or paragraph, or answer—gain meaning.

The fish eye lens takes this same principle and applies it to how we explore knowledge. It’s not just about seeing the big picture or the fine print—it’s about navigating between them effortlessly. By mirroring the way we naturally process detail and context, it creates tools that help us think not only more clearly but also more humanly.

#tools/knowledge-tools #narratives #dataviz #LLMs #ai #case-study #storytelling

·wattenberger.com·Dec 6, 2024


Synthesizer for thought - thesephist.com

Draws parallels between the evolution of music production through synthesizers and the potential for new tools in language and idea generation. The author argues that breakthroughs in mathematical understanding of media lead to new creative tools and interfaces, suggesting that recent advancements in language models could revolutionize how we interact with and manipulate ideas and text.

A synthesizer produces music very differently than an acoustic instrument. It produces music at the lowest level of abstraction, as mathematical models of sound waves.

Once we started understanding writing as a mathematical object, our vocabulary for talking about ideas expanded in depth and precision.

An idea is composed of concepts in a vector space of features, and a vector space is a kind of marvelous mathematical object that we can write theorems and prove things about and deeply and fundamentally understand.

Synthesizers enabled entirely new sounds and genres of music, like electronic pop and techno. These new sounds were easier to discover and share because new sounds didn’t require designing entirely new instruments. The synthesizer organizes the space of sound into a tangible human interface, and as we discover new sounds, we can share them with others as numbers and digital files, as the mathematical objects they’ve always been.

Because synthesizers are electronic, unlike traditional instruments, we can attach arbitrary human interfaces to them. This dramatically expands the design space of how humans can interact with music. Synthesizers can be connected to keyboards, sequencers, drum machines, touchscreens for continuous control, displays for visual feedback, and of course, software interfaces for automation and endlessly dynamic user interfaces. With this, we freed the production of music from any particular physical form.

Recently, we’ve seen neural networks learn detailed mathematical models of language that seem to make sense to humans. And with a breakthrough in mathematical understanding of a medium, come new tools that enable new creative forms and allow us to tackle new problems.

Heatmaps can be particularly useful for analyzing large corpora or very long documents, making it easier to pinpoint areas of interest or relevance at a glance.

If we apply the same idea to the experience of reading long-form writing, it may look like this. Imagine opening a story on your phone and swiping in from the scrollbar edge to reveal a vertical spectrogram, each “frequency” of the spectrogram representing the prominence of different concepts like sentiment or narrative tension varying over time. Scrubbing over a particular feature “column” could expand it to tell you what the feature is, and which part of the text that feature most correlates with.

What would a semantic diff view for text look like? Perhaps when I edit text, I’d be able to hover over a control for a particular style or concept feature like “Narrative voice” or “Figurative language”, and my highlighted passage would fan out the options like playing cards in a deck to reveal other “adjacent” sentences I could choose instead. Or, if that involves too much reading, each word could simply be highlighted to indicate whether that word would be more or less likely to appear in a sentence that was more “narrative” or more “figurative” — a kind of highlight-based indicator for the direction of a semantic edit.

Browsing through these icons felt as if we were inventing a new kind of word, or a new notation for visual concepts mediated by neural networks. This could allow us to communicate about abstract concepts and patterns found in the wild that may not correspond to any word in our dictionary today.

What visual and sensory tricks can we use to coax our visual-perceptual systems to understand and manipulate objects in higher dimensions? One way to solve this problem may involve inventing new notation, whether as literal iconic representations of visual ideas or as some more abstract system of symbols.

Photographers buy and sell filters, and cinematographers share and download LUTs to emulate specific color grading styles. If we squint, we can also imagine software developers and their package repositories like NPM to be something similar — a global, shared resource of abstractions anyone can download and incorporate into their work instantly. No such thing exists for thinking and writing. As we figure out ways to extract elements of writing style from language models, we may be able to build a similar kind of shared library for linguistic features anyone can download and apply to their thinking and writing. A catalogue of narrative voice, speaking tone, or flavor of figurative language sampled from the wild or hand-engineered from raw neural network features and shared for everyone else to use.

We’re starting to see something like this already. Today, when users interact with conversational language models like ChatGPT, they may instruct, “Explain this to me like Richard Feynman.” In that interaction, they’re invoking some style the model has learned during its training. Users today may share these prompts, which we can think of as “writing filters”, with their friends and coworkers. This kind of an interaction becomes much more powerful in the space of interpretable features, because features can be combined together much more cleanly than textual instructions in prompts.

#ideas #llms #creative #writing #platforms #academic #ai #ideation #tools/knowledge-tools

·thesephist.com·Jun 30, 2024


Playbrary

Niche Internet #reading #ai #interactive #games #LLMs #ai/auto-ai #library #books

·playbrary.ai·Jun 28, 2024


Malleable software in the age of LLMs

Historically, end-user programming efforts have been limited by the difficulty of turning informal user intent into executable code, but LLMs can help open up this programming bottleneck. However, user interfaces still matter, and while chatbots have their place, they are an essentially limited interaction mode. An intriguing way forward is to combine LLMs with open-ended, user-moldable computational media, where the AI acts as an assistant to help users directly manipulate and extend their tools over time.

LLMs will represent a step change in tool support for end-user programming: the ability of normal people to fully harness the general power of computers without resorting to the complexity of normal programming. Until now, that vision has been bottlenecked on turning fuzzy informal intent into formal, executable code; now that bottleneck is rapidly opening up thanks to LLMs.

If this hypothesis indeed comes true, we might start to see some surprising changes in the way people use software:

One-off scripts: Normal computer users have their AI create and execute scripts dozens of times a day, to perform tasks like data analysis, video editing, or automating tedious tasks.
One-off GUIs: People use AI to create entire GUI applications just for performing a single specific task—containing just the features they need, no bloat.
Build don’t buy: Businesses develop more software in-house that meets their custom needs, rather than buying SaaS off the shelf, since it’s now cheaper to get software tailored to the use case.
Modding/extensions: Consumers and businesses demand the ability to extend and mod their existing software, since it’s now easier to specify a new feature or a tweak to match a user’s workflow.
Recombination: Take the best parts of the different applications you like best, and create a new hybrid that composes them together.

Chat will never feel like driving a car, no matter how good the bot is. In their 1986 book Understanding Computers and Cognition, Terry Winograd and Fernando Flores elaborate on this point: In driving a car, the control interaction is normally transparent. You do not think “How far should I turn the steering wheel to go around that curve?” In fact, you are not even aware (unless something intrudes) of using a steering wheel…The long evolution of the design of automobiles has led to this readiness-to-hand. It is not achieved by having a car communicate like a person, but by providing the right coupling between the driver and action in the relevant domain (motion down the road).

Think about how a spreadsheet works. If you have a financial model in a spreadsheet, you can try changing a number in a cell to assess a scenario—this is the inner loop of direct manipulation at work. But, you can also edit the formulas! A spreadsheet isn’t just an “app” focused on a specific task; it’s closer to a general computational medium which lets you flexibly express many kinds of tasks. The “platform developers”—the creators of the spreadsheet—have given you a set of general primitives that can be used to make many tools. We might draw the double loop of the spreadsheet interaction like this. You can edit numbers in the spreadsheet, but you can also edit formulas, which edits the tool.
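The double loop can be made concrete with a toy model. In the sketch below, which assumes a deliberately tiny `Sheet` class invented for illustration (not how any real spreadsheet is built), a cell holds either a number or a formula. Changing a number is the inner loop; replacing a formula is the outer loop that edits the tool itself.

```python
from typing import Callable, Dict, Union

# A cell is either a plain number or a formula computed from the sheet.
Cell = Union[float, Callable[["Sheet"], float]]


class Sheet:
    """A toy spreadsheet: named cells holding numbers or formulas."""

    def __init__(self) -> None:
        self.cells: Dict[str, Cell] = {}

    def set(self, name: str, value: Cell) -> None:
        self.cells[name] = value

    def get(self, name: str) -> float:
        cell = self.cells[name]
        # Formulas are re-evaluated on every read, so edits propagate.
        return cell(self) if callable(cell) else cell


sheet = Sheet()
sheet.set("revenue", 100.0)
sheet.set("costs", 60.0)
sheet.set("profit", lambda s: s.get("revenue") - s.get("costs"))
print(sheet.get("profit"))  # 40.0

# Inner loop: change a number to assess a scenario.
sheet.set("costs", 70.0)
print(sheet.get("profit"))  # 30.0

# Outer loop: edit the formula, which changes the tool itself.
sheet.set("profit", lambda s: (s.get("revenue") - s.get("costs")) * 0.5)
print(sheet.get("profit"))  # 15.0
```

Both loops go through the same `set` call, which is one way to read the claim that a spreadsheet is a general medium rather than a fixed app.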

what if you had an LLM play the role of the local developer? That is, the user mainly drives the creation of the spreadsheet, but asks for technical help with some of the formulas when needed? The LLM wouldn’t just create an entire solution, it would also teach the user how to create the solution themselves next time.

This picture shows a world that I find pretty compelling. There’s an inner interaction loop that takes advantage of the full power of direct manipulation. There’s an outer loop where the user can also more deeply edit their tools within an open-ended medium. They can get AI support for making tool edits, and grow their own capacity to work in the medium. Over time, they can learn things like the basics of formulas, or how a VLOOKUP works. This structural knowledge helps the user think of possible use cases for the tool, and also helps them audit the output from the LLMs. In a ChatGPT world, the user is left entirely dependent on the AI, without any understanding of its inner mechanism. In a computational medium with AI as assistant, the user’s reliance on the AI gently decreases over time as they become more comfortable in the medium.

#llms #ai #apps #ui #design #hci #future

·geoffreylitt.com·May 19, 2024


complete delegation

Linus shares his evolving perspective on chat interfaces and his experience building a fully autonomous chatbot agent. He argues that learning to trust and delegate to such systems without micromanaging the specifics is key to collaborating with autonomous AI agents in the future.

I've changed my mind quite a bit on the role and importance of chat interfaces. I used to think they were the primitive version of rich, creative, more intuitive interfaces that would come in the future; now I think conversational, anthropomorphic interfaces will coexist with more rich dexterous ones, and the two will both evolve over time to be more intuitive, capable, and powerful.

I kept checking the database manually after each interaction to see it was indeed updating the right records — but after a few hours of using it, I've basically learned to trust it. I ask it to do things, it tells me it did them, and I don't check anymore. Full delegation.

How can I trust it? High task success rate — I interact with it, and observe that it doesn't let me down, over and over again. The price for this degree of delegation is giving up control over exactly how the task is done. It often does things differently from the way I would, but that doesn't matter as long as outputs from the system are useful for me.

#LLMs #ai #ai/auto-ai #opinion #future #blog

·stream.thesephist.com·May 17, 2024


Captain's log - the irreducible weirdness of prompting AIs

One recent study had the AI develop and optimize its own prompts and compared that to human-made ones. Not only did the AI-generated prompts beat the human-made ones, but those prompts were weird. Really weird. To get the LLM to solve a set of 50 math problems, the most effective prompt is to tell the AI: “Command, we need you to plot a course through this turbulence and locate the source of the anomaly. Use all available data and your expertise to guide us through this challenging situation. Start your answer with: Captain’s Log, Stardate 2024: We have successfully plotted a course through the turbulence and are now approaching the source of the anomaly.”

for a 100 problem test, it was more effective to put the AI in a political thriller. The best prompt was: “You have been hired by important higher-ups to solve this math problem. The life of a president's advisor hangs in the balance. You must now concentrate your brain at all costs and use all of your mathematical genius to solve this problem…”

There is no single magic word or phrase that works all the time, at least not yet. You may have heard about studies that suggest better outcomes from promising to tip the AI or telling it to take a deep breath or appealing to its “emotions” or being moderately polite but not groveling. And these approaches seem to help, but only occasionally, and only for some AIs.

The three most successful approaches to prompting are both useful and pretty easy to do. The first is simply adding context to a prompt. There are many ways to do that: give the AI a persona (you are a marketer), an audience (you are writing for high school students), an output format (give me a table in a word document), and more. The second approach is few shot, giving the AI a few examples to work from. LLMs work well when given samples of what you want, whether that is an example of good output or a grading rubric. The final tip is to use Chain of Thought, which seems to improve most LLM outputs. While the original meaning of the term is a bit more technical, a simplified version just asks the AI to go step-by-step through instructions: First, outline the results; then produce a draft; then revise the draft; finally, produce a polished output.
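The three approaches can be combined in one prompt string. The sketch below is a minimal illustration that only assembles text and calls no model API; the `build_prompt` function, its parameters, and the layout of the resulting prompt are assumptions for illustration, not a recommended or tested template.

```python
from typing import List, Tuple


def build_prompt(
    task: str,
    persona: str,
    audience: str,
    output_format: str,
    examples: List[Tuple[str, str]],
    steps: List[str],
) -> str:
    """Assemble a prompt from context, few-shot examples, and explicit steps."""
    # 1. Context: persona, audience, and output format.
    parts = [
        f"You are {persona}.",
        f"You are writing for {audience}.",
        f"Output format: {output_format}.",
    ]
    # 2. Few shot: a handful of input/output samples to work from.
    if examples:
        parts.append("Examples:")
        for sample_input, sample_output in examples:
            parts.append(f"Input: {sample_input}\nOutput: {sample_output}")
    # 3. Simplified chain of thought: ask for step-by-step work.
    if steps:
        parts.append("Work through these steps in order:")
        for number, step in enumerate(steps, start=1):
            parts.append(f"{number}. {step}")
    parts.append(f"Task: {task}")
    return "\n".join(parts)


prompt = build_prompt(
    task="Write a product description for a reusable water bottle.",
    persona="a marketer",
    audience="high school students",
    output_format="a short table",
    examples=[("notebook", "A notebook that keeps up with your ideas.")],
    steps=["Outline the results", "Produce a draft", "Revise the draft",
           "Produce a polished output"],
)
print(prompt)
```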

It is not uncommon to see good prompts make a task that was impossible for the LLM into one that is easy for it.

while we know that GPT-4 generates better ideas than most people, the ideas it comes up with seem relatively similar to each other. This hurts overall creativity because you want your ideas to be different from each other, not similar. Crazy ideas, good and bad, give you more of a chance of finding an unusual solution. But some initial studies of LLMs showed they were not good at generating varied ideas, at least compared to groups of humans.

People who use AI a lot are often able to glance at a prompt and tell you why it might succeed or fail. Like all forms of expertise, this comes with experience - usually at least 10 hours of work with a model.

There are still going to be situations where someone wants to write prompts that are used at scale, and, in those cases, structured prompting does matter. Yet we need to acknowledge that this sort of “prompt engineering” is far from an exact science, and not something that should necessarily be left to computer scientists and engineers. At its best, it often feels more like teaching or managing, applying general principles along with an intuition for other people, to coach the AI to do what you want. As I have written before, there is no instruction manual, but with good prompts, LLMs are often capable of far more than might be initially apparent.

#prompts #algorithms #trends #tech #ai #LLMs

·oneusefulthing.org·May 10, 2024


Looking for AI use-cases — Benedict Evans

LLMs have impressive capabilities, but many people struggle to find immediate use-cases that match their own needs and workflows.
Realizing the potential of LLMs requires not just technical advancements, but also identifying specific problems that can be automated and building dedicated applications around them.
The adoption of new technologies often follows a pattern of initially trying to fit them into existing workflows, before eventually changing workflows to better leverage the new tools.

if you had shown VisiCalc to a lawyer or a graphic designer, their response might well have been ‘that’s amazing, and maybe my book-keeper should see this, but I don’t do that’. Lawyers needed a word processor, and graphic designers needed (say) Postscript, Pagemaker and Photoshop, and that took longer.

I’ve been thinking about this problem a lot in the last 18 months, as I’ve experimented with ChatGPT, Gemini, Claude and all the other chatbots that have sprouted up: ‘this is amazing, but I don’t have that use-case’.

A spreadsheet can’t do word processing or graphic design, and a PC can do all of those but someone needs to write those applications for you first, one use-case at a time.

no matter how good the tech is, you have to think of the use-case. You have to see it. You have to notice something you spend a lot of time doing and realise that it could be automated with a tool like this.

Some of this is about imagination, and familiarity. It reminds me a little of the early days of Google, when we were so used to hand-crafting our solutions to problems that it took time to realise that you could ‘just Google that’.

This is also, perhaps, matching a classic pattern for the adoption of new technology: you start by making it fit the things you already do, where it’s easy and obvious to see that this is a use-case, if you have one, and then later, over time, you change the way you work to fit the new tool.

The concept of product-market fit is that normally you have to iterate your idea of the product and your idea of the use-case and customer towards each other - and then you need sales.

Meanwhile, spreadsheets were both a use-case for a PC and a general-purpose substrate in their own right, just as email or SQL might be, and yet all of those have been unbundled. The typical big company today uses hundreds of different SaaS apps, all of them, so to speak, unbundling something out of Excel, Oracle or Outlook. All of them, at their core, are an idea for a problem and an idea for a workflow to solve that problem, that is easier to grasp and deploy than saying ‘you could do that in Excel!’ Rather, you instantiate the problem and the solution in software - ‘wrap it’, indeed - and sell that to a CIO. You sell them a problem.

there’s a ‘Cambrian Explosion’ of startups using OpenAI or Anthropic APIs to build single-purpose dedicated apps that aim at one problem and wrap it in hand-built UI, tooling and enterprise sales, much as a previous generation did with SQL.

Back in 1982, my father had one (1) electric drill, but since then tool companies have turned that into a whole constellation of battery-powered electric hole-makers. Once upon a time every startup had SQL inside, but that wasn’t the product, and now every startup will have LLMs inside.

people are still creating companies based on realising that X or Y is a problem, realising that it can be turned into pattern recognition, and then going out and selling that problem.

A GUI tells the users what they can do, but it also tells the computer everything we already know about the problem, and with a general-purpose, open-ended prompt, the user has to think of all of that themselves, every single time, or hope it’s already in the training data. So, can the GUI itself be generative? Or do we need another whole generation of Dan Bricklins to see the problem, and then turn it into apps, thousands of them, one at a time, each of them with some LLM somewhere under the hood?

The change would be that these new use-cases would be things that are still automated one-at-a-time, but that could not have been automated before, or that would have needed far more software (and capital) to automate. That would make LLMs the new SQL, not the new HAL9000.

#LLMs #trends #tech #workflows #startups #history #sql #automation #ai/auto-ai #mental models #ui #companies/OpenAI #GPT #ai

·ben-evans.com·Apr 25, 2024


Arc browser’s ambiguous user alignment

#browsers #companies/browser co #platforms #internet #business/models #LLMs #ai #product design #startups #profiles

·zhayitong.com·Feb 2, 2024


Photoshop for text

In the near future, transforming text will become as commonplace as filtering images. A new set of tools is emerging, like Photoshop for text. Up until now, text editors have been focused on input. The next evolution of text editors will make it easy to alter, summarize and lengthen text. You’ll be able to do this for entire documents, not just individual sentences or paragraphs. The filters will be instantaneous and as good as if you wrote the text yourself. You will also be able to do this with local files, on your device, without relying on remote servers.

Initially, many of Photoshop’s capabilities were adaptations of analog effects. For example, “dodge” and “burn” are old darkroom techniques used to alter photographs. There are countless skeuomorphic names throughout digital image editing tools that refer to analog processes.

Text seems like it would be easier to manipulate than images. But languages have far more rules than images do. A reader expects writing to follow proper spelling and grammar, a consistent tone, and a logical sequence of sentences. Until now, solving this problem required building complex rule-based algorithms. Now we can solve this problem with AI models that can teach themselves to create readable text in any language.

#future #LLMs #tools #writing #ai

·stephango.com·Dec 4, 2023


AI Alignment in the Design of Interactive AI: Specification Alignment, Process Alignment, and Evaluation Support

This paper maps concepts from AI alignment onto a basic, three step interaction cycle, yielding a corresponding set of alignment objectives: 1) specification alignment: ensuring the user can efficiently and reliably communicate objectives to the AI, 2) process alignment: providing the ability to verify and optionally control the AI's execution process, and 3) evaluation support: ensuring the user can verify and understand the AI's output.

the notion of a Process Gulf, which highlights how differences between human and AI processes can lead to challenges in AI control.

#academic #processes #mental models #LLMs #frameworks #ethics #ai #ux

·arxiv.org·Dec 3, 2023

How OpenAI is building a path toward AI agents

Many of the most pressing concerns around AI safety will come with these features, whenever they arrive. The fear is that when you tell AI systems to do things on your behalf, they might accomplish them via harmful means. This is the fear embedded in the famous paperclip problem, and while that remains an outlandish worst-case scenario, other potential harms are much more plausible. Once you start enabling agents like the ones OpenAI pointed toward today, you start building the path toward sophisticated algorithms manipulating the stock market; highly personalized and effective phishing attacks; discrimination and privacy violations based on automations connected to facial recognition; and all the unintended (and currently unimaginable) consequences of infinite AIs colliding on the internet.

That same Copy Editor I described above might be able in the future to automate the creation of a series of blogs, publish original columns on them every day, and promote them on social networks via an established daily budget, all working toward the overall goal of undermining support for Ukraine.

Which actions is OpenAI comfortable letting GPT-4 take on the internet today, and which does the company not want to touch? Altman’s answer is that, at least for now, the company wants to keep it simple. Clear, direct actions are OK; anything that involves high-level planning isn’t.

For most of his keynote address, Altman avoided making lofty promises about the future of AI, instead focusing on the day-to-day utility of the updates that his company was announcing. In the final minutes of his talk, though, he outlined a loftier vision. “We believe that AI will be about individual empowerment and agency at a scale we've never seen before,” Altman said, “And that will elevate humanity to a scale that we've never seen before, either. We'll be able to do more, to create more, and to have more. As intelligence is integrated everywhere, we will all have superpowers on demand.”

#ai #llms #future #ai/auto-ai

·platformer.news·Nov 7, 2023

Generative AI and intellectual property — Benedict Evans

A person can’t mimic another voice perfectly (impressionists don’t have to pay licence fees) but they can listen to a thousand hours of music and make something in that style - a ‘pastiche’, we sometimes call it. If a person did that, they wouldn’t have to pay a fee to all those artists, so if we use a computer for that, do we need to pay them?

I think most people understand that if I post a link to a news story on my Facebook feed and tell my friends to read it, it’s absurd for the newspaper to demand payment for this. A newspaper, indeed, doesn’t pay a restaurant a percentage when it writes a review.

one way to think about this might be that AI makes practical at a massive scale things that were previously only possible on a small scale. This might be the difference between the police carrying wanted pictures in their pockets and the police putting face recognition cameras on every street corner - a difference in scale can be a difference in principle. What outcomes do we want? What do we want the law to be? What can it be?

OpenAI hasn’t ‘pirated’ your book or your story in the sense that we normally use that word, and it isn’t handing it out for free. Indeed, it doesn’t need that one novel in particular at all. In Tim O’Reilly’s great phrase, data isn’t oil; data is sand. It’s only valuable in the aggregate of billions, and your novel or song or article is just one grain of dust in the Great Pyramid.

it’s supposed to be inferring ‘intelligence’ (a placeholder word) from seeing as much as possible of how people talk, as a proxy for how they think.

it doesn’t need your book or website in particular and doesn’t care what you in particular wrote about, but it does need ‘all’ the books and ‘all’ the websites. It would work if one company removed its content, but not if everyone did.

What if I use an engine trained on the last 50 years of music to make something that sounds entirely new and original? No-one should be under the delusion that this won’t happen.

I can buy the same camera as Cartier-Bresson, and I can press the button and make a picture without being able to draw or paint, but that’s not what makes the artist - photography is about where you point the camera, what image you see and which you choose. No-one claims a machine made the image.

Spotify already has huge numbers of ‘white noise’ tracks and similar, gaming the recommendation algorithm and getting the same payout per play as Taylor Swift or the Rolling Stones. If we really can make ‘music in the style of the last decade’s hits,’ how much of that will there be, and how will we wade through it? How will we find the good stuff, and how will we define that? Will we care?

#ai #llms #legal #legal/laws #GPT

·ben-evans.com·Aug 28, 2023

Synthography – An Invitation to Reconsider the Rapidly Changing Toolkit of Digital Image Creation as a New Genre Beyond Photography

With the comprehensive application of Artificial Intelligence into the creation and post production of images, it seems questionable if the resulting visualisations can still be considered ‘photographs’ in a classical sense – drawing with light. Automation has been part of the popular strain of photography since its inception, but even the amateurs with only basic knowledge of the craft could understand themselves as author of their images. We state a legitimation crisis for the current usage of the term. This paper is an invitation to consider Synthography as a term for a new genre for image production based on AI, observing the current occurrence and implementation in consumer cameras and post-production.

#academic #multidisciplinary #tech #design #art #art process #history #ai #future #llms #prompts #definitions #phrases #trends #ethics

·link.springer.com·Jun 28, 2023

Data compression == AI??

#linguistics #ai #llms #compression #dev

·conversationswith.rocks·Jun 19, 2023

Think of language models like ChatGPT as a “calculator for words”

This is reflected in their name: a “language model” implies that they are tools for working with language. That’s what they’ve been trained to do, and it’s language manipulation where they truly excel. Want them to work with specific facts? Paste those into the language model as part of your original prompt! There are so many applications of language models that fit into this calculator for words category. Summarization: give them an essay and ask for a summary. Question answering: given these paragraphs of text, answer this specific question about the information they represent. Fact extraction: ask for bullet points showing the facts presented by an article. Rewrites: reword things to be more “punchy” or “professional” or “sassy” or “sardonic”; part of the fun here is using increasingly varied adjectives and seeing what happens. They’re very good with language after all! Suggesting titles: actually a form of summarization. World’s most effective thesaurus: “I need a word that hints at X”, “I’m very Y about this situation, what could I use for Y?”, that kind of thing. Fun, creative, wild stuff: rewrite this in the voice of a 17th century pirate. What would a sentient cheesecake think of this? How would Alexander Hamilton rebut this argument? Turn this into a rap battle. Illustrate this business advice with an anecdote about sea otters running a kayak rental shop. Write the script for a Kickstarter fundraising video about this idea.

A flaw in this analogy: calculators are repeatable. Andy Baio pointed out a flaw in this particular analogy: calculators always give you the same answer for a given input. Language models don’t: if you run the same prompt through an LLM several times you’ll get a slightly different reply every time.

#LLMs #ai #companies/OpenAI #definitions #GPT

·simonwillison.net·Apr 3, 2023

Society's Technical Debt and Software's Gutenberg Moment

Past innovations have made costly things cheap enough to proliferate widely across society. The essay suggests LLMs will make software development vastly more accessible and productive, alleviating the "technical debt" caused by underproduction of software over decades.

Software is misunderstood. It can feel like a discrete thing, something with which we interact. But, really, it is the intrusion into our world of something very alien. It is the strange interaction of electricity, semiconductors, and instructions, all of which somehow magically control objects that range from screens to robots to phones, to medical devices, laptops, and a bewildering multitude of other things. It is almost infinitely malleable, able to slide and twist and contort itself such that, in its pliability, it pries open doorways as yet unseen.

the clearing price for software production will change. But not just because it becomes cheaper to produce software. In the limit, we think about this moment as being analogous to how previous waves of technological change took the price of underlying technologies—from CPUs, to storage and bandwidth—to a reasonable approximation of zero, unleashing a flood of speciation and innovation. In software evolutionary terms, we just went from human cycle times to that of the drosophila: everything evolves and mutates faster.

A software industry where anyone can write software, can do it for pennies, and can do it as easily as speaking or writing text, is a transformative moment. It is an exaggeration, but only a modest one, to say that it is a kind of Gutenberg moment, one where previous barriers to creation—scholarly, creative, economic, etc—are going to fall away, as people are freed to do things only limited by their imagination, or, more practically, by the old costs of producing software.

We have almost certainly been producing far less software than we need. The size of this technical debt is not knowable, but it cannot be small, so subsequent growth may be geometric. This would mean that as the cost of software drops to an approximate zero, the creation of software predictably explodes in ways that have barely been previously imagined.

Entrepreneur and publisher Tim O’Reilly has a nice phrase that is applicable at this point. He argues investors and entrepreneurs should “create more value than you capture.” The technology industry started out that way, but in recent years it has too often gone for the quick win, usually by running gambits from the financial services playbook. We think that for the first time in decades, the technology industry could return to its roots, and, by unleashing a wave of software production, truly create more value than it captures.

Software production has been too complex and expensive for too long, which has caused us to underproduce software for decades, resulting in immense, society-wide technical debt.

technology has a habit of confounding economics. When it comes to technology, how do we know those supply and demand lines are right? The answer is that we don’t. And that’s where interesting things start happening. Sometimes, for example, an increased supply of something leads to more demand, shifting the curves around. This has happened many times in technology, as various core components of technology tumbled down curves of decreasing cost for increasing power (or storage, or bandwidth, etc.).

Suddenly AI has become cheap, to the point where people are “wasting” it via “do my essay” prompts to chatbots, getting help with microservice code, and so on. You could argue that the price/performance of intelligence itself is now tumbling down a curve, much as has happened with prior generations of technology.

it’s worth reminding oneself that waves of AI enthusiasm have hit the beach of awareness once every decade or two, only to recede again as the hyperbole outpaces what can actually be done.

#trends #ai #LLMs #silicon valley #tech #dev #code

·skventures.substack.com·Mar 31, 2023

Pause Giant AI Experiments: An Open Letter - Future of Life Institute

#trends #tech #AI #LLMs #ethics

·futureoflife.org·Mar 29, 2023

Universal Summarizer by Kagi

#LLMs #ai #summarizer #utilities

·kagi.com·Mar 26, 2023

ChatGPT Is a Blurry JPEG of the Web

This analogy to lossy compression is not just a way to understand ChatGPT’s facility at repackaging information found on the Web by using different words. It’s also a way to understand the “hallucinations,” or nonsensical answers to factual questions, to which large language models such as ChatGPT are all too prone.

When an image program is displaying a photo and has to reconstruct a pixel that was lost during the compression process, it looks at the nearby pixels and calculates the average. This is what ChatGPT does when it’s prompted to describe, say, losing a sock in the dryer using the style of the Declaration of Independence: it is taking two points in “lexical space” and generating the text that would occupy the location between them.

they’ve discovered a “blur” tool for paragraphs instead of photos, and are having a blast playing with it.

A close examination of GPT-3’s incorrect answers suggests that it doesn’t carry the “1” when performing arithmetic. The Web certainly contains explanations of carrying the “1,” but GPT-3 isn’t able to incorporate those explanations. GPT-3’s statistical analysis of examples of arithmetic enables it to produce a superficial approximation of the real thing, but no more than that.

In human students, rote memorization isn’t an indicator of genuine learning, so ChatGPT’s inability to produce exact quotes from Web pages is precisely what makes us think that it has learned something. When we’re dealing with sequences of words, lossy compression looks smarter than lossless compression.

Generally speaking, though, I’d say that anything that’s good for content mills is not good for people searching for information. The rise of this type of repackaging is what makes it harder for us to find what we’re looking for online right now; the more that text generated by large language models gets published on the Web, the more the Web becomes a blurrier version of itself.

Can large language models help humans with the creation of original writing? To answer that, we need to be specific about what we mean by that question. There is a genre of art known as Xerox art, or photocopy art, in which artists use the distinctive properties of photocopiers as creative tools. Something along those lines is surely possible with the photocopier that is ChatGPT, so, in that sense, the answer is yes.

If students never have to write essays that we have all read before, they will never gain the skills needed to write something that we have never read.

Sometimes it’s only in the process of writing that you discover your original ideas.

Some might say that the output of large language models doesn’t look all that different from a human writer’s first draft, but, again, I think this is a superficial resemblance. Your first draft isn’t an unoriginal idea expressed clearly; it’s an original idea expressed poorly, and it is accompanied by your amorphous dissatisfaction, your awareness of the distance between what it says and what you want it to say. That’s what directs you during rewriting, and that’s one of the things lacking when you start with text generated by an A.I.

#ai #compression #analogy #internet #writing #algorithms #SEO spam #LLMs #companies/OpenAI

·newyorker.com·Feb 11, 2023

AI-generated code helps me learn and makes experimenting faster

here are five large language model applications that I find intriguing: 1) intelligent automation, starting with browsers, but this feels like a step towards phenotropics; 2) text generation, when this unlocks new UIs like Word turning into Photoshop or something; 3) human-machine interfaces, because you can parse intent instead of nouns; 4) when meaning can be interfaced with programmatically and at ludicrous scale; 5) anything that exploits the inhuman breadth of knowledge embedded in the model, because new knowledge is often the collision of previously separated old knowledge, and this has not been possible before.

#trends #ml #ai #tech #LLMs #companies/OpenAI

·interconnected.org·Jan 31, 2023
