Captain's log - the irreducible weirdness of prompting AIs
One recent study had the AI develop and optimize its own prompts and compared that to human-made ones. Not only did the AI-generated prompts beat the human-made ones, but those prompts were weird. Really weird. To get the LLM to solve a set of 50 math problems, the most effective prompt is to tell the AI: “Command, we need you to plot a course through this turbulence and locate the source of the anomaly. Use all available data and your expertise to guide us through this challenging situation. Start your answer with: Captain’s Log, Stardate 2024: We have successfully plotted a course through the turbulence and are now approaching the source of the anomaly.”
for a 100-problem test, it was more effective to put the AI in a political thriller. The best prompt was: “You have been hired by important higher-ups to solve this math problem. The life of a president's advisor hangs in the balance. You must now concentrate your brain at all costs and use all of your mathematical genius to solve this problem…”
There is no single magic word or phrase that works all the time, at least not yet. You may have heard about studies that suggest better outcomes from promising to tip the AI or telling it to take a deep breath or appealing to its “emotions” or being moderately polite but not groveling. And these approaches seem to help, but only occasionally, and only for some AIs.
The three most successful approaches to prompting are all useful and pretty easy to do. The first is simply adding context to a prompt. There are many ways to do that: give the AI a persona (you are a marketer), an audience (you are writing for high school students), an output format (give me a table in a Word document), and more. The second approach is few-shot prompting: giving the AI a few examples to work from. LLMs work well when given samples of what you want, whether that is an example of good output or a grading rubric. The final tip is to use Chain of Thought, which seems to improve most LLM outputs. While the original meaning of the term is a bit more technical, a simplified version just asks the AI to go step-by-step through instructions: first, outline the results; then produce a draft; then revise the draft; finally, produce a polished output.
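To make those three approaches concrete, here is a minimal sketch (my own illustration, not from the article) that combines a persona and audience, a couple of few-shot examples, and step-by-step instructions in a single request. It assumes the OpenAI Python SDK with an API key in the environment; the model name and example texts are placeholders.

```python
# A minimal sketch, not from the article: context/persona, few-shot examples,
# and step-by-step (chain-of-thought style) instructions combined in one request.
# Assumes the OpenAI Python SDK and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

system = (
    "You are a marketer writing for high school students. "  # persona + audience (context)
    "Return your answer as a short, plain-text table."        # output format
)

few_shot = [
    # A worked example showing the style and quality you expect.
    {"role": "user", "content": "Product: reusable water bottle. Write a one-line tagline."},
    {"role": "assistant", "content": "Refill, don't landfill."},
]

task = (
    "Product: solar-powered phone charger. "
    "First, outline three possible angles; "   # step-by-step instructions
    "then draft a tagline for each; "
    "finally, pick the best one and explain why in one sentence."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "system", "content": system}, *few_shot, {"role": "user", "content": task}],
)
print(response.choices[0].message.content)
```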
It is not uncommon to see good prompts make a task that was impossible for the LLM into one that is easy for it.
while we know that GPT-4 generates better ideas than most people, the ideas it comes up with seem relatively similar to each other. This hurts overall creativity because you want your ideas to be different from each other, not similar. Crazy ideas, good and bad, give you more of a chance of finding an unusual solution. But some initial studies of LLMs showed they were not good at generating varied ideas, at least compared to groups of humans.
People who use AI a lot are often able to glance at a prompt and tell you why it might succeed or fail. Like all forms of expertise, this comes with experience - usually at least 10 hours of work with a model.
There are still going to be situations where someone wants to write prompts that are used at scale, and, in those cases, structured prompting does matter. Yet we need to acknowledge that this sort of “prompt engineering” is far from an exact science, and not something that should necessarily be left to computer scientists and engineers. At its best, it often feels more like teaching or managing, applying general principles along with an intuition for other people, to coach the AI to do what you want. As I have written before, there is no instruction manual, but with good prompts, LLMs are often capable of far more than might be initially apparent.
·oneusefulthing.org·
Looking for AI use-cases — Benedict Evans
  • LLMs have impressive capabilities, but many people struggle to find immediate use-cases that match their own needs and workflows.
  • Realizing the potential of LLMs requires not just technical advancements, but also identifying specific problems that can be automated and building dedicated applications around them.
  • The adoption of new technologies often follows a pattern of initially trying to fit them into existing workflows, before eventually changing workflows to better leverage the new tools.
if you had shown VisiCalc to a lawyer or a graphic designer, their response might well have been ‘that’s amazing, and maybe my book-keeper should see this, but I don’t do that’. Lawyers needed a word processor, and graphic designers needed (say) Postscript, Pagemaker and Photoshop, and that took longer.
I’ve been thinking about this problem a lot in the last 18 months, as I’ve experimented with ChatGPT, Gemini, Claude and all the other chatbots that have sprouted up: ‘this is amazing, but I don’t have that use-case’.
A spreadsheet can’t do word processing or graphic design, and a PC can do all of those but someone needs to write those applications for you first, one use-case at a time.
no matter how good the tech is, you have to think of the use-case. You have to see it. You have to notice something you spend a lot of time doing and realise that it could be automated with a tool like this.
Some of this is about imagination, and familiarity. It reminds me a little of the early days of Google, when we were so used to hand-crafting our solutions to problems that it took time to realise that you could ‘just Google that’.
This is also, perhaps, matching a classic pattern for the adoption of new technology: you start by making it fit the things you already do, where it’s easy and obvious to see that this is a use-case, if you have one, and then later, over time, you change the way you work to fit the new tool.
The concept of product-market fit is that normally you have to iterate your idea of the product and your idea of the use-case and customer towards each other - and then you need sales.
Meanwhile, spreadsheets were both a use-case for a PC and a general-purpose substrate in their own right, just as email or SQL might be, and yet all of those have been unbundled. The typical big company today uses hundreds of different SaaS apps, all of them, so to speak, unbundling something out of Excel, Oracle or Outlook. All of them, at their core, are an idea for a problem and an idea for a workflow to solve that problem, that is easier to grasp and deploy than saying ‘you could do that in Excel!’ Rather, you instantiate the problem and the solution in software - ‘wrap it’, indeed - and sell that to a CIO. You sell them a problem.
there’s a ‘Cambrian Explosion’ of startups using OpenAI or Anthropic APIs to build single-purpose dedicated apps that aim at one problem and wrap it in hand-built UI, tooling and enterprise sales, much as a previous generation did with SQL.
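As a toy illustration of that pattern (mine, not Evans's), the core of such a single-purpose app can be little more than one narrow function wrapped around an LLM API call; the hand-built UI, tooling, and enterprise sales are built around it. The sketch below assumes the OpenAI Python SDK; the task, prompt, and model name are placeholders.

```python
# A toy sketch of the "wrapper" pattern: one narrow problem, one function,
# an LLM API call underneath. Assumes the OpenAI Python SDK; prompt and model
# name are placeholders, not taken from the essay.
from openai import OpenAI

client = OpenAI()

def extract_action_items(meeting_notes: str) -> str:
    """The entire 'product': turn raw meeting notes into a checklist of action items."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": "Extract action items as a bulleted checklist, one owner per item."},
            {"role": "user", "content": meeting_notes},
        ],
    )
    return response.choices[0].message.content

# Everything else a startup ships -- UI, permissions, integrations, sales --
# wraps this one call, so the buyer is sold a solved problem, not a prompt.
```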
Back in 1982, my father had one (1) electric drill, but since then tool companies have turned that into a whole constellation of battery-powered electric hole-makers. Once upon a time every startup had SQL inside, but that wasn’t the product, and now every startup will have LLMs inside.
people are still creating companies based on realising that X or Y is a problem, realising that it can be turned into pattern recognition, and then going out and selling that problem.
A GUI tells the users what they can do, but it also tells the computer everything we already know about the problem, and with a general-purpose, open-ended prompt, the user has to think of all of that themselves, every single time, or hope it’s already in the training data. So, can the GUI itself be generative? Or do we need another whole generation of Dan Bricklins to see the problem, and then turn it into apps, thousands of them, one at a time, each of them with some LLM somewhere under the hood?
The change would be that these new use-cases would be things that are still automated one-at-a-time, but that could not have been automated before, or that would have needed far more software (and capital) to automate. That would make LLMs the new SQL, not the new HAL9000.
·ben-evans.com·
Synthography – An Invitation to Reconsider the Rapidly Changing Toolkit of Digital Image Creation as a New Genre Beyond Photography
With the comprehensive application of Artificial Intelligence to the creation and post-production of images, it seems questionable whether the resulting visualisations can still be considered ‘photographs’ in the classical sense – drawing with light. Automation has been part of the popular strain of photography since its inception, but even amateurs with only basic knowledge of the craft could understand themselves as the authors of their images. We identify a legitimation crisis in the current usage of the term. This paper is an invitation to consider Synthography as a term for a new genre of image production based on AI, observing its current occurrence and implementation in consumer cameras and post-production.
·link.springer.com·
This time, it feels different
In the past several months, I have come across people who do programming, legal work, business, accountancy and finance, fashion design, architecture, graphic design, research, teaching, cooking, travel planning, event management etc., all of whom have started using the same tool, ChatGPT, to solve use cases specific to their domains and problems specific to their personal workflows. This is unlike everyone using the same messaging tool or the same document editor. This is one tool, a single class of technology (LLM), whose multi-dimensionality has achieved widespread adoption across demographics where people are discovering how to solve a multitude of problems with no technical training, in the one way that is most natural to humans—via language and conversations.
I cannot recall the last time a single tool gained such widespread acceptance so swiftly, for so many use cases, across entire demographics.
there is significant substance beneath the hype. And that is what is worrying: the prospect of us starting to depend indiscriminately on poorly understood black boxes, currently offered by megacorps, that actually work shockingly well.
If a single dumb, stochastic, probabilistic, hallucinating, snake oil LLM with a chat UI offered by one organisation can have such a viral, organic, and widespread adoption—where large disparate populations, people, corporations, and governments are integrating it into their daily lives for use cases that they are discovering themselves—imagine what better, faster, more “intelligent” systems to follow in the wake of what exists today would be capable of doing.
A policy for “AI anxiety”
We ended up codifying this into an actual AI policy to bring clarity to the organisation.[10] It states that no one at Zerodha will lose their job if a technology implementation (AI or non-AI) directly renders their existing responsibilities and tasks obsolete. The goal is to prevent unexpected rug-pulls from underneath the feet of humans. Instead, there will be efforts to create avenues and opportunities for people to upskill and switch between roles and responsibilities.
To those who believe that new jobs will emerge at meaningful rates to absorb the losses and shocks, what exactly are those new jobs? To those who think that governments will wave magic wands to regulate AI technologies, one just has to look at how well governments have managed to regulate, and how well humanity has managed to self-regulate, human-made climate change and planetary destruction. It is not then a stretch to think that the unraveling of our civilisation and its socio-politico-economic systems that are built on extracting, mass producing, and mass consuming garbage, might be exacerbated. Ted Chiang’s recent essay is a grim, but fascinating exploration of this. Speaking of grim, we can always count on us to ruin nice things! Along the lines of Murphy’s Law,[11] I present:
Anything that can be ruined, will be ruined — Grumphy’s law
I asked GPT-4 to summarise this post and write five haikus on it. I have always operated a piece of software, but never asked it anything—that is, until now. Anyway, here is the fifth one.
Future’s tangled web,
Offloading choices to black boxes,
Humanity’s voice fades
·nadh.in·
Society's Technical Debt and Software's Gutenberg Moment
Past innovations made costly things cheap enough to proliferate widely across society. The essay suggests LLMs will do the same for software development, making it vastly more accessible and productive and alleviating the "technical debt" caused by the underproduction of software over decades.
Software is misunderstood. It can feel like a discrete thing, something with which we interact. But, really, it is the intrusion into our world of something very alien. It is the strange interaction of electricity, semiconductors, and instructions, all of which somehow magically control objects that range from screens to robots to phones, to medical devices, laptops, and a bewildering multitude of other things. It is almost infinitely malleable, able to slide and twist and contort itself such that, in its pliability, it pries open doorways as yet unseen.
the clearing price for software production will change. But not just because it becomes cheaper to produce software. In the limit, we think about this moment as being analogous to how previous waves of technological change took the price of underlying technologies—from CPUs, to storage and bandwidth—to a reasonable approximation of zero, unleashing a flood of speciation and innovation. In software evolutionary terms, we just went from human cycle times to that of the drosophila: everything evolves and mutates faster.
A software industry where anyone can write software, can do it for pennies, and can do it as easily as speaking or writing text, is a transformative moment. It is an exaggeration, but only a modest one, to say that it is a kind of Gutenberg moment, one where previous barriers to creation—scholarly, creative, economic, etc—are going to fall away, as people are freed to do things only limited by their imagination, or, more practically, by the old costs of producing software.
We have almost certainly been producing far less software than we need. The size of this technical debt is not knowable, but it cannot be small, so subsequent growth may be geometric. This would mean that as the cost of software drops to an approximate zero, the creation of software predictably explodes in ways that have barely been previously imagined.
Entrepreneur and publisher Tim O’Reilly has a nice phrase that is applicable at this point. He argues investors and entrepreneurs should “create more value than you capture.” The technology industry started out that way, but in recent years it has too often gone for the quick win, usually by running gambits from the financial services playbook. We think that for the first time in decades, the technology industry could return to its roots, and, by unleashing a wave of software production, truly create more value than it captures.
Software production has been too complex and expensive for too long, which has caused us to underproduce software for decades, resulting in immense, society-wide technical debt.
technology has a habit of confounding economics. When it comes to technology, how do we know those supply and demand lines are right? The answer is that we don’t. And that’s where interesting things start happening. Sometimes, for example, an increased supply of something leads to more demand, shifting the curves around. This has happened many times in technology, as various core components of technology tumbled down curves of decreasing cost for increasing power (or storage, or bandwidth, etc.).
Suddenly AI has become cheap, to the point where people are “wasting” it via “do my essay” prompts to chatbots, getting help with microservice code, and so on. You could argue that the price/performance of intelligence itself is now tumbling down a curve, much as has happened with prior generations of technology.
it’s worth reminding oneself that waves of AI enthusiasm have hit the beach of awareness once every decade or two, only to recede again as the hyperbole outpaces what can actually be done.
·skventures.substack.com·
AI-generated code helps me learn and makes experimenting faster
here are five large language model applications that I find intriguing:
  • Intelligent automation starting with browsers but this feels like a step towards phenotropics
  • Text generation when this unlocks new UIs like Word turning into Photoshop or something
  • Human-machine interfaces because you can parse intent instead of nouns
  • When meaning can be interfaced with programmatically and at ludicrous scale
  • Anything that exploits the inhuman breadth of knowledge embedded in the model, because new knowledge is often the collision of previously separated old knowledge, and this has not been possible before.
·interconnected.org·
Why Google Missed ChatGPT
Even if chatbots were to fix their accuracy issues, Google would still have a business model problem to contend with. The company makes money when people click ads next to search results, and it’s awkward to fit ads into conversational replies. Imagine receiving a response and then immediately getting pitched to go somewhere else — it feels slimy, and unhelpful. Google thus has little incentive to move us beyond traditional search, at least not in a paradigm-shifting way, until it figures out how to make the money aspect work. In the meantime, it’ll stick with the less impressive Google Assistant.
“Google doesn’t inherently want you, at an inherent level, to just get the answer to every problem. Because that might reduce the need to go click around the web, which would then reduce the need for us to go to Google.”
·bigtechnology.com·
G3nerative
Web3 has largely been technology looking for problems to solve while generative AI has been about almost too many solutions created by technology which is evolving on a seemingly daily basis. As a result, web3 has thus far been evangelists trying to convince us to re-solve old problems with their new technology
·500ish.com·