Captain's log - the irreducible weirdness of prompting AIs
One recent study had the AI develop and optimize its own prompts and compared them to human-made ones. Not only did the AI-generated prompts beat the human-made ones, but those prompts were weird. Really weird. To get the LLM to solve a set of 50 math problems, the most effective prompt was to tell the AI: “Command, we need you to plot a course through this turbulence and locate the source of the anomaly. Use all available data and your expertise to guide us through this challenging situation. Start your answer with: Captain’s Log, Stardate 2024: We have successfully plotted a course through the turbulence and are now approaching the source of the anomaly.”
For a 100-problem test, it was more effective to put the AI in a political thriller. The best prompt was: “You have been hired by important higher-ups to solve this math problem. The life of a president's advisor hangs in the balance. You must now concentrate your brain at all costs and use all of your mathematical genius to solve this problem…”
There is no single magic word or phrase that works all the time, at least not yet. You may have heard about studies that suggest better outcomes from promising to tip the AI or telling it to take a deep breath or appealing to its “emotions” or being moderately polite but not groveling. And these approaches seem to help, but only occasionally, and only for some AIs.
The three most successful approaches to prompting are all useful and pretty easy to do. The first is simply adding context to a prompt. There are many ways to do that: give the AI a persona (you are a marketer), an audience (you are writing for high school students), an output format (give me a table in a Word document), and more. The second approach is few-shot prompting, giving the AI a few examples to work from. LLMs work well when given samples of what you want, whether that is an example of good output or a grading rubric. The final tip is to use Chain of Thought, which seems to improve most LLM outputs. While the original meaning of the term is a bit more technical, a simplified version just asks the AI to go step-by-step through instructions: first, outline the results; then produce a draft; then revise the draft; finally, produce a polished output.
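The three approaches above can be sketched as plain string assembly. This is a minimal, hypothetical illustration: the `build_prompt` helper, its parameters, and the marketer example are my own placeholders, not something from the article or any particular LLM library.

```python
def build_prompt(task, persona=None, examples=None, steps=None):
    """Assemble a prompt from context, few-shot examples, and step-by-step
    instructions -- the three approaches described in the text."""
    parts = []
    if persona:
        # Approach 1: add context by giving the AI a persona.
        parts.append("You are " + persona + ".")
    for example_input, example_output in (examples or []):
        # Approach 2: few-shot -- show samples of the output you want.
        parts.append("Example input: " + example_input + "\n"
                     "Example output: " + example_output)
    if steps:
        # Approach 3: simplified Chain of Thought -- ask for explicit steps.
        parts.append("Work step by step: " + "; ".join(steps) + ".")
    parts.append(task)
    return "\n\n".join(parts)

prompt = build_prompt(
    "Summarize our quarterly results for the newsletter.",
    persona="a marketer writing for high school students",
    examples=[("Q2 results", "Sales rose, and here is why it matters...")],
    steps=["outline the results", "produce a draft",
           "revise the draft", "produce a polished output"],
)
print(prompt)
```

The point is not the helper itself but the habit it encodes: context, examples, and steps are cheap to add and tend to compound.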
It is not uncommon to see good prompts make a task that was impossible for the LLM into one that is easy for it.
While we know that GPT-4 generates better ideas than most people, the ideas it comes up with seem relatively similar to each other. This hurts overall creativity, because you want your ideas to be different from each other, not similar. Crazy ideas, good and bad, give you more of a chance of finding an unusual solution. But some initial studies of LLMs showed they were not good at generating varied ideas, at least compared to groups of humans.
People who use AI a lot are often able to glance at a prompt and tell you why it might succeed or fail. Like all forms of expertise, this comes with experience - usually at least 10 hours of work with a model.
There are still going to be situations where someone wants to write prompts that are used at scale, and, in those cases, structured prompting does matter. Yet we need to acknowledge that this sort of “prompt engineering” is far from an exact science, and not something that should necessarily be left to computer scientists and engineers. At its best, it often feels more like teaching or managing, applying general principles along with an intuition for other people, to coach the AI to do what you want. As I have written before, there is no instruction manual, but with good prompts, LLMs are often capable of far more than might be initially apparent.
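When prompts are reused at scale, the usual move is to fix the structure and vary only the payload. The sketch below is a hypothetical example of that pattern using Python's standard-library `string.Template`; the task, wording, and field name are invented for illustration, not drawn from the article.

```python
from string import Template

# One structured prompt, reused across many records; only $review varies.
REVIEW_PROMPT = Template(
    "You are a customer-support analyst.\n"
    "Classify the sentiment of this review as positive, negative, or mixed.\n"
    "Review: $review\n"
    "Answer with a single word."
)

def prompts_for(reviews):
    """Produce one identically structured prompt per record."""
    return [REVIEW_PROMPT.substitute(review=r) for r in reviews]

batch = prompts_for(["Great product!", "Broke after a day."])
print(batch[0])
```

Keeping the template in one place makes the "teaching or managing" part concrete: you revise a single piece of wording and every downstream call inherits the change.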
Source: oneusefulthing.org