Natural Language Is an Unnatural Interface
On the user experience of interacting with LLMs
Prompt engineers not only need to get the model to respond to a given question but also to structure the output in a parsable format (such as JSON), in case it needs to be rendered in UI components or chained into the input of a future LLM query. They scaffold the raw input that is fed into an LLM so the end user doesn’t need to spend time thinking about prompting at all.
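A minimal sketch of that kind of scaffolding, assuming a hypothetical `call_llm` helper standing in for whatever model API is actually used:

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical helper: send a prompt to the model and return its raw text reply."""
    raise NotImplementedError

def summarize_as_json(user_text: str) -> dict:
    # Wrap the user's raw input in a scaffold prompt they never see, and demand JSON
    # so the result can be rendered in UI components or chained into a later query.
    prompt = (
        "Summarize the text below. Respond with only a JSON object of the form "
        '{"summary": string, "key_points": [string, ...]} and no other text.\n\n'
        f"Text:\n{user_text}"
    )
    raw = call_llm(prompt)
    return json.loads(raw)  # fails loudly if the model ignores the requested format
```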
From the user’s side, it’s hard to decide what to ask while providing the right amount of context. From the developer’s side, two problems arise. It’s hard to monitor natural language queries and understand how users are interacting with your product. It’s also hard to guarantee that an LLM can successfully complete an arbitrary query. This is especially true for agentic workflows, which are incredibly brittle in practice.
When we speak to other people, there is a shared context that we communicate under. We’re not just exchanging words, but a larger information stream that also includes intonation while speaking, hand gestures, memories of each other, and more. LLMs unfortunately cannot understand most of this context and can therefore only do as much as the prompt describes.
Most people use LLMs for ~4 basic natural language tasks, rarely taking advantage of the conversational back-and-forth built into chat systems:
- Summarization: Summarizing a large amount of information or text into a concise yet comprehensive summary. This is useful for quickly digesting information from long articles, documents or conversations. An AI system needs to understand the key ideas, concepts and themes to produce a good summary.
- ELI5 (Explain Like I'm 5): Explaining a complex concept in a simple, easy-to-understand manner without any jargon. The goal is to make an explanation clear and simple enough for a broad, non-expert audience.
- Perspectives: Providing multiple perspectives or opinions on a topic. This could include personal perspectives from various stakeholders, experts with different viewpoints, or just a range of ways a topic can be interpreted based on different experiences and backgrounds. In other words, “what would ___ do?”
- Contextual Responses: Responding to a user or situation in an appropriate, contextualized manner (via email, message, etc.). Contextual responses should feel organic and on-topic, as if provided by another person participating in the same conversation.
Prompting nearly always gets in the way because it requires the user to think. End users ultimately do not wish to confront an empty text box in accomplishing their goals. Buttons and other interactive design elements make life easier. The interface makes all the difference in crafting an AI system that augments and amplifies human capabilities rather than adding additional cognitive load. Similar to standup comedy, delightful LLM-powered experiences require a subversion of expectation.
Users will expect the usual drudge of drafting an email or searching for a nearby restaurant, but instead will be surprised by the amount of work that has already been done for them from the moment their intent is made clear. For example, it would be a great experience to discover pre-written email drafts or carefully crafted restaurant and meal recommendations that match your personal taste. If you still need to use a text input box, at a minimum, also provide some buttons to auto-fill the prompt box. The buttons can pass LLM-generated questions to the prompt box.
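As a rough illustration of that last suggestion, a backend could ask the model for a few likely follow-up questions and surface them as buttons that pre-fill the text box. The `call_llm` stub below is an assumption, not anything described in the post:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical helper: send a prompt to the model and return its raw text reply."""
    raise NotImplementedError

def suggest_prompts(page_context: str, n: int = 3) -> list[str]:
    # Ask the model for a few questions a reader is likely to ask next; the UI can
    # render them as buttons that fill the prompt box when clicked.
    prompt = (
        f"Given this context:\n{page_context}\n\n"
        f"Write {n} short questions a reader is likely to ask next, one per line."
    )
    lines = call_llm(prompt).splitlines()
    return [line.lstrip("-• ").strip() for line in lines if line.strip()][:n]
```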
·varunshenoy.substack.com·
The Case of The Traveling Text Message — Michele Tepper
John Watson looks down at his screen, and we see the message he’s reading on our screen as well. Now, we’re used to seeing extradiegetic text appear on screen with the characters: titles like “Three Years Earlier” or “Lisbon” serve to orient us in a scene. Those titles can even help set the tone of the narrative - think of the snarky humor of the character introduction chyrons on Burn Notice. But this is different: this is capturing the viewer’s screen as part of the narrative itself.[1] It’s a remarkably elegant solution from director Paul McGuigan. And it works because we, the viewing audience, have been trained to understand it by the last several years of service-driven, multi-platform, multi-screen applications.
The connection between Sherlock’s intellect and a computer’s becomes more explicit in one of my favorite scenes, later in the episode. Sherlock is called to the scene of the murder from which the episode takes its title.[3] We watch him process the clues from the scene, and as he takes them in, that same titling style appears, now employed in a more conventional-seeming expositional mode.
But then the shot reverses, and it’s not quite so conventional after all. The titling isn’t just what Sherlock is understanding, it’s what he’s seeing. In the same way that text-message titling can take over our screens because whatever we’re watching TV on is just another screen in a multiplatform computing system, this scene tells us that Sherlock views the whole world through the head-up display of his own genius.
·micheletepper.com·