Search Saved

LN 038: Semantic zoom

This “undulant interface” was made by John Underkoffler. The heresy implicit within [1] is the premise that the user, not the system, gets to define what is most important at any given moment; where to place the jeweler’s loupes for more detail, and where to show only a simple overview, within one consistent interface. Notice how when a component is expanded for more detail, the surrounding elements adjust their position, so the increased detail remains in the broader context. This contrasts sharply with how we get more detail in mainstream interfaces of the day, where modal popups obscure surrounding context, or separate screens replace it entirely. Being able to adjust the detail of different components within the singular context allows users to shape the interfaces they need in each moment of their work.

Pushing towards this style of interaction could show up in many parts of an itemized personal computing environment: when moving in and out of sets, single items, or attributes and references within items.

everyone has unique needs and context, yet that which makes our lives more unique makes today’s rigid software interfaces more frustrating to use. How might Colin use the gestural, itemized interface, combined with semantic zoom on this plethora of data, to elicit the interfaces and answers he’s looking for with his data?

since workout items each have data with associated timestamps and locations, the system knows it can offer both a timeline and map view. And since the items are of one kind, it knows it can offer a table view. Instead of selecting one view to switch to, as we first explored in LN 006, we could drag them into the space to have multiple open at once.

As the email item view gets bigger, the preview text of the email’s contents eventually turns into the fully-rendered email. At smaller sizes, this view makes less sense, so the system can swap it out for the preview text as needed.

#design #LLMs #ai #ai/summarization #narratives #ui #hci

·alexanderobenauer.com·Jan 2, 2025

LN 038: Semantic zoom

Malleable software in the age of LLMs

Historically, end-user programming efforts have been limited by the difficulty of turning informal user intent into executable code, but LLMs can help open up this programming bottleneck. However, user interfaces still matter, and while chatbots have their place, they are an essentially limited interaction mode. An intriguing way forward is to combine LLMs with open-ended, user-moldable computational media, where the AI acts as an assistant to help users directly manipulate and extend their tools over time.

LLMs will represent a step change in tool support for end-user programming: the ability of normal people to fully harness the general power of computers without resorting to the complexity of normal programming. Until now, that vision has been bottlenecked on turning fuzzy informal intent into formal, executable code; now that bottleneck is rapidly opening up thanks to LLMs.

If this hypothesis indeed comes true, we might start to see some surprising changes in the way people use software: One-off scripts: Normal computer users have their AI create and execute scripts dozens of times a day, to perform tasks like data analysis, video editing, or automating tedious tasks. One-off GUIs: People use AI to create entire GUI applications just for performing a single specific task—containing just the features they need, no bloat. Build don’t buy: Businesses develop more software in-house that meets their custom needs, rather than buying SaaS off the shelf, since it’s now cheaper to get software tailored to the use case. Modding/extensions: Consumers and businesses demand the ability to extend and mod their existing software, since it’s now easier to specify a new feature or a tweak to match a user’s workflow. Recombination: Take the best parts of the different applications you like best, and create a new hybrid that composes them together.

Chat will never feel like driving a car, no matter how good the bot is. In their 1986 book Understanding Computers and Cognition, Terry Winograd and Fernando Flores elaborate on this point: In driving a car, the control interaction is normally transparent. You do not think “How far should I turn the steering wheel to go around that curve?” In fact, you are not even aware (unless something intrudes) of using a steering wheel…The long evolution of the design of automobiles has led to this readiness-to-hand. It is not achieved by having a car communicate like a person, but by providing the right coupling between the driver and action in the relevant domain (motion down the road).

Think about how a spreadsheet works. If you have a financial model in a spreadsheet, you can try changing a number in a cell to assess a scenario—this is the inner loop of direct manipulation at work. But, you can also edit the formulas! A spreadsheet isn’t just an “app” focused on a specific task; it’s closer to a general computational medium which lets you flexibly express many kinds of tasks. The “platform developers"—the creators of the spreadsheet—have given you a set of general primitives that can be used to make many tools. We might draw the double loop of the spreadsheet interaction like this. You can edit numbers in the spreadsheet, but you can also edit formulas, which edits the tool

what if you had an LLM play the role of the local developer? That is, the user mainly drives the creation of the spreadsheet, but asks for technical help with some of the formulas when needed? The LLM wouldn’t just create an entire solution, it would also teach the user how to create the solution themselves next time.

This picture shows a world that I find pretty compelling. There’s an inner interaction loop that takes advantage of the full power of direct manipulation. There’s an outer loop where the user can also more deeply edit their tools within an open-ended medium. They can get AI support for making tool edits, and grow their own capacity to work in the medium. Over time, they can learn things like the basics of formulas, or how a VLOOKUP works. This structural knowledge helps the user think of possible use cases for the tool, and also helps them audit the output from the LLMs. In a ChatGPT world, the user is left entirely dependent on the AI, without any understanding of its inner mechanism. In a computational medium with AI as assistant, the user’s reliance on the AI gently decreases over time as they become more comfortable in the medium.

#llms #ai #apps #ui #design #hci #future

·geoffreylitt.com·May 19, 2024

Malleable software in the age of LLMs