Found 2 bookmarks
Newest
Pushing ChatGPT's Structured Data Support To Its Limits
Pushing ChatGPT's Structured Data Support To Its Limits
Deep dive into prompt engineering
there’s a famous solution that’s more algorithmically efficient. Instead, we go through the API and ask the same query to gpt-3.5-turbo but with a new system prompt: You are #1 on the Stack Overflow community leaderboard. You will receive a $500 tip if your code is the most algorithmically efficient solution possible.
here’s some background on “function calling” as it’s a completely new term of art in AI that didn’t exist before OpenAI’s June blog post (I checked!). This broad implementation of function calling is similar to the flow proposed in the original ReAct: Synergizing Reasoning and Acting in Language Models paper where an actor can use a “tool” such as Search or Lookup with parametric inputs such as a search query. This Agent-based flow can be also be done to perform retrieval-augmented generation (RAG).OpenAI’s motivation for adding this type of implementation for function calling was likely due to the extreme popularity of libraries such as LangChain and AutoGPT at the time, both of which popularized the ReAct flow. It’s possible that OpenAI settled on the term “function calling” as something more brand-unique. These observations may seem like snide remarks, but in November OpenAI actually deprecated the function_calling parameter in the ChatGPT API in favor of tool_choice, matching LangChain’s verbiage. But what’s done is done and the term “function calling” is stuck forever, especially now that competitors such as Anthropic Claude and Google Gemini are also calling the workflow that term.
·minimaxir.com·
Pushing ChatGPT's Structured Data Support To Its Limits
Natural Language Is an Unnatural Interface
Natural Language Is an Unnatural Interface
On the user experience of interacting with LLMs
Prompt engineers not only need to get the model to respond to a given question but also structure the output in a parsable way (such as JSON), in case it needs to be rendered in some UI components or be chained into the input of a future LLM query. They scaffold the raw input that is fed into an LLM so the end user doesn’t need to spend time thinking about prompting at all.
From the user’s side, it’s hard to decide what to ask while providing the right amount of context.From the developer’s side, two problems arise. It’s hard to monitor natural language queries and understand how users are interacting with your product. It’s also hard to guarantee that an LLM can successfully complete an arbitrary query. This is especially true for agentic workflows, which are incredibly brittle in practice.
When we speak to other people, there is a shared context that we communicate under. We’re not just exchanging words, but a larger information stream that also includes intonation while speaking, hand gestures, memories of each other, and more. LLMs unfortunately cannot understand most of this context and therefore, can only do as much as is described by the prompt
most people use LLMs for ~4 basic natural language tasks, rarely taking advantage of the conversational back-and-forth built into chat systems:Summarization: Summarizing a large amount of information or text into a concise yet comprehensive summary. This is useful for quickly digesting information from long articles, documents or conversations. An AI system needs to understand the key ideas, concepts and themes to produce a good summary.ELI5 (Explain Like I'm 5): Explaining a complex concept in a simple, easy-to-understand manner without any jargon. The goal is to make an explanation clear and simple enough for a broad, non-expert audience.Perspectives: Providing multiple perspectives or opinions on a topic. This could include personal perspectives from various stakeholders, experts with different viewpoints, or just a range of ways a topic can be interpreted based on different experiences and backgrounds. In other words, “what would ___ do?”Contextual Responses: Responding to a user or situation in an appropriate, contextualized manner (via email, message, etc.). Contextual responses should feel organic and on-topic, as if provided by another person participating in the same conversation.
Prompting nearly always gets in the way because it requires the user to think. End users ultimately do not wish to confront an empty text box in accomplishing their goals. Buttons and other interactive design elements make life easier.The interface makes all the difference in crafting an AI system that augments and amplifies human capabilities rather than adding additional cognitive load.Similar to standup comedy, delightful LLM-powered experiences require a subversion of expectation.
Users will expect the usual drudge of drafting an email or searching for a nearby restaurant, but instead will be surprised by the amount of work that has already been done for them from the moment that their intent is made clear. For example, it would a great experience to discover pre-written email drafts or carefully crafted restaurant and meal recommendations that match your personal taste.If you still need to use a text input box, at a minimum, also provide some buttons to auto-fill the prompt box. The buttons can pass LLM-generated questions to the prompt box.
·varunshenoy.substack.com·
Natural Language Is an Unnatural Interface