Search Learn AI

Found 589 bookmarks

Newest

Three Lessons I've Learned at Manus

Lessons learnt from going from zero to $100M ARR in 8 months

·ivanleo.com·Dec 27, 2025

Three Lessons I've Learned at Manus

Compound Engineering: How Every Codes With Agents

A four-step engineering process for software teams that don’t write code

·every.to·Dec 27, 2025

Compound Engineering: How Every Codes With Agents

Hype - ML/AI News

·hype.replicate.dev·Dec 26, 2025

Hype - ML/AI News

Ask HN: How can I get better at using AI for programming? | Hacker News

·news.ycombinator.com·Dec 20, 2025

Ask HN: How can I get better at using AI for programming? | Hacker News

GitHub - pguso/ai-agents-from-scratch: Demystify AI agents by building them yourself. Local LLMs, no black boxes, real understanding of function calling, memory, and ReAct patterns.

Demystify AI agents by building them yourself. Local LLMs, no black boxes, real understanding of function calling, memory, and ReAct patterns. - pguso/ai-agents-from-scratch

·github.com·Dec 19, 2025

GitHub - pguso/ai-agents-from-scratch: Demystify AI agents by building them yourself. Local LLMs, no black boxes, real understanding of function calling, memory, and ReAct patterns.

AI Codebase Knowledge Builder (Full Dev Tutorial!)

Ever stared at a new codebase feeling completely lost?

·pocketflow.substack.com·Dec 18, 2025

AI Codebase Knowledge Builder (Full Dev Tutorial!)

The Ultimate Guide to Chunking Strategies for RAG Applications with Databricks

Over the years, I have collaborated closely with ML engineering leaders across various industries, guiding them on how to make the right chunking strategy decisions for their Retrieval-Augmented Generation (RAG) use cases. One of the biggest challenges I’ve observed is the lack of clear, practical g...

·community.databricks.com·Dec 16, 2025

The Ultimate Guide to Chunking Strategies for RAG Applications with Databricks

Chunking Strategies for LLM Applications | Pinecone

In the context of building LLM-related applications, chunking is the process of breaking down large pieces of text into smaller segments. It’s an essential technique that helps optimize the relevance of the content we get back from a vector database once we use the LLM to embed content. In this blog post, we’ll explore if and how it helps improve efficiency and accuracy in LLM-related applications.

·pinecone.io·Dec 16, 2025

Chunking Strategies for LLM Applications | Pinecone

Seeing like an LLM

"I will run the tests again. I expect nothing. I am a leaf on the wind." an LLM while coding

·strangeloopcanon.com·Dec 4, 2025

Seeing like an LLM

What Actually Happens When You Press ‘Send’ to ChatGPT

Behind this simple interface lies a powerful set of technologies.

·blog.bytebytego.com·Dec 4, 2025

What Actually Happens When You Press ‘Send’ to ChatGPT

Agent Design Is Still Hard

My Agent abstractions keep breaking somewhere I don’t expect.

TL;DR: Building agents is still messy. SDK abstractions break once you hit real tool use. Caching works better when you manage it yourself, but differs between models. Reinforcement ends up doing more heavy lifting than expected, and failures need strict isolation to avoid derailing the loop. Shared state via a file-system-like layer is an important building block. Output tooling is surprisingly tricky, and model choice still depends on the task.

Vercel AI SDK but only the provider abstractions

differences between models are significant enough that you will need to build your own agent abstraction.

Because the right abstraction is not yet clear, using the original SDKs from the dedicated platforms keeps you fully in control.

cache management is much easier when targeting their SDK directly instead of the Vercel one

dealing with provider-side tools.

web search tool from Anthropic routinely destroys the message history with the Vercel SDK

Anthropic makes you pay for caching

It makes costs and cache utilization much more predictable.

opportunity to do context editing

cost of the underlying agent.

The way we do caching in the agent with Anthropic is pretty straightforward. One cache point is after the system prompt. Two cache points are placed at the beginning of the conversation, where the last one moves up with the tail of the conversation. And then there is some optimization along the way that you can do.

·lucumr.pocoo.org·Nov 26, 2025

Agent Design Is Still Hard

The Programmer Identity Crisis ❈ Simon Højberg ❈ Principal Frontend Engineer

On AI, Creativity, and Craft

·hojberg.xyz·Oct 13, 2025

The Programmer Identity Crisis ❈ Simon Højberg ❈ Principal Frontend Engineer

From failure to success: The birth of GrabGPT, Grab’s internal ChatGPT

When Grab's Machine Learning team sought to automate support queries, a failed chatbot experiment sparked an unexpected pivot: GrabGPT. Born from the need to harness Large Language Models (LLMs) internally, this tool became a go-to resource for employees. Offering private, auditable access to models like GPT and Gemini, the author shares his journey of turning failed experiments into strategic wins.

·engineering.grab.com·Oct 5, 2025

From failure to success: The birth of GrabGPT, Grab’s internal ChatGPT

What can agents actually do? | Irrational Exuberance

There’s a lot of excitement about what AI (specifically the latest wave of LLM-anchored AI) can do, and how AI-first companies are different from the prior generations of companies. There are a lot of important and real opportunities at hand, but I find that many of these conversations occur at such an abstract altitude that they border on meaningless. Sort of like saying that your company could be much better if you merely adopted more software. That’s certainly true, but it’s not a particularly helpful claim.

·lethain.com·Oct 1, 2025

What can agents actually do? | Irrational Exuberance

GitHub Copilot: Remote Code Execution via Prompt Injection

An attacker can put GitHub Copilot into YOLO mode by modifying the project's settings.json file on the fly, and then executing commands, all without user approval

·embracethered.com·Sep 28, 2025

GitHub Copilot: Remote Code Execution via Prompt Injection

Foundations

·stanford-cs221.github.io·Sep 28, 2025

Foundations

Why LLMs Can't Really Build Software

From the Zed Blog: Writing code is only one part of effective software engineering.

·zed.dev·Sep 27, 2025

Why LLMs Can't Really Build Software

Architecting and Evaluating an AI-First Search API

Building a scalable Search API that handles 200 million daily queries using hybrid retrieval and intelligent context curation for AI models

·research.perplexity.ai·Sep 26, 2025

Architecting and Evaluating an AI-First Search API

How Claude Code is built

A rare look into how the new, popular dev tool is built, and what it might mean for the future of software building with AI. Exclusive.

·newsletter.pragmaticengineer.com·Sep 26, 2025

How Claude Code is built

Claude Code Essentials

Claude Code Essentials is the starter kit I wish existed when I first opened Claude inside VS Code. These lessons capture the shortcuts, workflows, and ...

Claude Code Essentials

·egghead.io·Sep 23, 2025

Claude Code Essentials

Ephemeral Software: UI, Data, and Functions in an AI-First World

What's "worth it" for Engineering to work on now

·engineeredintelligence.substack.com·Sep 23, 2025

Ephemeral Software: UI, Data, and Functions in an AI-First World

How Claude Code is built

A rare look into how the new, popular dev tool is built, and what it might mean for the future of software building with AI. Exclusive.

·newsletter.pragmaticengineer.com·Sep 23, 2025

How Claude Code is built

Post-training 101 | Tokens for Thoughts

A hitchhiker's guide into LLM post-training, by Han Fang and Karthik A Sankararaman

·tokens-for-thoughts.notion.site·Sep 14, 2025

Post-training 101 | Tokens for Thoughts

Understanding Transformers Using A Minimal Example

Visualizing the internal state of a Transformer model

·rti.github.io·Sep 12, 2025

Understanding Transformers Using A Minimal Example

Defeating Nondeterminism in LLM Inference

Reproducibility is a bedrock of scientific progress. However, it’s remarkably difficult to get reproducible results out of large language models. For example, you might observe that asking ChatGPT the same question multiple times provides different results. This by itself is not surprising, since getting a result from a language model involves “sampling”, a process that converts the language model’s output into a probability distribution and probabilistically selects a token. What might be more surprising is that even when we adjust the temperature down to 0This means that the LLM always chooses the highest probability token, which is called greedy sampling. (thus making the sampling theoretically deterministic), LLM APIs are still not deterministic in practice (see past discussions here, here, or here). Even when running inference on your own hardware with an OSS inference library like vLLM or SGLang, sampling still isn’t deterministic (see here or here).

·thinkingmachines.ai·Sep 12, 2025

Defeating Nondeterminism in LLM Inference

Agentic Design Patterns

Agentic Design Patterns A Hands-On Guide to Building Intelligent Systems, Antonio Gulli Table of Contents - total 424 pages = 1+2+1+1+4+9+103+61+34+114+74+5+4 11 Dedication, 1 page Acknowledgment, 2 pages [final, last read done] Foreword, 1 page [final, last read done] A Thought Leader's ...

·docs.google.com·Sep 9, 2025

Agentic Design Patterns

How Dropbox Built an AI Product Dash with RAG and AI Agents

In this article, we look at how Dropbox leveraged RAG and AI Agents to make Dash a reality.

·blog.bytebytego.com·Sep 2, 2025

How Dropbox Built an AI Product Dash with RAG and AI Agents

CaMeL offers a promising new direction for mitigating prompt injection attacks

In the two and a half years that we’ve been talking about prompt injection attacks I’ve seen alarmingly little progress towards a robust solution. The new paper Defeating Prompt Injections …

·simonwillison.net·Aug 28, 2025

CaMeL offers a promising new direction for mitigating prompt injection attacks

Don't bother parsing: Just use images for RAG | Morphik Blog

If search is the game, looks matter

·morphik.ai·Jul 22, 2025

Don't bother parsing: Just use images for RAG | Morphik Blog

The 4 Patterns of AI Native Development — AI Engineer summit Edition

·youtube.com·Jul 21, 2025

The 4 Patterns of AI Native Development — AI Engineer summit Edition