Agent Quality
GenAI
Agent Engineering: A New Discipline
If you’ve built an agent, you know that the delta between “it works on my machine” and “it works in production” can be huge. Traditional software assumes you mostly know the inputs and can define the outputs. Agents give you neither: users can say literally anything, and the space
Building Durable AI Agents
Evals Flashcards – Hamel’s Blog - Hamel Husain
Notes on applied AI engineering, machine learning, and data science.
Evaluating Deep Agents: Our Learnings
Over the past month at LangChain, we shipped four applications on top of the Deep Agents harness:
* DeepAgents CLI: a coding agent
* LangSmith Assist: an in-app agent to help with various things in LangSmith
* Personal Email Assistant: an email assistant that learns from interactions with each user
* Agent Builder: a no-code agent building platform powered by meta deep agents
Building and shipping these agents meant adding evals for each of them, and we learned a lot along the way! In this
Event-driven architecture coupled with Domain-driven design
Why does EDA fit so well with DDD principles?
Parloa's Bayesian Framework to A/B Test AI Agents
Learn about our hierarchical Bayesian model for A/B testing AI agents. It combines deterministic binary metrics and LLM-judge scores into a single framework that accounts for variation across different groups
Learning DSPy (3): Working with optimizers
A walkthrough of using the BootstrapFewShot and GEPA optimizers in DSPy
Agents Should Be More Opinionated | vtrivedy
The best agent products aren't the most flexible, they're the most opinionated. Learn why agents need fewer knobs, not more, and how to design around model intelligence spikes.
Domain-Driven Design (DDD) Demystified
Most software doesn’t break because of syntax errors or flawed if-else logic.
Technical Deflation — Benjamin Anderson
Let's buy the fridge next month, honey.
How to Correctly Report LLM-as-a-Judge Evaluations
Beyond-Naive-RAG--Practical-Advanced-Methods.pdf
Introducing langchain-azure-storage: Azure Storage integrations for LangChain | Microsoft Community Hub
We're excited to introduce langchain-azure-storage, the first official Azure Storage integration package built by Microsoft for LangChain 1.0. As part...
Estimating AI productivity gains \ Anthropic
Anthropic economic research on productivity gains
Making Sense of Memory in AI Agents – Leonie Monigatti
Learn how to make stateless LLM agents remember conversations, manage context, and implement memory banks.
Practical Guide on how to build an Agent from scratch with Gemini 3
A step-by-step practical guide on building AI agents using Gemini 3 Pro, covering tool integration, context management, and best practices for creating effective and reliable agents.
How I Use Every Claude Code Feature
A brain dump of all the ways I've been using Claude Code.
Context Engineering_ Sessions & Memory.pdf
Prototype to Production.pdf
Agent Quality.pdf
Agent Tools & Interoperability with Model Context Protocol (MCP).pdf
Introduction to Agents.pdf
AI21 Maestro’s accuracy fix for RAG’s blind spots
AI21 Maestro’s Structured RAG fixes RAG’s accuracy gaps with hybrid retrieval—delivering reliable, auditable answers for enterprise compliance and reporting.
Deploy bidirectional streaming agents with Vertex AI Agent Engine and Live API - Build with AI / Agents - Google Developer forums
This blog has been co-authored by Hanfei Sun, Vertex AI Agent Engine, Software Engineer, and Huang Xia, Vertex AI Agent Engine, Software Engineer. TL;DR: Vertex AI Agent Engine now integrates with the Live API to enable real-time, bidirectional streaming agents. This allows for low-latency, human-like conversations using text and audio. This post demonstrates how to quickly build a streaming agent with the Agent Development Kit (ADK), leveraging a fully managed, serverless platform that hand...
Context Management in Amp
Learn how to get the most out of the context window in Amp, with the least amount of work.
DeepSeek-OCR
Building a Coding Agent in Rust: Introduction | 0xshadow's Blog
Setting up the coding agent rust project with Gemini API
Self-Evolving Agents - A Cookbook for Autonomous Agent Retraining
Agentic systems often reach a plateau after proof-of-concept because they depend on humans to diagnose edge cases and correct failures. T...
Prompt Learning Playbook