Prompt Learning: Using English Feedback to Optimize LLM Systems
Applications of reinforcement learning (RL) in AI model building have been a growing topic over the past few months, from DeepSeek models incorporating RL mechanics into their training processes to real-world customer deployments, internal synthetic-data instruction-learning tests, and well-known benchmarks like BIG-Bench Hard.
Learn why agent infrastructure is essential to handling stateful, long-running tasks — and how LangGraph Platform provides the runtime support needed to build and scale reliable agents.
Context Engineering for AI Agents: Lessons from Building Manus
This post shares the local optima Manus arrived at through our own "SGD". If you're building your own AI agent, we hope these principles help you converge faster.
Discover the key differences between RAG and fine-tuning, what each approach can bring, and how to choose the right AI approach for your business goals.
Learn all about Reinforcement Learning (RL) and how to train your own DeepSeek-R1 reasoning model with Unsloth using GRPO. A complete guide from beginner to advanced.
A2A (Agent2Agent Protocol) and ACP (Agent Communication Protocol) represent two mainstream approaches to communication in multi-agent AI systems: cross-platform interoperability and local/edge autonomy, respectively. A2A, with its cross-vendor interconnection capabilities and rich task-collaboration mechanisms, has become the preferred choice for cloud-based and distributed multi-agent scenarios, while ACP's low-latency, local-first, cloud-independent design suits privacy-sensitive, bandwidth-constrained, or edge-computing environments. The two protocols differ in design focus, ecosystem building, and standardization governance, and may converge further on openness in the future; developers should choose the protocol stack that best fits their actual business needs.
Anthropic Academy: Claude API Development Guide
Learn to build applications with Claude's API. Find detailed documentation, integration guides, code examples, and best practices for developing with our AI capabilities.
GitHub - TencentQQGYLab/AppAgent: AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
LangGraph Rollout: Evolving VeRL’s Multi-Turn Capabilities for Agent RL
After completing our multi-turn tokenization and masking refactoring, we eliminated a critical bottleneck that was preventing us from building a more consistent and flexible rollout system for our Agent RL research. This breakthrough enabled us to implement a LangGraph-based rollout for VeRL in just a few days, which we’ve already successfully deployed in our Agent RL experiments. In this article, I’ll share our journey from VeRL’s native multi-turn implementation to our new LangGraph-based solution, explaining both the motivations driving this evolution and the technical details of our implementation.
Context Engineering Guide, by DAIR.AI Academy. Covers what context engineering is, context engineering as action, system prompt instructions, user input, structured inputs and outputs, tool calling, RAG & memory, state & historical context, and advanced context engineering resources.
TL;DR
Agents need context to perform tasks. Context engineering is the art and science of filling the context window with just the right information at each step of an agent's trajectory. In this post, we break down four common strategies for context engineering: write, select, compress, and isolate.
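Two of the strategies named above ("select" and "compress") can be sketched in a few lines. This is a toy illustration, not the post's implementation: the function names, the message-dict format, and the keyword-overlap relevance score are all assumptions chosen for brevity (a real agent would use embedding search and an LLM-written summary).

```python
def select_context(query: str, memories: list[str], k: int = 2) -> list[str]:
    """Select: pull only the k memories most relevant to the current query.
    Toy keyword-overlap scoring stands in for a real embedding search."""
    words = set(query.lower().split())
    scored = sorted(memories, key=lambda m: -len(words & set(m.lower().split())))
    return scored[:k]

def compress_history(messages: list[dict], keep_last: int = 4) -> list[dict]:
    """Compress: keep the system prompt plus only the most recent turns,
    replacing older turns with a single summary stub."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if len(rest) <= keep_last:
        return system + rest
    stub = {"role": "system",
            "content": f"[{len(rest) - keep_last} earlier turns summarized]"}
    return system + [stub] + rest[-keep_last:]
```

"Write" (persisting context outside the window, e.g. to a scratchpad or memory store) and "isolate" (splitting context across sub-agents) operate on the same principle: keep the window small and load only what the current step needs.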