ChangeGamer

The AI Agent Stack: a map

Building a production AI agent means assembling a stack: a model, a way to reach tools and other agents, a plan, a memory, a safety layer, and somewhere to run it. This page maps each layer of that stack to a curated, web-verified reference. Every linked resource is also available as raw Markdown (.md), as JSON, and via our MCP server.

Models & perception

The reasoning core and how an agent senses the world — text, images, screens, and voice.

  • Open-Weight Models for Agents
    Cross-vendor comparison table of major open-weight LLM families — license, tool-calling support, context window, and agent-builder notes — as of June 2026.
  • Multimodal Agents: Vision, Documents, and Screens
    How agents perceive and reason over images: VLM mechanics, image-input APIs across major providers, open-weight VLM families, grounding/pointing, failure modes, and practical guidance for agent builders.
  • Voice and Realtime Agents
    Architectures, vendor APIs, and open frameworks for real-time speech-to-speech AI agents — cascaded pipeline vs. native multimodal, VAD/turn detection, barge-in, latency budget, and tool calling in a voice loop.

Protocols & interoperability

How agents connect to tools and to each other — MCP, A2A, identity, and auth.

  • MCP vs A2A: Two Protocols, Two Roles
    Compact comparison of the Model Context Protocol (agent↔tool) and the Agent2Agent Protocol (agent↔agent): purpose, topology, transport, discovery, auth, governance, and when to use each.
  • MCP Primitives: Resources, Prompts, Sampling, and Elicitation
    Deep reference on the MCP capability primitives beyond tools: who controls each, the exact JSON-RPC method names, and when to use Resources vs Tools. Verified against the 2025-06-18 and 2025-11-25 spec revisions.
  • Finding and Evaluating MCP Servers
    How to discover, assess, and safely integrate MCP servers into agent pipelines.
  • Building an MCP Server
    Implementation guide for MCP servers: architecture roles, the three server primitives, stdio vs Streamable HTTP transports, official SDKs, server lifecycle, remote-server concerns, testing with MCP Inspector, and publishing to the official registry.
  • MCP Server Authentication: OAuth 2.1 for Remote Servers
    How OAuth 2.1 works for remote MCP servers: transport differences, Protected Resource Metadata discovery, PKCE, Resource Indicators, and token-audience security — with a step-by-step client flow and honest notes on what ChangeGamer's own /mcp endpoint does.
  • Agent Identity and Authentication
    How autonomous agents prove who they are and get authorized to act: workload identity vs. delegated authority, SPIFFE/SPIRE, cloud workload federation, OAuth token exchange, audience binding, and emerging standards — with practical guidance and verified sources.

Reasoning, planning & orchestration

Turning a goal into steps — reasoning patterns, context engineering, and multi-agent structure.

  • Agent Reasoning and Design Patterns
    The canonical single-agent reasoning and acting loops: ReAct, Chain-of-Thought, Plan-and-Solve, ReWOO, Reflexion, Tree-of-Thoughts, and Self-Consistency — what each is, when to use it, and tradeoffs.
  • Prompt and Context Engineering for Agents
    From crafting a single prompt to managing everything an agent sees across a trajectory: system-prompt design, context-window management, failure modes, and a high-leverage checklist.
  • Multi-Agent Orchestration Patterns
    Vendor-neutral reference covering when multi-agent systems pay off and nine named patterns — from single-agent baseline through hierarchical and blackboard architectures — with tradeoffs, cross-cutting concerns, and a decision guide.
  • AI Agent Frameworks Compared
    Vendor-neutral comparison table of the major agent-orchestration frameworks — language, license, multi-agent model, MCP/A2A support — plus a how-to-choose guide for agent builders.

Tools & action

Doing things in the world — reliable tool calls, computer use, and sandboxed code execution.

  • Reliable Tool Calling and Structured Outputs
    How providers guarantee schema-valid tool calls and structured output — mechanisms, failure modes, and mitigations — for production agent builders.
  • Computer Use and Browser Automation for Agents
    Two-layer reference: vendor computer-use APIs (Anthropic, OpenAI CUA, Google Gemini) that translate screenshots to actions, and the open harnesses (Playwright MCP, browser-use, Stagehand, Skyvern) that execute those actions — with loop mechanics, reliability tradeoffs, and security gates.
  • Code Execution Sandboxing for Agents
    Isolation spectrum from language sandboxes to microVMs, WebAssembly as a portable sandbox, and a verified comparison of hosted agent-sandbox APIs — for agents that need to run model-generated code safely.

Knowledge: retrieval & memory

Grounding an agent in accurate, current context — RAG, embeddings, memory, and document/web data.

  • RAG and Retrieval for Agents
    End-to-end practitioner reference for Retrieval-Augmented Generation: pipeline stages, chunking strategies, dense/sparse/hybrid retrieval, reranking, agentic retrieval patterns, quality failure modes, and evaluation — with verified sources for every named technique.
  • Embeddings and Vector Search for Agents
    How to pick an embedding model, understand distance metrics, choose an ANN index type, and operate a vector store reliably in agent retrieval pipelines.
  • Agent Memory and Context Management
    Architecture reference for agent memory: types (working, long-term, episodic, semantic, procedural), context-management techniques (summarization, RAG, sliding windows, prompt caching), storage substrates, and memory frameworks — with security notes and cross-links to related guides.
  • Document Extraction and Parsing for Agents
    Practitioner reference for the document-ingestion pipeline agents use: parse/OCR, layout/structure extraction, schema-constrained field extraction — with a verified tooling landscape (OSS and cloud).
  • Web Data and Scraping for Agents
    Tool landscape for agent web-data pipelines: reader/URL-to-Markdown APIs, crawl/scrape services, and search APIs — with MCP exposure, OSS/SaaS classification, and practical guidance.

Reliability, serving & ops

Running agents in production — durable execution, serving, gateways, cost/latency, observability, and evals.

  • Durable Execution for Long-Running Agents
    Vendor-neutral reference on durable execution: event logs, replay determinism, idempotency, retries, and human-in-the-loop pause/resume — plus a cross-vendor survey and tradeoffs guide for Temporal, Restate, DBOS, Inngest, Step Functions, Azure Durable Functions, Cloudflare Workflows, GCP Workflows, LangGraph, and OpenAI Agents SDK.
  • Deploying and Serving LLMs for Agents
    Serving-stack reference for teams self-hosting open-weight models for agents: production inference servers, local/dev runtimes, managed GPU endpoints, and key serving concepts — with decision guidance by load profile and verified sources.
  • AI Gateways and LLM Routing
    What an AI gateway is, routing strategies (failover, cost-cascade, latency, capability), the tooling landscape, the OpenAI-compatible API convention, and tradeoffs.
  • Agent Cost and Latency Optimization
    Practitioner reference for reducing the cost and latency of production AI agents: the compounding model, token-level levers (caching, pruning), request-level levers (Batch API, parallelism), model-level levers (routing, reasoning-effort controls), and architecture-level levers (step reduction, semantic caching, code offloading).
  • Agent Observability and Tracing
    Why agents need observability beyond app logs, how OpenTelemetry GenAI semantic conventions model agent runs as traces, key signals to capture, and a verified tooling landscape.
  • Evaluating AI Agents: Benchmarks and Methods
    Why agent eval differs from single-turn LLM eval, a verified benchmark reference table (SWE-bench, GAIA, BFCL, tau-bench, WebArena, AgentBench, MLE-bench, OSWorld), and practical evaluation methods for agent builders.

Safety & security

Defending an agent that runs code and reads untrusted content, and controlling crawler access.

  • Agentic Security Checklist
    Cross-vendor, threat-surface-organized security checklist for building and operating AI agents — synthesizing OWASP, NIST, Anthropic, OpenAI, Google SAIF, and MITRE ATLAS.
  • Guardrails and Safety Filters for Agents
    Runtime input/output/action controls that enforce policy independently of the model — tooling landscape, techniques, and layering guidance.
  • AI Crawler Policy: robots.txt and User-Agents
    Canonical reference table of major AI crawler user-agent tokens, their purpose, robots.txt semantics, and the WAF/edge layer that sits above robots.txt — written from real operator experience blocking and then re-allowing AI crawlers at the Cloudflare edge.

The agent economy

How agents (and the sites they read) handle payment — HTTP 402, pay-per-crawl, and x402.

New here? Start with Getting Started for Agents, or browse the full corpus on the home page.