A detailed mapping between Anthropic’s architectural blueprint for production-ready multi-agent systems and SPINE’s implementation.
Reference Document: Multi-Agent Playbook (PDF)
The Multi-Agent Playbook is an architectural blueprint that outlines production-ready patterns for multi-agent AI systems. It addresses a fundamental challenge: how do you manage delegation, state, execution, and failure without creating chaos?
SPINE (v0.3.17) implements many of these patterns as a practical orchestration framework. This document maps each blueprint concept to its SPINE implementation.
The blueprint establishes a closed-loop system led by an Orchestrator:
“You do not prompt your sub-agents. You prompt your primary agent, and the primary agent prompts your sub-agents. They respond back to the primary agent, which synthesizes the information for you.”
This is the General Contractor Model:
| Blueprint Concept | SPINE Component | Location |
|---|---|---|
| Orchestrator | AgenticLoop | spine/orchestrator/loop.py |
| Sub-Agent spawning | fan_out() | spine/patterns/fan_out.py |
| Closed-loop reporting | ToolEnvelope result wrapping | spine/core/envelope.py |
| Master Plan | Context Stack scenarios | scenarios/*.yaml |
User
│
▼
┌─────────────────────────────────────────────┐
│ SPINE Orchestrator │
│ AgenticLoop + ToolEnvelope instrumentation │
└──────────────────┬──────────────────────────┘
│ fan_out() or pipeline()
┌───────────┼───────────┐
▼ ▼ ▼
┌───────┐ ┌───────┐ ┌───────┐
│Worker │ │Worker │ │Worker │
│Agent 1│ │Agent 2│ │Agent 3│
└───┬───┘ └───┬───┘ └───┬───┘
│ │ │
└───────────┼───────────┘
│ Results via ToolEnvelope
▼
┌─────────────────────────────────────────────┐
│ Synthesized Response to User │
└─────────────────────────────────────────────┘
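The closed loop in the diagram above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not SPINE's implementation: `SubAgentReport`, `run_sub_agent`, and `orchestrate` are hypothetical names, and the sub-agent is a stub rather than an LLM call.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubAgentReport:
    """What a sub-agent hands back to the orchestrator (hypothetical shape)."""
    agent: str
    summary: str


def run_sub_agent(agent: str, instruction: str) -> SubAgentReport:
    # Stub standing in for an LLM call; a real sub-agent would do work here.
    return SubAgentReport(agent=agent, summary=f"{agent} finished: {instruction}")


def orchestrate(user_request: str, agents: List[str]) -> str:
    """Prompt every sub-agent, collect the reports, and synthesize one reply."""
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        # map() preserves input order, so reports line up with `agents`.
        reports = list(pool.map(lambda a: run_sub_agent(a, user_request), agents))
    # Only the orchestrator answers the user; sub-agents never do.
    lines = [f"- {report.summary}" for report in reports]
    return "Synthesized response:\n" + "\n".join(lines)


response = orchestrate("review the login flow", ["worker-1", "worker-2", "worker-3"])
print(response)
```

The user only ever sees `response`. Each `SubAgentReport` stays inside `orchestrate`, which is the point of the General Contractor Model.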
Four essential components:
| Component | Role |
|---|---|
| Claude Opus 4.5 | Both Orchestrator and all Sub-Agents (uniform high-intelligence) |
| Claude Code (CLI) | The “agent harness” execution environment |
| Built-in Task Tool | Native tool for spawning sub-agents (parallel=true) |
| Markdown File | Immutable “master plan” with task-based lists |
SPINE Implementation:
| Blueprint Component | SPINE Equivalent | Notes |
|---|---|---|
| Claude Opus 4.5 | InstrumentedLLMClient | Multi-provider support (Anthropic, OpenAI, Gemini) |
| Claude Code | run_scenario.py + CLI | Entry point for orchestrated workflows |
| Task Tool | fan_out(), pipeline() | Custom parallel/sequential execution |
| Markdown File | Context Stacks + NEXT.md | 6-layer hierarchical context + task tracking |
SPINE extends the blueprint by supporting multiple LLM providers through a unified interface:
from spine.client import InstrumentedLLMClient
client = InstrumentedLLMClient(
provider="anthropic", # or "openai", "gemini"
model="claude-opus-4-5-20251101"
)
The blueprint identifies five architectural pillars that support robust multi-agent systems. Here’s how SPINE implements each:
Blueprint Principle: Agents communicate through closed loops and verifiable artifacts.
“Sub-agents must report exclusively to the Orchestrator, transmitting either high-level summaries or unaltered, verifiable artifacts. Direct user communication is forbidden.”
The Practice:
SPINE Implementation:
| Practice | SPINE Component | Details |
|---|---|---|
| Summary synthesis | ConflictResolver | Extracts consensus from multi-agent outputs |
| Artifact preservation | ToolEnvelope.result | Full response wrapped with metadata |
| Error signals | LoopVerdict.REJECT | Error state triggers routing |
| Test logs | spine/logging/json_logger.py | All calls logged with traces |
# Every LLM call produces a verifiable artifact
envelope = ToolEnvelope.create(
tool="anthropic:claude-sonnet-4-5",
arguments={"prompt": "..."}
)
# Result includes full metadata for verification
result = envelope.complete(
output="...",
status=ResultStatus.OK,
metrics=MCPMetrics(latency_ms=1200, tokens_in=500, tokens_out=800)
)
Payoff: The Orchestrator confirms task completion based on artifact presence and can route based on clear error signals.
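The routing described in the payoff can be sketched without any SPINE imports. This is an illustrative sketch only: `Status`, `Report`, and `route` are hypothetical names, and the three outcomes ("accept", "reject", "retry") are assumptions chosen for the example, not SPINE's verdict values.

```python
import tempfile
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
from typing import Optional


class Status(Enum):
    OK = "ok"
    ERROR = "error"


@dataclass(frozen=True)
class Report:
    """A sub-agent's report: a status signal plus an optional artifact path."""
    status: Status
    artifact: Optional[Path] = None


def route(report: Report) -> str:
    """Decide the next step from the error signal and artifact presence."""
    if report.status is Status.ERROR:
        return "retry"
    if report.artifact is None or not report.artifact.exists():
        # A success claim without a file on disk is not accepted.
        return "reject"
    return "accept"


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "test.log"
    log.write_text("3 passed\n")
    decisions = [
        route(Report(Status.OK, log)),                        # artifact present
        route(Report(Status.OK, Path(tmp) / "missing.log")),  # artifact absent
        route(Report(Status.ERROR)),                          # explicit error
    ]
print(decisions)
```

The orchestrator never inspects the sub-agent's reasoning here. It decides from the status and from whether the artifact exists.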
Blueprint Principle: Execution is a choice—parallel for speed, sequential for logic.
“The orchestration pattern must adapt to the task’s dependencies. Independent tasks should be parallelized for throughput; interdependent tasks must be sequential to ensure logical integrity.”
The Practice:
| Mode | When | Use Cases |
|---|---|---|
| Parallel | Independent tasks | 5 agents testing 5 user stories; simultaneous data aggregation |
| Sequential | Strict dependencies | Plan → Build → Host → Test workflow |
| Hybrid | Advanced workflows | Sequential master plan with parallel bursts |
SPINE Implementation:
# Parallel Execution - fan_out()
from spine.patterns import fan_out, FanOutTask
tasks = [
FanOutTask(user_message="Review security", component="security-reviewer"),
FanOutTask(user_message="Review style", component="style-reviewer"),
FanOutTask(user_message="Review logic", component="logic-reviewer"),
]
result = fan_out(parent_envelope, client, tasks, max_workers=5)
# Sequential Execution - pipeline()
from spine.patterns import pipeline, PipelineStep
steps = [
PipelineStep(name="plan", system_prompt="You are an architect."),
PipelineStep(name="build", transform=lambda prev: extract_spec(prev)),
PipelineStep(name="test", system_prompt="You are a tester."),
]
result = pipeline(parent_envelope, client, steps)
Hybrid Approach in SPINE:
┌─────────────────────────────────────────────────────────────┐
│ Sequential Master Plan │
│ │
│ ┌──────┐ ┌──────┐ ┌──────┐ ┌─────────────────┐ │
│ │ Plan │───▶│Build │───▶│ Host │───▶│ Test │ │
│ └──────┘ └──────┘ └──────┘ │ ┌───┐ ┌───┐ │ │
│ │ │ 1 │ │ 2 │ │ │
│ │ ├───┤ ├───┤ │ │
│ │ │ 3 │ │ 4 │ │ │
│ │ └───┘ └───┘ │ │
│ │ (parallel burst)│ │
│ └─────────────────┘ │
└─────────────────────────────────────────────────────────────┘
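The hybrid shape in the diagram, a sequential chain that ends in a parallel burst, can be approximated with `concurrent.futures`. The step functions below are hypothetical stubs that only build strings. The sketch shows the control flow, not SPINE's `pipeline()` or `fan_out()` internals.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def plan(request: str) -> str:
    return f"plan({request})"


def build(spec: str) -> str:
    return f"build({spec})"


def host(app: str) -> str:
    return f"host({app})"


def check_story(url: str, story: int) -> str:
    return f"story {story} passed on {url}"


def run_hybrid(request: str, stories: List[int]) -> List[str]:
    # Sequential phase: each step consumes the previous step's output.
    url = host(build(plan(request)))
    # Parallel burst: the story checks do not depend on one another.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda story: check_story(url, story), stories))


results = run_hybrid("todo app", [1, 2, 3, 4])
print(results[0])
```

The parallel burst starts only after `host` returns, so every check runs against the same hosted build.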
Blueprint Principle: Empower agents with tools, but isolate them in sandboxes.
“Instead of restricting an agent’s tools to ensure safety, provide the ‘right tooling’ to maximize capability and restrict the environment to contain risk.”
The Practice:
SPINE Implementation:
| Concept | SPINE Component | Details |
|---|---|---|
| Right Tooling | MCP server integration | browser-mcp, research-agent-mcp, etc. |
| Tool suites | InstrumentedLLMClient | Full provider capabilities |
| Isolation | TraceScope boundaries | Logical isolation via trace hierarchy |
| Token optimization | TokenOptimizedClient | 57-87% savings for 1-6 tools |
# Comprehensive tooling via MCP integration
from spine.integration import TokenOptimizedClient
if should_use_code_execution(tool_count=3):
    client = TokenOptimizedClient()
    result = client.execute_workflow(
        workflow_code="""
            const session = await clarifyQuestion({topic: "AI agents"});
            return await searchSources({research_id: session.id});
        """,
        required_tools=["clarify_question", "search_sources"]
    )
Current Gap: SPINE doesn’t yet implement per-agent E2B sandboxes. Sub-agents share the execution environment. This is identified as a future enhancement opportunity.
Blueprint Principle: State lives in the environment, not the agent’s memory.
“To achieve persistence and enable handoffs between agent sessions, you must preserve the work artifacts and the execution environment, not the agent’s conversational history.”
The Practice - Four Artifacts:
| Artifact | Description |
|---|---|
| Environment Artifact | The hosted sandbox (E2B) where the app runs |
| Instructional Artifact | The “Master Plan” markdown file |
| Result Artifact | Summary logs, named assets, error logs |
| Codebase Artifact | The source code itself (“brownfield state”) |
The Factory Shift Change: A new worker doesn’t read the outgoing worker’s mind. They look at the machinery, check the SOP, and read the logbook.
SPINE Implementation:
| Blueprint Artifact | SPINE Equivalent | Location |
|---|---|---|
| Master Plan | Context Stack scenarios | scenarios/*.yaml |
| Result Artifacts | Structured logs | logs/YYYY-MM-DD/*.json |
| Session state | NEXT.md integration | spine/orchestrator/next_integration.py |
| Codebase state | CLAUDE.md | Auto-generated by smart-inventory |
# Automatic session notes on loop completion
from spine.orchestrator import AgenticLoop, create_next_callbacks
callbacks = create_next_callbacks(project_path)
loop = AgenticLoop(
project_path=project_path,
task_queue=queue,
on_loop_end=callbacks.on_loop_end, # Appends session notes to NEXT.md
)
Payoff: Perfect recall and seamless multi-session handoffs. An agent can resume work on a “brownfield codebase” instantly without needing to “remember” what it did previously.
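The "factory shift change" idea can be illustrated with two files: a read-only plan and an append-only logbook. This is a sketch under assumptions: the file names, the JSON-lines logbook format, and the helpers `next_task` and `record_done` are invented for the example and are not how NEXT.md integration works.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def next_task(plan_file: Path, logbook: Path) -> Optional[str]:
    """Return the first planned task that has no entry in the logbook."""
    planned = [
        line[2:].strip()
        for line in plan_file.read_text().splitlines()
        if line.startswith("- ")
    ]
    done = set()
    if logbook.exists():
        for line in logbook.read_text().splitlines():
            done.add(json.loads(line)["task"])
    return next((task for task in planned if task not in done), None)


def record_done(logbook: Path, task: str) -> None:
    # Append-only: earlier entries are never rewritten.
    with logbook.open("a") as fh:
        fh.write(json.dumps({"task": task, "status": "ok"}) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    plan_file = Path(tmp) / "plan.md"
    logbook = Path(tmp) / "logbook.jsonl"
    plan_file.write_text("# Master Plan\n- plan\n- build\n- host\n- test\n")

    # Session 1 completes two tasks, then ends.
    record_done(logbook, "plan")
    record_done(logbook, "build")

    # Session 2 starts with no memory and reads the environment instead.
    resumed_at = next_task(plan_file, logbook)
print(resumed_at)
```

No conversational history crosses the session boundary. The second session finds its place from the plan and the logbook alone.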
Blueprint Principle: Resilience is architected through isolation and error routing.
“A single sub-agent failure must not derail the entire orchestration. The system must contain the ‘blast radius’ of a failure and have built-in logic for recovery.”
The Practice:
SPINE Implementation:
| Concept | SPINE Component | Details |
|---|---|---|
| Blast radius containment | OscillationTracker | Detects stuck states before runaway |
| Error routing | LoopVerdict | ACCEPT/REVISE/REJECT decision logic |
| Self-correcting loop | AgenticLoop retry logic | REVISE triggers retry with feedback |
| Failure detection | Pattern matching | A-B-A-B oscillation, repeated errors |
from spine.orchestrator import (
AgenticLoop, LoopVerdict, OscillationTracker
)
# Oscillation detection prevents infinite loops
tracker = OscillationTracker()
result = tracker.record_error(error_output, files_modified)
if result.oscillating:
    print(f"Oscillation detected: {result.reason}")
    # System escalates or stops
# Verdict system enables self-correction
# REVISE: Retry with feedback (up to max_revisions)
# REJECT: Skip task, escalate
# ACCEPT: Task complete, proceed
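To make the A-B-A-B pattern concrete, here is a minimal detector built on a fixed-size window. It is a standalone sketch, not `OscillationTracker`: the class name `ABABDetector`, its `record` method, and the use of plain strings as error signatures are all assumptions for illustration.

```python
from collections import deque


class ABABDetector:
    """Flags an A-B-A-B pattern in the four most recent error signatures."""

    def __init__(self) -> None:
        # Only the last four signatures matter; older ones fall off the left.
        self._recent = deque(maxlen=4)

    def record(self, signature: str) -> bool:
        self._recent.append(signature)
        if len(self._recent) < 4:
            return False
        a, b, c, d = self._recent
        # Two distinct states alternating: A, B, A, B.
        return a == c and b == d and a != b


detector = ABABDetector()
flags = [detector.record(sig) for sig in ["E1", "E2", "E1", "E2", "E3"]]
print(flags)
```

The fourth signature completes the alternation and raises the flag. The fifth breaks the pattern, so the flag clears.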
┌──────┐ ┌───────┐ ┌──────┐ ┌──────┐
│ Plan │───▶│ Build │───▶│ Host │───▶│ Test │
└──────┘ └───┬───┘ └──────┘ └──┬───┘
│ │
│ ┌──────────────┐ │
└────│ REVISE Loop │◀───┘
│ (on failure) │
└──────────────┘
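The REVISE loop in the diagram amounts to a bounded retry that feeds reviewer feedback into the next attempt. The sketch below assumes a three-valued verdict like the one described above. `Verdict`, `run_with_revisions`, and the stub `attempt` and `review` functions are hypothetical and do not reproduce `AgenticLoop`.

```python
from enum import Enum
from typing import Callable, Tuple


class Verdict(Enum):
    ACCEPT = "accept"
    REVISE = "revise"
    REJECT = "reject"


def run_with_revisions(
    attempt: Callable[[str], str],
    review: Callable[[str], Tuple[Verdict, str]],
    max_revisions: int = 2,
) -> Tuple[Verdict, int]:
    """Retry on REVISE with reviewer feedback; stop on ACCEPT or REJECT."""
    feedback = ""
    tries = 0
    for tries in range(1, max_revisions + 2):  # first try plus revisions
        output = attempt(feedback)
        verdict, feedback = review(output)
        if verdict is not Verdict.REVISE:
            return verdict, tries
    # Revision budget exhausted: escalate rather than loop forever.
    return Verdict.REJECT, tries


def attempt(feedback: str) -> str:
    # Stub builder: succeeds only once it has been told what to fix.
    return "fixed build" if feedback else "broken build"


def review(output: str) -> Tuple[Verdict, str]:
    if output == "fixed build":
        return Verdict.ACCEPT, ""
    return Verdict.REVISE, "tests failed: fix the build"


outcome = run_with_revisions(attempt, review)
print(outcome)
```

Capping the loop at `max_revisions` keeps a failing task from consuming the whole run. Once the budget is spent, the task is rejected and the orchestrator can move on.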
“The Orchestrator holds the signal, Sub-Agents absorb the noise.”
The primary strategy for preventing context window overflow is distributed computing. The token load is split across many distinct agents, with the Orchestrator acting as a compression filter.
| Agent Type | Context Type | Examples |
|---|---|---|
| Orchestrator | Executive Signal (Low Context) | Master Plan, operational metrics, synthesized outputs, error signals |
| Sub-Agents | Execution Noise (High Context) | Raw PDF text, full DOM, detailed logs, environment variables |
This is the CEO and Department Heads model: A CEO doesn’t read every employee email. They rely on department heads to manage the noise and report back only critical signals.
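A small sketch shows the compression effect: the sub-agent reads a large raw log and hands back only a compact digest. The function name `sub_agent_digest`, the digest fields, and the synthetic log are assumptions made for the example.

```python
def sub_agent_digest(raw_log: str) -> dict:
    """Read the full log (execution noise) and return a compact signal."""
    lines = raw_log.splitlines()
    errors = [line for line in lines if line.startswith("ERROR")]
    return {
        "lines_read": len(lines),
        "error_count": len(errors),
        "first_error": errors[0] if errors else None,
    }


# Synthetic stand-in for a large raw artifact that only the sub-agent loads.
raw_log = "\n".join(f"INFO step {i} ok" for i in range(1000)) + "\nERROR disk full\n"

digest = sub_agent_digest(raw_log)
# The orchestrator's context grows by the digest, not by the raw log.
print(len(raw_log), len(str(digest)))
```

The orchestrator keeps the error signal it needs to route on, while the thousand routine lines stay in the sub-agent's context.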
SPINE implements this through its Context Stack structure and Tiered Enforcement Protocol:
Context Stack (Orchestrator Signal)
├── global → operator, brand identity
├── character → speaker persona, target audience
├── command → task specification, success criteria
├── constraints → tone, format, do/don't rules
├── context → background info (compressed)
└── input → user request
Sub-Agent Context (Execution Noise)
├── Full document content
├── Raw API responses
├── Detailed logs
└── Environment state
Tiered Enforcement ensures appropriate context distribution:
| Tier | Context Strategy |
|---|---|
| Tier 1 | Direct handling (no distribution needed) |
| Tier 2 | Orchestrator + Sub-Agents with summary reporting |
| Tier 3 | Full distributed computing with artifact handoffs |
Four key metrics for agentic systems:
| Metric | Description |
|---|---|
| Review Velocity | How quickly can a human confirm the work is correct? |
| Tool Calls to Completion | Smarter models are cheaper (Opus in 5 calls beats Sonnet in 10) |
| Artifact Validation | Did the required artifact get created? (check filesystem, not chat) |
| Resource Observability | “Five Agent Summary” tracks tool uses and token usage |
SPINE Implementation:
| Blueprint Metric | SPINE Component | Details |
|---|---|---|
| Review Velocity | spine/review/ module | AI-powered code review with verdicts |
| Tool Calls | spine/logging/json_logger.py | Full call tracking with timing |
| Artifact Validation | File-based task completion | Check ai-coordination/tasks/completed/ |
| Observability | spine/reports/generator.py | HTML reports with Chart.js visualizations |
# Generate observability report
python -m spine.reports generate --title "Sprint Report" --days 7
# Output: spine_report_TIMESTAMP.html with:
# - Token usage breakdown
# - Cost estimates per provider
# - Trace tree visualization
# - Tool call frequency charts
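The Resource Observability metric can be approximated by rolling call records up per agent. The record shape and the sample values below are made up for illustration and do not reflect SPINE's log schema. `summarize` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical call records; all field names and numbers are sample values.
records = [
    {"agent": "worker-1", "tool": "read_file", "tokens_in": 500, "tokens_out": 120},
    {"agent": "worker-1", "tool": "run_tests", "tokens_in": 300, "tokens_out": 80},
    {"agent": "worker-2", "tool": "read_file", "tokens_in": 700, "tokens_out": 200},
]


def summarize(call_records: List[dict]) -> Dict[str, Dict[str, int]]:
    """Count tool calls and total tokens for each agent."""
    summary = defaultdict(lambda: {"tool_calls": 0, "tokens": 0})
    for rec in call_records:
        row = summary[rec["agent"]]
        row["tool_calls"] += 1
        row["tokens"] += rec["tokens_in"] + rec["tokens_out"]
    return dict(summary)


per_agent = summarize(records)
for agent, row in sorted(per_agent.items()):
    print(f"{agent}: {row['tool_calls']} tool calls, {row['tokens']} tokens")
```

A per-agent rollup like this also gives Tool Calls to Completion directly, since each agent's `tool_calls` count is tallied alongside its token usage.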
“The goal is not to build the application. It is to build the system that builds the application.”
This represents a fundamental shift in perspective:
The agent becomes the new compositional unit of engineering.
SPINE embodies this philosophy. It’s not a tool for building one application—it’s a backbone framework that enables building many applications through orchestrated multi-agent workflows.
Projects built using SPINE:
SPINE’s value proposition:
┌─────────────────────────────────────────────────────────────┐
│ SPINE │
│ (The system that builds the application) │
├─────────────────────────────────────────────────────────────┤
│ • ToolEnvelope instrumentation │
│ • TraceScope hierarchical tracing │
│ • fan_out() / pipeline() orchestration │
│ • AgenticLoop autonomous execution │
│ • Multi-provider LLM support │
│ • Structured logging & observability │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Applications Built │
│ • Golden Thread System │
│ • spine-dashboard │
│ • monitoring-solution │
│ • (your next project) │
└─────────────────────────────────────────┘
The blueprint identifies patterns that SPINE could further develop:
| Blueprint Pattern | SPINE Status | Enhancement Opportunity |
|---|---|---|
| E2B Sandbox per Agent | ❌ Not implemented | True blast radius containment via isolated containers |
| Five Agent Summary | 🟡 Partial | Standardized per-agent health metrics format |
| Built-in Task Tool | 🟡 Conceptual | Expose SPINE as MCP tool (spine.dispatch) |
| Live Link (Hosted App) | ❌ Not implemented | Auto-deploy for parallel testing |
| Uniform Model Stack | ✅ Implemented | InstrumentedLLMClient with multi-provider support |
| Blueprint Concept | SPINE Implementation |
|---|---|
| Orchestrator | AgenticLoop |
| Sub-Agent spawning | fan_out(), pipeline() |
| Closed-loop reporting | ToolEnvelope |
| Master Plan | Context Stack (scenarios/*.yaml) |
| Verifiable artifacts | Structured logs (logs/YYYY-MM-DD/) |
| Error routing | LoopVerdict (ACCEPT/REVISE/REJECT) |
| Oscillation detection | OscillationTracker |
| State handoffs | NEXT.md integration |
| Context compression | Context Stack layers |
| Observability | spine/reports/ + spine/api/ |
| Multi-provider | InstrumentedLLMClient |
KB/Multi-Agent-Playbook-Blueprint.pdf
Document created: 2025-12-30
SPINE Version: 0.3.17