# SPINE Architecture Overview
SPINE is a context engineering and multi-agent backbone framework built for long-running, complex software development workflows. It handles instrumentation, multi-provider LLM access, and orchestration patterns that tie agentic projects together.
The goal: coordinated multi-agent development with full traceability, cost tracking, and reproducible context stacks.
## Three Capability Layers
SPINE works across three layers:
| Layer | What | Why |
|---|---|---|
| 1. Host Agent | Built-in subagent types (Explore, Plan, code-architect, visual-tester) | Parallel agents without MCP overhead |
| 2. MCP Servers | browser-mcp, next-conductor, research-agent-mcp, smart-inventory | External tool integration |
| 3. SPINE Python | fan_out(), pipeline(), ToolEnvelope, TraceScope | Custom instrumented orchestration |
## Layer 1: Host Agent Subagents

These are available via `Task(subagent_type="...")`:

| Subagent | Tier | What it does |
|---|---|---|
| Explore | Fast | Codebase exploration, file discovery |
| Plan | Standard | Architecture planning, implementation design |
| research-coordinator | Flagship | Multi-source research with synthesis |
| code-architect | Standard | System design, architectural decisions |
| visual-tester | Fast | UI verification via browser automation |
| context-engineer | Standard | Context stack design and optimization |
| general-purpose | Default | Complex multi-step tasks |
Tier mapping is configurable: Flagship = Opus/GPT-5.1/Gemini Pro, Standard = Sonnet/GPT-5.1-mini/Gemini Flash, Fast = Haiku/GPT-5-nano/Gemini Flash.
Works with any compatible agent harness (e.g., Claude Code, custom CLI). Subagents can access conversation context and run in parallel.
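The tier-to-model mapping can be sketched as a small routing table. This is a hypothetical sketch: the lowercase model IDs and the `resolve_model` helper are illustrative, not SPINE's actual configuration format.

```python
# Hypothetical tier-to-model routing table following the mapping above.
# Model ID strings are illustrative placeholders.
TIER_MODELS = {
    "flagship": {"anthropic": "claude-opus-4-5", "openai": "gpt-5.1", "google": "gemini-3-pro"},
    "standard": {"anthropic": "claude-sonnet-4-5", "openai": "gpt-5.1-mini", "google": "gemini-3-flash"},
    "fast": {"anthropic": "claude-haiku-4-5", "openai": "gpt-5-nano", "google": "gemini-3-flash"},
}

def resolve_model(tier: str, provider: str = "anthropic") -> str:
    """Map a subagent tier to a concrete model for the chosen provider."""
    return TIER_MODELS[tier][provider]
```

Keeping the mapping in one table means a subagent declares only its tier; swapping providers is a one-argument change.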
## Layer 2: MCP Servers

| Server | Tools | What for |
|---|---|---|
| browser-mcp | navigate, screenshot, click, type | Visual UI testing |
| next-conductor | read_next, update_next, init_next | Task tracking |
| research-agent-mcp | clarify, search, evaluate, synthesize | Research workflows |
| research-notes-mcp | parse, cluster, contradictions | Note processing |
| research-log-mcp | create, log, cite | Citation management |
| smart-inventory | analyze_project | CLAUDE.md generation |
| minna-memory | store, recall, search, who | Persistent cross-session memory |
```mermaid
graph LR
    AH[Agent Harness] <-->|MCP Protocol| ET[External Tools]
    ET --- FS[File Systems]
    ET --- BR[Browsers / Playwright]
    ET --- TM[Task Management]
    ET --- RW[Research Workflows]
    style AH fill:#7c3aed,stroke:#a78bfa,color:#fff
    style ET fill:#0d9488,stroke:#5eead4,color:#fff
```
## Layer 3: SPINE Python

| Component | What it does |
|---|---|
| run_scenario.py | Execute reproducible context stack scenarios |
| fan_out() | Parallel subagent execution with aggregation |
| pipeline() | Sequential multi-step processing |
| ToolEnvelope | Instrumented LLM calls with trace correlation |
| TraceScope | Hierarchical context management |
| LogAggregator | Query and analyze execution logs |
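The two orchestration patterns can be sketched with plain concurrency primitives. These stand-ins only illustrate the shape of `fan_out()` and `pipeline()`; SPINE's actual signatures and instrumentation are not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable

def fan_out(task: Callable, inputs: Iterable, aggregate: Callable = list):
    """Run one task over many inputs in parallel, then aggregate the results."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(task, inputs)  # preserves input order
        return aggregate(results)

def pipeline(*steps: Callable):
    """Chain steps so each one receives the previous step's output."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run

# Usage: fan a "research" step out over three topics, and build a tiny pipeline.
summaries = fan_out(lambda t: f"notes on {t}", ["auth", "caching", "logging"])
shout = pipeline(str.strip, str.upper)
```

In SPINE these calls would additionally carry envelopes and trace IDs; the sketch shows only the control flow.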
## Directory Layout

```
spine/
├── _protocols/          # Usage protocols - read first
│   └── tiered-spine-usage.md
├── _templates/          # Templates for other projects
├── .claude/agents/      # Subagent definitions
├── KB/                  # Knowledge base
├── spine/               # Core Python package
│   ├── core/            # ToolEnvelope, TraceScope
│   ├── logging/         # Structured JSON logging
│   ├── client/          # Provider configs
│   └── patterns/        # fan_out, pipeline
├── tools/               # Applications built on SPINE
├── scripts/             # Scripts and tests
├── scenarios/           # Scenario configs (.yaml)
├── logs/                # Structured logs
└── run_scenario.py      # Main entry point
```
## Context Stacks

SPINE uses hierarchical context stacks for consistent LLM interactions.

| Layer | What it holds |
|---|---|
| global | System-level config - operator, brand identity |
| character | Agent persona and target audience |
| command | Task specification and success criteria |
| constraints | Tone, format, do/don't rules |
| context | Background info and references |
| input | The actual user request |
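One way to picture stack assembly is concatenating the layers in a fixed order, outermost first. This rendering is a sketch; the `[layer]` delimiters and `render_stack` helper are illustrative, not SPINE's actual serialization.

```python
# Layer names follow the table above; the join format is an assumption.
STACK_ORDER = ["global", "character", "command", "constraints", "context", "input"]

def render_stack(stack: dict) -> str:
    """Emit layers outermost-first, skipping any that are unset."""
    parts = [f"[{layer}]\n{stack[layer]}" for layer in STACK_ORDER if layer in stack]
    return "\n\n".join(parts)

prompt = render_stack({
    "global": "Operator: ACME",
    "command": "Summarize the diff",
    "input": "diff --git a/app.py ...",
})
```

Fixing the order is what makes stacks reproducible: the same layer values always produce the same prompt.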
## Core Components

### ToolEnvelope
Every LLM call gets wrapped in a ToolEnvelope with:
- `id` - unique correlation ID
- `tool` - provider and model (e.g., `anthropic:claude-opus-4-5`)
- `trace` - hierarchical linking (root_id → parent_id → span_id)
- `metadata` - tags, sandbox profile, retry policy
```python
envelope = create_envelope(tool="claude-opus", prompt="...")
child = envelope.create_child(tool="claude-sonnet")  # auto-linked
```
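The trace linking can be sketched with a small dataclass. Field names mirror the bullets above, but this `Envelope` class and its ID scheme are assumptions, not SPINE's implementation.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Envelope:
    """Illustrative envelope: every call carries a (root, parent, span) triple."""
    tool: str
    root_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    parent_id: Optional[str] = None
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def create_child(self, tool: str) -> "Envelope":
        # Children share the root and point back at this envelope's span.
        return Envelope(tool=tool, root_id=self.root_id, parent_id=self.span_id)

root = Envelope(tool="claude-opus")
child = root.create_child(tool="claude-sonnet")
```

Sharing `root_id` is what lets the log aggregator reassemble a whole session from individual call records.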
### TraceScope
Automatic trace propagation across agent hierarchies:
```python
with TraceScope("orchestrator"):
    # calls inherit this trace
    with TraceScope("subagent-research"):
        # linked as child of orchestrator
        ...
```
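The propagation mechanism can be sketched with `contextvars`. This `TraceScope` is a stand-in built for illustration; SPINE's real class carries trace IDs rather than a bare name path.

```python
import contextvars

# Context variable holding the current scope path; assumed structure.
_trace_path = contextvars.ContextVar("trace_path", default=())

class TraceScope:
    """Stand-in: nested scopes build a path that enclosed calls inherit."""
    def __init__(self, name: str):
        self.name = name
    def __enter__(self):
        self._token = _trace_path.set(_trace_path.get() + (self.name,))
        return self
    def __exit__(self, *exc):
        _trace_path.reset(self._token)  # restore the outer scope on exit
        return False

def current_trace() -> tuple:
    return _trace_path.get()

with TraceScope("orchestrator"):
    with TraceScope("subagent-research"):
        path = current_trace()
```

Because the token is reset on exit, sibling scopes never see each other's names, which is exactly the parent/child shape in the trace hierarchy below.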
## Multi-Provider Support

SPINE abstracts away provider differences:

| Provider | Models | Capabilities |
|---|---|---|
| Anthropic | Claude Opus 4.5, Sonnet 4.5, Haiku 4.5 | Full tool use, vision |
| Google | Gemini 3 Pro, Gemini 3 Flash | Tool use, vision |
| OpenAI | GPT-5.1, GPT-5 mini | Tool use, vision |
| xAI | Grok 4.1 | Tool use |
## Observability

Logs go to `./logs/YYYY-MM-DD/*.json` with:
- Full envelope with trace hierarchy
- Token usage and cost estimates
- Timing (started_at, finished_at, duration_ms)
- Experiment tracking IDs
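A log record with these fields might be assembled as below. The exact field names and the date-sharded path are taken from the list above; everything else (the helper, the example values) is illustrative.

```python
import json
from datetime import datetime, timezone

def build_log_record(envelope_id: str, tokens_in: int, tokens_out: int,
                     started_at: float, finished_at: float) -> dict:
    """Assemble a structured record with usage and timing fields."""
    return {
        "id": envelope_id,
        "usage": {"input_tokens": tokens_in, "output_tokens": tokens_out},
        "started_at": started_at,
        "finished_at": finished_at,
        "duration_ms": round((finished_at - started_at) * 1000),
    }

# Date-sharded path matching the ./logs/YYYY-MM-DD/*.json layout.
day = datetime.now(timezone.utc).strftime("%Y-%m-%d")
path = f"./logs/{day}/trace.json"
record = build_log_record("env-001", 1200, 300, 10.0, 10.75)
line = json.dumps(record)
```

Sharding by day keeps each directory small and makes "query yesterday's runs" a directory listing rather than a scan.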
### Trace Hierarchy

```mermaid
graph TB
    R["root_id: session-abc123"] --> O1["parent_id: orchestrator-001"]
    R --> O2["parent_id: orchestrator-002"]
    O1 --> S1["span_id: subagent-research-1"]
    O1 --> S2["span_id: subagent-research-2"]
    O1 --> S3["span_id: subagent-research-3"]
    O2 --> S4["span_id: synthesis-001"]
    style R fill:#2563eb,stroke:#93c5fd,color:#fff
    style O1 fill:#7c3aed,stroke:#a78bfa,color:#fff
    style O2 fill:#7c3aed,stroke:#a78bfa,color:#fff
    style S1 fill:#0d9488,stroke:#5eead4,color:#fff
    style S2 fill:#0d9488,stroke:#5eead4,color:#fff
    style S3 fill:#0d9488,stroke:#5eead4,color:#fff
    style S4 fill:#0d9488,stroke:#5eead4,color:#fff
```
## Memory System (v0.3.29)

SPINE provides a 5-tier memory architecture unified by MemoryFacade:

| Tier | Component | Scope | Backend |
|---|---|---|---|
| 1 | KVStore | Namespace-scoped key-value | SQLite / File |
| 2 | Scratchpad | Short-term task notes | In-memory |
| 3 | EphemeralMemory | Session-scoped with decay | In-memory |
| 4 | VectorStore | Hybrid semantic + keyword search | LanceDB + keyword |
| 5 | EpisodicMemory | Goal-based episode recall | SQLite + FTS5 |
MemoryFacade provides unified search across all tiers with score normalization. VerdictRouter routes AgenticLoop accept/reject/revise decisions to the appropriate tier.
Persistence backends: SQLitePersistence and FilePersistence.
Embedding providers (7): Local/SentenceTransformers, OpenAI, Voyage AI, ONNX Runtime, Gemini, Keyword fallback, Placeholder.
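Cross-tier score normalization might look like min-max scaling per tier before merging, so that raw cosine similarities and FTS5-style rank scores become comparable. This is a sketch of the idea, not MemoryFacade's actual algorithm.

```python
def normalize(hits: list) -> list:
    """Min-max scale one tier's raw (doc, score) hits into [0, 1]."""
    if not hits:
        return []
    scores = [s for _, s in hits]
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid division by zero for single-hit tiers
    return [(doc, (s - lo) / span) for doc, s in hits]

def unified_search(per_tier: dict, k: int = 3) -> list:
    """Merge all tiers on a shared scale and return the top-k hits."""
    merged = [(doc, score, tier)
              for tier, hits in per_tier.items()
              for doc, score in normalize(hits)]
    return sorted(merged, key=lambda x: x[1], reverse=True)[:k]

top = unified_search({
    "vector":   [("doc-a", 0.91), ("doc-b", 0.42)],   # cosine-like scores
    "episodic": [("ep-7", 12.0), ("ep-3", 4.0)],      # raw FTS5-style scores
})
```

Without normalization the episodic tier's larger raw numbers would dominate every merged ranking, regardless of relevance.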
## OODA Loop (v0.3.29)
Agent OS 2026 introduces an OODA-based execution loop that composes existing SPINE components into a structured cognition cycle:
- LoopContext tracks phase, iteration count, and cycle history
- WorldState provides a unified facade over environment data; WorldSnapshot captures immutable point-in-time state
- Outcome is the canonical result schema from any action
- OscillationTracker detects stuck states during Reflect
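The cycle and the stuck-state check can be sketched as below. The repeated-outcome heuristic and the `ooda_cycle` driver are illustrative assumptions; SPINE's OscillationTracker and LoopContext are richer than this.

```python
from collections import deque

class OscillationTracker:
    """Flags a stuck loop when the last few outcomes are all identical."""
    def __init__(self, window: int = 4):
        self.recent = deque(maxlen=window)
    def observe(self, outcome: str) -> bool:
        self.recent.append(outcome)
        # Stuck once the window is full and contains only one distinct outcome.
        return len(self.recent) == self.recent.maxlen and len(set(self.recent)) <= 1

def ooda_cycle(observe, orient, decide, act, max_iters: int = 10) -> list:
    """Run Observe -> Orient -> Decide -> Act until done, stuck, or out of budget."""
    tracker = OscillationTracker()
    history = []  # stands in for LoopContext's cycle history
    for _ in range(max_iters):
        state = observe()
        plan = decide(orient(state))
        outcome = act(plan)
        history.append(outcome)
        if outcome == "done" or tracker.observe(outcome):
            break
    return history

# Usage: a toy loop that succeeds on the third observation.
counter = {"n": 0}
def sense():
    counter["n"] += 1
    return counter["n"]
history = ooda_cycle(sense, orient=lambda s: s, decide=lambda s: s,
                     act=lambda p: "done" if p >= 3 else "retry")
```

The oscillation check is what bounds a revise loop that keeps producing the same rejected outcome.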
## System Scale (IE Cypher Metrics)
| Metric | Value |
|---|---|
| Total nodes | 3,131 |
| Total edges | 6,615 |
| Subsystems | 15 |
| Modules | 177 |
### Hub Classes (highest fan-in)

| Class | Fan-in |
|---|---|
| ContentPipelineExecutor | 68 |
| MCPSessionPool | 50 |
| Task | 47 |
## Related Docs
- Tiered Enforcement Protocol - when to use each capability level
- Pattern Guide - fan-out and pipeline usage
- Agent OS 2026 - OODA loop, episodic memory, agent processes
- Memory System - 5-tier memory architecture
- Minna Memory Integration - persistent cross-session memory