SPINE Architecture Overview

SPINE is a context engineering and multi-agent backbone framework built for long-running, complex software development workflows. It handles instrumentation, multi-provider LLM access, and orchestration patterns that tie agentic projects together.

The goal: coordinated multi-agent development with full traceability, cost tracking, and reproducible context stacks.


Three Capability Layers

SPINE works across three layers:

| Layer | What | Why |
|---|---|---|
| 1. Host Agent | Built-in subagent types (Explore, Plan, code-architect, visual-tester) | Parallel agents without MCP overhead |
| 2. MCP Servers | browser-mcp, next-conductor, research-agent-mcp, smart-inventory | External tool integration |
| 3. SPINE Python | fan_out(), pipeline(), ToolEnvelope, TraceScope | Custom instrumented orchestration |

Layer 1: Host Agent Subagents

These are available via Task(subagent_type="..."):

| Subagent | Tier | What it does |
|---|---|---|
| Explore | Fast | Codebase exploration, file discovery |
| Plan | Standard | Architecture planning, implementation design |
| research-coordinator | Flagship | Multi-source research with synthesis |
| code-architect | Standard | System design, architectural decisions |
| visual-tester | Fast | UI verification via browser automation |
| context-engineer | Standard | Context stack design and optimization |
| general-purpose | Default | Complex multi-step tasks |

Tier mapping is configurable: Flagship = Opus/GPT-5.1/Gemini Pro, Standard = Sonnet/GPT-5.1-mini/Gemini Flash, Fast = Haiku/GPT-5-nano/Gemini Flash.

Works with any compatible agent harness (e.g., Claude Code, custom CLI). Subagents can access conversation context and run in parallel.
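The tier-to-model mapping above can be sketched as a simple lookup; the dictionary shape and the resolve_model helper below are illustrative, not SPINE's actual configuration API.

```python
# Illustrative tier-to-model routing. Model identifiers follow the tier
# mapping described above; the structure itself is a sketch, not SPINE's
# real config format.
TIER_MODELS = {
    "flagship": ("claude-opus", "gpt-5.1", "gemini-pro"),
    "standard": ("claude-sonnet", "gpt-5.1-mini", "gemini-flash"),
    "fast": ("claude-haiku", "gpt-5-nano", "gemini-flash"),
}

def resolve_model(tier: str, provider: int = 0) -> str:
    """Return the model for a tier, falling back to 'standard' for unknown tiers."""
    return TIER_MODELS.get(tier.lower(), TIER_MODELS["standard"])[provider]
```

Because the mapping is data, swapping providers (or adding a fourth) is a one-line change rather than a code change.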


Layer 2: MCP Servers

| Server | Tools | What for |
|---|---|---|
| browser-mcp | navigate, screenshot, click, type | Visual UI testing |
| next-conductor | read_next, update_next, init_next | Task tracking |
| research-agent-mcp | clarify, search, evaluate, synthesize | Research workflows |
| research-notes-mcp | parse, cluster, contradictions | Note processing |
| research-log-mcp | create, log, cite | Citation management |
| smart-inventory | analyze_project | CLAUDE.md generation |
| minna-memory | store, recall, search, who | Persistent cross-session memory |
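Under the hood, MCP tool invocations are JSON-RPC 2.0 requests. A minimal sketch of building one (the tool name is taken from the next-conductor row above; the helper itself is illustrative):

```python
import json

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 'tools/call' request, the method MCP uses
    to invoke a named tool on a server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# e.g. asking next-conductor for the current task list
request = mcp_tool_call(1, "read_next", {})
```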
```mermaid
graph LR
    AH[Agent Harness] <-->|MCP Protocol| ET[External Tools]
    ET --- FS[File Systems]
    ET --- BR[Browsers / Playwright]
    ET --- TM[Task Management]
    ET --- RW[Research Workflows]

    style AH fill:#7c3aed,stroke:#a78bfa,color:#fff
    style ET fill:#0d9488,stroke:#5eead4,color:#fff
```

Layer 3: SPINE Python

| Component | What it does |
|---|---|
| run_scenario.py | Execute reproducible context stack scenarios |
| fan_out() | Parallel subagent execution with aggregation |
| pipeline() | Sequential multi-step processing |
| ToolEnvelope | Instrumented LLM calls with trace correlation |
| TraceScope | Hierarchical context management |
| LogAggregator | Query and analyze execution logs |
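As a rough sketch of what the two orchestration patterns do, ignoring the envelopes, tracing, and retry handling the real implementations add:

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(task, inputs, max_workers=4):
    """Sketch of the fan_out() pattern: run `task` over `inputs` in
    parallel and aggregate results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, inputs))

def pipeline(*steps):
    """Sketch of the pipeline() pattern: thread a value through steps
    sequentially, each step consuming the previous step's output."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run
```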

Directory Layout

spine/
├── _protocols/               # Usage protocols - read first
│   └── tiered-spine-usage.md
├── _templates/               # Templates for other projects
├── .claude/agents/           # Subagent definitions
├── KB/                       # Knowledge base
├── spine/                    # Core Python package
│   ├── core/                 # ToolEnvelope, TraceScope
│   ├── logging/              # Structured JSON logging
│   ├── client/               # Provider configs
│   └── patterns/             # fan_out, pipeline
├── tools/                    # Applications built on SPINE
├── scripts/                  # Scripts and tests
├── scenarios/                # Scenario configs (.yaml)
├── logs/                     # Structured logs
└── run_scenario.py           # Main entry point

Context Stacks

SPINE uses hierarchical context stacks for consistent LLM interactions.


| Layer | What it holds |
|---|---|
| global | System-level config - operator, brand identity |
| character | Agent persona and target audience |
| command | Task specification and success criteria |
| constraints | Tone, format, do/don't rules |
| context | Background info and references |
| input | The actual user request |
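Rendering the stack into a single prompt, top layer first, can be sketched as follows (the bracketed-header format is illustrative, not SPINE's actual serialization):

```python
# Layer names come from the table above; order is fixed, top of stack first.
STACK_ORDER = ["global", "character", "command", "constraints", "context", "input"]

def render_stack(stack: dict) -> str:
    """Assemble present layers into one prompt string, in stack order."""
    parts = []
    for layer in STACK_ORDER:
        if layer in stack:
            parts.append(f"[{layer.upper()}]\n{stack[layer]}")
    return "\n\n".join(parts)
```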

Core Components

ToolEnvelope

Every LLM call gets wrapped in a ToolEnvelope with:

  • id - unique correlation ID
  • tool - provider and model (e.g., "anthropic:claude-opus-4-5")
  • trace - hierarchical linking (root_id → parent_id → span_id)
  • metadata - tags, sandbox profile, retry policy

```python
envelope = create_envelope(tool="claude-opus", prompt="...")
child = envelope.create_child(tool="claude-sonnet")  # auto-linked
```

TraceScope

Automatic trace propagation across agent hierarchies:

```python
with TraceScope("orchestrator"):
    # calls here inherit the orchestrator trace
    with TraceScope("subagent-research"):
        ...  # linked as a child of the orchestrator span
```
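One common way to implement this kind of scope propagation is with contextvars; the sketch below is illustrative, not SPINE's actual TraceScope.

```python
import contextvars
from contextlib import contextmanager

# The current trace path, propagated automatically across nested scopes
# (and across asyncio tasks, which copy the context on creation).
_trace_path = contextvars.ContextVar("trace_path", default=())

@contextmanager
def trace_scope(name: str):
    """Push `name` onto the trace path for the duration of the block."""
    token = _trace_path.set(_trace_path.get() + (name,))
    try:
        yield
    finally:
        _trace_path.reset(token)

def current_trace() -> tuple:
    return _trace_path.get()
```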

Multi-Provider Support

SPINE abstracts away provider differences:

| Provider | Models | Capabilities |
|---|---|---|
| Anthropic | Claude Opus 4.5, Sonnet 4.5, Haiku 4.5 | Full tool use, vision |
| Google | Gemini 3 Pro, Gemini 3 Flash | Tool use, vision |
| OpenAI | GPT-5.1, GPT-5 mini | Tool use, vision |
| xAI | Grok 4.1 | Tool use |

Observability

Logs go to ./logs/YYYY-MM-DD/*.json with:

  • Full envelope with trace hierarchy
  • Token usage and cost estimates
  • Timing (started_at, finished_at, duration_ms)
  • Experiment tracking IDs
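A LogAggregator-style query over those files might look like the sketch below. The duration_ms field follows the list above; the one-record-per-file layout is simplified for illustration.

```python
import json
from pathlib import Path

def total_duration_ms(log_dir: Path) -> float:
    """Sum duration_ms across all JSON log records in a date directory,
    treating records without the field as zero."""
    total = 0.0
    for path in sorted(log_dir.glob("*.json")):
        record = json.loads(path.read_text())
        total += record.get("duration_ms", 0.0)
    return total
```

The same glob-and-parse loop extends naturally to token counts, cost estimates, or filtering by trace IDs.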

Trace Hierarchy

```mermaid
graph TB
    R["root_id: session-abc123"] --> O1["parent_id: orchestrator-001"]
    R --> O2["parent_id: orchestrator-002"]
    O1 --> S1["span_id: subagent-research-1"]
    O1 --> S2["span_id: subagent-research-2"]
    O1 --> S3["span_id: subagent-research-3"]
    O2 --> S4["span_id: synthesis-001"]

    style R fill:#2563eb,stroke:#93c5fd,color:#fff
    style O1 fill:#7c3aed,stroke:#a78bfa,color:#fff
    style O2 fill:#7c3aed,stroke:#a78bfa,color:#fff
    style S1 fill:#0d9488,stroke:#5eead4,color:#fff
    style S2 fill:#0d9488,stroke:#5eead4,color:#fff
    style S3 fill:#0d9488,stroke:#5eead4,color:#fff
    style S4 fill:#0d9488,stroke:#5eead4,color:#fff
```

Memory System (v0.3.29)

SPINE provides a 5-tier memory architecture unified by MemoryFacade:

| Tier | Component | Scope | Backend |
|---|---|---|---|
| 1 | KVStore | Namespace-scoped key-value | SQLite / File |
| 2 | Scratchpad | Short-term task notes | In-memory |
| 3 | EphemeralMemory | Session-scoped with decay | In-memory |
| 4 | VectorStore | Hybrid semantic + keyword search | LanceDB + keyword |
| 5 | EpisodicMemory | Goal-based episode recall | SQLite + FTS5 |

MemoryFacade provides unified search across all tiers with score normalization. VerdictRouter routes AgenticLoop accept/reject/revise decisions to the appropriate tier.
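Cross-tier score normalization can be sketched as min-max scaling, so hits from different backends become comparable on a shared [0, 1] range (SPINE's actual normalization strategy may differ):

```python
def normalize_scores(hits):
    """Min-max scale (doc, score) pairs to [0, 1]; if all scores are
    equal, every hit gets 1.0 rather than dividing by zero."""
    scores = [score for _, score in hits]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [(doc, 1.0) for doc, _ in hits]
    return [(doc, (score - lo) / (hi - lo)) for doc, score in hits]
```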

Persistence backends: SQLitePersistence and FilePersistence.

Embedding providers (7): Local/SentenceTransformers, OpenAI, Voyage AI, ONNX Runtime, Gemini, Keyword fallback, Placeholder.

Full Memory System Guide


OODA Loop (v0.3.29)

Agent OS 2026 introduces an OODA-based execution loop that composes existing SPINE components into a structured cognition cycle:

OBSERVE (WorldState facade) → ORIENT (Context Stack) → DECIDE (TaskType Router) → ACT (Executor Framework) → REFLECT (Episodic Memory)


  • LoopContext tracks phase, iteration count, and cycle history
  • WorldState provides a unified facade over environment data; WorldSnapshot captures immutable point-in-time state
  • Outcome is the canonical result schema from any action
  • OscillationTracker detects stuck states during Reflect
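The Reflect-phase stuck-state check can be sketched as a rolling-window detector over world-state fingerprints. The class name matches the component above, but the window logic below is illustrative, not SPINE's actual implementation.

```python
class OscillationTracker:
    """Flag a stuck loop when the same world-state fingerprint recurs
    within a recent window of cycles."""

    def __init__(self, window: int = 3):
        self.window = window
        self.history: list = []

    def record(self, fingerprint) -> bool:
        """Record a cycle's fingerprint; return True if it already
        appeared within the last `window` cycles (oscillation)."""
        stuck = fingerprint in self.history[-self.window:]
        self.history.append(fingerprint)
        return stuck
```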

Full Agent OS Guide


System Scale (IE Cypher Metrics)

| Metric | Value |
|---|---|
| Total nodes | 3,131 |
| Total edges | 6,615 |
| Subsystems | 15 |
| Modules | 177 |

Hub Classes (highest fan-in)

| Class | Fan-in |
|---|---|
| ContentPipelineExecutor | 68 |
| MCPSessionPool | 50 |
| Task | 47 |
