SPINE - Multi-Agent Orchestration System
A context engineering and multi-agent backbone framework for complex software development workflows.
Overview
SPINE (Software Pipeline for INtelligent Engineering) provides standardized instrumentation, multi-provider LLM access, and orchestration patterns that connect agentic projects for long-running, complex development workflows.
Key Capabilities
| Capability | Description |
|---|---|
| Multi-Agent Orchestration | Fan-out (parallel) and Pipeline (sequential) patterns |
| Full Traceability | ToolEnvelope instrumentation with hierarchical trace correlation |
| Multi-Provider Support | Anthropic, OpenAI, Google Gemini, Grok |
| Tiered Enforcement | Balanced capability usage based on task complexity |
| Context Stacks | Reproducible, structured context management via YAML scenarios |
| Agentic Loop | Autonomous "run until done" with oscillation detection |
| AI Code Review | Multi-persona parallel review with consensus ranking |
| Observability | Static HTML reports, REST API, health checks |
| Pluggable Executors | 7 executor types including SmallLLMExecutor for 3B-8B models |
| Dynamic Routing | Automatic task classification and executor selection by type |
| Small LLM Support | Orchestrate 3B-8B quantized models via MCP self-description layers |
| MCP Session Pool | Persistent MCP connections with background event loop |
| Persistent Memory | Optional Minna Memory integration for cross-session memory |
| Agent OS 2026 | OODA loop composition, deep memory hooks, agent processes, task DAGs |
| Authority Inversion | RunContext as sole runtime truth: 4 inverted modules, SkillCompiler cognitive compiler, PlanValidator (v0.5.0) |
| 7-Tier Memory | KV, Scratchpad, Ephemeral, Vector, Episodic, DeepMemory (pgvector), GraphMemory, unified by MemoryFacade |
| Embedding Providers | 7 providers (Local, OpenAI, Voyage, ONNX, Gemini, Keyword, Placeholder) |
Architectural Foundation: The Multi-Agent Playbook
SPINE implements patterns from the Multi-Agent Playbook, an architectural blueprint for production-ready multi-agent systems that addresses the core challenge: how do you manage delegation, state, execution, and failure without creating chaos?
The General Contractor Model
SPINE follows a closed-loop orchestrator pattern where:
```
User
  |
  v
+---------------------------------------------+
|             SPINE Orchestrator              |
| AgenticLoop + ToolEnvelope instrumentation  |
+----------------------+----------------------+
                       | fan_out() or pipeline()
          +------------+------------+
          v            v            v
     +---------+  +---------+  +---------+
     | Worker  |  | Worker  |  | Worker  |
     | Agent 1 |  | Agent 2 |  | Agent 3 |
     +----+----+  +----+----+  +----+----+
          |            |            |
          +------------+------------+
                       | Results via ToolEnvelope
                       v
+---------------------------------------------+
|        Synthesized Response to User         |
+---------------------------------------------+
```
- You prompt the Orchestrator, not sub-agents directly
- Sub-agents report exclusively to the Orchestrator
- The Orchestrator synthesizes and delivers results
- Direct user communication from sub-agents is forbidden
The Five Pillars
SPINE implements all five architectural pillars from the blueprint:
| Pillar | Blueprint Principle | SPINE Implementation |
|---|---|---|
| I. Communication | Closed loops, verifiable artifacts | ToolEnvelope result wrapping, structured logs |
| II. Execution | Parallel for speed, sequential for logic | fan_out() and pipeline() patterns |
| III. Empowerment | Right tooling in isolated environments | MCP integration, TraceScope boundaries |
| IV. State | State in environment, not agent memory | NEXT.md integration, Context Stacks |
| V. Resilience | Blast radius containment, error routing | OscillationTracker, LoopVerdict system |
Context Management: Signal vs. Noise
The Orchestrator holds executive signal (low context), while sub-agents absorb execution noise (high context):
```
Orchestrator Context (Signal)      Sub-Agent Context (Noise)
├── Master Plan                    ├── Full document content
├── Operational metrics            ├── Raw API responses
├── Synthesized outputs            ├── Detailed logs
└── Error signals                  └── Environment state
```
→ Read the full Blueprint Implementation Guide
→ View the Multi-Agent Playbook (PDF)
Architecture
SPINE operates across three distinct capability layers:
```
+---------------------------------------------------------------+
|                     Layer 1: Host Agent                       |
|          Built-in subagent types via host environment         |
|     (Explore, Plan, code-architect, visual-tester, etc.)      |
+---------------------------------------------------------------+
                               |
                               v
+---------------------------------------------------------------+
|                     Layer 2: MCP Servers                      |
|           External tools via Model Context Protocol           |
|       (browser-mcp, next-conductor, research-agent-mcp)       |
+---------------------------------------------------------------+
                               |
                               v
+---------------------------------------------------------------+
|                     Layer 3: SPINE Python                     |
|                Custom orchestration framework                 |
|         (fan_out, pipeline, ToolEnvelope, AgenticLoop)        |
+---------------------------------------------------------------+
```
Context Stack Structure
SPINE uses a hierarchical context stack for consistent LLM interactions:
```json
{
  "global": { "operator": "...", "brand": "..." },
  "character": { "speaker": "...", "audience": "..." },
  "command": { "task": "...", "success": "..." },
  "constraints": { "tone": "...", "format": "...", "do": [], "dont": [] },
  "context": { "background": "...", "references": [] },
  "input": { "user_request": "..." }
}
```
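A minimal sketch of how such a stack might be flattened into a reproducible prompt, assuming a fixed layer order; the `build_prompt` helper and its output format are hypothetical, not SPINE's actual builder:

```python
# Hypothetical sketch: flattening a SPINE-style context stack into a prompt.
# The layer names mirror the JSON structure above; the rendering format is
# an illustrative assumption.

LAYER_ORDER = ["global", "character", "command", "constraints", "context", "input"]

def build_prompt(stack: dict) -> str:
    """Render context layers in a fixed order so prompts are reproducible."""
    sections = []
    for layer in LAYER_ORDER:
        if layer not in stack:
            continue
        body = "\n".join(f"  {k}: {v}" for k, v in stack[layer].items())
        sections.append(f"[{layer}]\n{body}")
    return "\n\n".join(sections)

stack = {
    "global": {"operator": "spine", "brand": "AdaptiveArts"},
    "command": {"task": "summarize repo", "success": "3 bullet points"},
    "input": {"user_request": "What does SPINE do?"},
}
print(build_prompt(stack))
```

Rendering in a fixed layer order is what makes two runs with the same scenario file produce byte-identical prompts.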
Module Structure (v0.4.0)
```
spine/
├── core/                          # ToolEnvelope, TraceScope
├── client/                        # InstrumentedLLMClient, provider configs, retry/timeout
├── patterns/                      # fan_out(), pipeline(), hermeneutic_loop(), safe_access()
├── orchestrator/                  # AgenticLoop, OscillationTracker, TaskQueue
│   ├── context_stack.py           # Context stack loader/builder
│   ├── context_discovery.py       # Layered context discovery L1-L4
│   ├── task_router.py             # Dynamic Routing → TaskTypeRouter (v0.3.26)
│   ├── routing_callbacks.py       # Routing callbacks factory (v0.3.26)
│   ├── mcp_self_description.py    # 4-layer MCP self-description generator (v0.3.28)
│   ├── capability_registry.py     # Project capability scanning + S41 map
│   ├── gap_tracker.py             # Structured gap detection and clustering
│   └── executors/                 # 7 pluggable executors
│       ├── base.py                # Executor interface + PlaceholderExecutor
│       ├── subagent.py            # SubagentExecutor + context stacks
│       ├── claude_code.py         # ClaudeCodeExecutor (CLI subprocess)
│       ├── mcp_orchestrator.py    # MCPOrchestratorExecutor
│       ├── content_pipeline.py    # ContentPipelineExecutor (video/content)
│       ├── small_llm_executor.py  # SmallLLMExecutor → 3B-8B models (v0.3.27)
│       └── mcp_session_pool.py    # MCPSessionPool → persistent sessions (v0.3.28)
├── agent_os/                      # Agent OS 2026 (v0.3.29-v0.4.0)
│   ├── ooda.py                    # OODALoop, OODAConfig, OODACycle, LoopContext
│   ├── world.py                   # WorldState, WorldSnapshot
│   ├── outcome.py                 # Outcome canonical result schema
│   └── process.py                 # AgentProcess, ProcessManager
├── memory/                        # 7-tier memory system (v0.4.0)
│   ├── kv_store.py                # Tier 1: namespace-scoped key-value
│   ├── scratchpad.py              # Tier 2: short-term task notes
│   ├── ephemeral.py               # Tier 3: session-scoped with decay
│   ├── vector_store.py            # Tier 4: hybrid semantic + keyword search
│   ├── episodic.py                # Tier 5: goal-based episode recall (v0.3.29)
│   ├── deep_store.py              # Tier 6: PostgreSQL + pgvector deep memory (v0.4.0)
│   ├── deep_config.py             # DeepStoreConfig (connection, decay, scoping)
│   ├── graph_memory.py            # Tier 7: graph traversal + analytics (v0.4.0)
│   ├── hooks.py                   # MemoryHooks → OODA orient/reflect integration (v0.4.0)
│   ├── federated.py               # FederatedMemory → cross-project Minna queries (v0.4.0)
│   ├── facade.py                  # MemoryFacade → unified cross-tier search
│   ├── verdict_router.py          # Routes accept/reject/revise to tiers
│   ├── persistence.py             # SQLitePersistence, FilePersistence
│   └── embeddings/                # 7 embedding providers
│       ├── base.py                # EmbeddingProvider ABC
│       ├── local.py               # SentenceTransformers
│       ├── openai.py              # OpenAI embeddings API
│       ├── voyage.py              # Voyage AI (code-optimized)
│       ├── onnx.py                # ONNX Runtime
│       ├── gemini.py              # Google Gemini
│       ├── keyword.py             # TF-IDF fallback
│       └── placeholder.py         # Testing/development
├── grammar/                       # EBNF-Rig Veda knowledge annotation
├── review/                        # AI-powered code review
├── integration/                   # Token-optimized MCP execution
├── enforcement/                   # Tiered + Five-Point Protocol enforcement
├── health/                        # Component health monitoring
├── api/                           # FastAPI REST API + /api/reviews
├── reports/                       # Static HTML report generator
└── logging/                       # Structured JSON logging
```
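As noted in the memory module listing, MemoryFacade provides unified cross-tier search. A minimal sketch of that idea follows; the class body, the per-tier `search` interface, and the scoring are illustrative assumptions, not SPINE's real implementation:

```python
# Illustrative facade that fans a query out to every memory tier and ranks
# the merged hits. The tier protocol (search() -> list of dicts with a
# "score" key) is an assumption made for this sketch.

class MemoryFacade:
    def __init__(self, tiers: dict):
        self.tiers = tiers  # tier name -> object exposing search(query)

    def search(self, query: str, limit: int = 5) -> list[dict]:
        # Gather hits from every tier, tag each with its source tier,
        # then rank the merged list by score.
        hits = []
        for name, tier in self.tiers.items():
            for hit in tier.search(query):
                hits.append({**hit, "tier": name})
        return sorted(hits, key=lambda h: h["score"], reverse=True)[:limit]

class FakeTier:
    """Stand-in for a real tier (KV, vector store, etc.) in this sketch."""
    def __init__(self, items):
        self.items = items
    def search(self, query):
        return [i for i in self.items if query in i["text"]]

facade = MemoryFacade({
    "kv": FakeTier([{"text": "spine config", "score": 0.4}]),
    "vector": FakeTier([{"text": "spine orchestration notes", "score": 0.9}]),
})
top = facade.search("spine")
print(top[0]["tier"])  # highest-scoring hit wins regardless of source tier
```

The point of the facade is that callers never need to know which tier answered; ranking happens after the merge.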
Tiered Enforcement Protocol
SPINE balances capability usage against overhead costs through a three-tier system:
| Tier | Task Type | Enforcement | Examples |
|---|---|---|---|
| Tier 1 | Simple | None required | Typo fixes, single-file edits |
| Tier 2 | Medium | Recommended | Multi-file changes, new features |
| Tier 3 | Complex | Mandatory | Architecture decisions, research, UI-heavy |
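The tier decision above can be sketched as a toy heuristic. The keyword rules and file-count threshold here are illustrative assumptions, not SPINE's actual classifier:

```python
# Toy tier classifier mirroring the table above. Real classification in SPINE
# is richer; this only demonstrates the three-tier shape of the decision.

def classify_tier(task: str, files_touched: int) -> int:
    complex_markers = ("architecture", "research", "design")  # assumed markers
    if any(m in task.lower() for m in complex_markers):
        return 3  # Mandatory enforcement
    if files_touched > 1:
        return 2  # Recommended enforcement
    return 1      # No enforcement required

print(classify_tier("fix typo in README", files_touched=1))        # Tier 1
print(classify_tier("add export feature", files_touched=4))        # Tier 2
print(classify_tier("research caching architecture", 1))           # Tier 3
```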
Why Tiered Enforcement?
| Factor | Consideration |
|---|---|
| Token Cost | Parallel subagents = 2-6x cost increase |
| Latency | Subagent spawn adds 10-30 seconds |
| Over-engineering | Simple tasks don't need orchestration |
| Context Fragmentation | Subagents don't share full conversation context |
→ Try the Interactive Tier Classifier
Core Patterns
Fan-Out (Parallel Execution)
Execute multiple tasks simultaneously with automatic result aggregation:
```
             +-------------+
             |   Parent    |
             |  Envelope   |
             +------+------+
                    |
      +-------------+-------------+
      v             v             v
+-----------+ +-----------+ +-----------+
| Analyst A | | Analyst B | | Analyst C |
+-----------+ +-----------+ +-----------+
      |             |             |
      +-------------+-------------+
                    v
             +-------------+
             |  Aggregate  |
             |   Results   |
             +-------------+
```
Use Cases: Research tasks, parallel code analysis, multi-source data gathering
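The fan-out idea can be sketched in plain asyncio. This illustrates the pattern only; it is not SPINE's actual `fan_out()` signature, and the worker here just echoes instead of calling an LLM:

```python
# Minimal fan-out sketch: launch workers concurrently, aggregate in order.
import asyncio

async def worker(name: str, task: str) -> dict:
    # A real worker would call an LLM or tool; we simulate the I/O wait.
    await asyncio.sleep(0)
    return {"worker": name, "result": f"{name} analyzed {task}"}

async def fan_out(task: str, workers: list[str]) -> list[dict]:
    # gather() preserves input order, which makes aggregation deterministic.
    return await asyncio.gather(*(worker(w, task) for w in workers))

results = asyncio.run(
    fan_out("audit auth module", ["Analyst A", "Analyst B", "Analyst C"])
)
print([r["worker"] for r in results])
```

Order-preserving aggregation matters downstream: the synthesizer can attribute each result to a specific persona without extra bookkeeping.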
Pipeline (Sequential Processing)
Chain processing steps with automatic result transformation:
```
+---------+     +---------+     +-----------+     +------------+
| Analyze | --> | Extract | --> | Transform | --> | Synthesize |
+---------+     +---------+     +-----------+     +------------+
```
Use Cases: Document processing, staged analysis, build pipelines
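The same chain can be sketched as a left fold over stage functions, where each stage receives the previous stage's output. The stage bodies here are toy examples, not SPINE's `pipeline()` API:

```python
# Pipeline sketch: thread a value through stages in sequence.
from functools import reduce

def analyze(text: str) -> dict:
    return {"text": text, "words": len(text.split())}

def extract(doc: dict) -> dict:
    # Keep only longer words as crude "keywords".
    doc["keywords"] = [w for w in doc["text"].split() if len(w) > 4]
    return doc

def synthesize(doc: dict) -> str:
    return f"{doc['words']} words, keywords: {', '.join(doc['keywords'])}"

def pipeline(value, stages):
    # Fold the input through each stage in order.
    return reduce(lambda acc, stage: stage(acc), stages, value)

summary = pipeline("multi agent orchestration systems", [analyze, extract, synthesize])
print(summary)
```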
Agentic Loop (Autonomous Execution)
Run tasks until completion with built-in resilience:
```
+-----------------------------------------------------------+
|                       AgenticLoop                         |
+-----------------------------------------------------------+
|  +---------+      +---------+      +----------+           |
|  |  Task   | ---> | Execute | ---> | Evaluate |           |
|  |  Queue  |      |         |      |          |           |
|  +---------+      +---------+      +----+-----+           |
|                                        |                  |
|           +---------------+------------+                  |
|           |               |            |                  |
|           v               v            v                  |
|      +--------+      +--------+   +--------+              |
|      | ACCEPT |      | REVISE |   | REJECT |              |
|      |  Done  |      | Retry  |   |  Skip  |              |
|      +--------+      +--------+   +--------+              |
|                                                           |
|  OscillationTracker: detects stuck states                 |
|  (A-B-A-B patterns, repeated errors)                      |
+-----------------------------------------------------------+
```
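The A-B-A-B detection mentioned in the diagram can be sketched with a sliding window over recent states. This tracker is illustrative; SPINE's real OscillationTracker also handles repeated-error patterns:

```python
# Sketch of A-B-A-B oscillation detection over a fixed-size state window.
from collections import deque

class OscillationTracker:
    def __init__(self, window: int = 4):
        self.history = deque(maxlen=window)

    def record(self, state: str) -> bool:
        """Record a state; return True once the window reads A-B-A-B."""
        self.history.append(state)
        h = list(self.history)
        return (
            len(h) == 4
            and h[0] == h[2]
            and h[1] == h[3]
            and h[0] != h[1]
        )

tracker = OscillationTracker()
stuck = [tracker.record(s) for s in ["plan", "fix", "plan", "fix"]]
print(stuck)
```

When the tracker fires, the loop can escalate (e.g., reject the task or ask for operator input) instead of burning tokens on the same two states forever.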
ToolEnvelope (Instrumentation)
Every LLM call is wrapped for full traceability:
```
+------------------------------------------+
|               ToolEnvelope               |
+------------------------------------------+
|  id: "call-abc123"                       |
|  tool: "anthropic:claude-sonnet-4-5"     |
|  trace:                                  |
|    root_id: "task-xyz"                   |
|    parent_id: "orchestrator-001"         |
|    span_id: "subagent-research"          |
|  metadata:                               |
|    tags: ["research", "phase-1"]         |
|    experiment_id: "exp-2025-001"         |
|  metrics:                                |
|    tokens_in, tokens_out, latency_ms     |
+------------------------------------------+
```
Interactive Demos
| Demo | Description |
|---|---|
| Tier Classifier | Determine the appropriate enforcement tier for any task |
| Provider Picker | Choose the right LLM provider based on your task type |
| Cost Calculator | Estimate API costs by model and token usage |
| Fan-Out Simulator | Visualize parallel task execution with configurable workers |
| Pipeline Builder | Build and simulate sequential processing chains |
Use Cases
Autonomous Software Development
SPINE enables coordinated multi-agent workflows for:
- Code Review: Parallel reviewers for security, style, and logic with consensus ranking
- Research Tasks: Multi-source investigation with conflict detection and synthesis
- UI Development: Visual verification with browser automation
- Architecture Design: Structured design reviews with documentation generation
Project Integration
SPINE has been successfully integrated with:
| Project | Integration Type |
|---|---|
| Golden Thread System | Full MVP development with tiered enforcement |
| spine-dashboard | Real-time monitoring via SPINE API |
| Adaptivearts.ai | Research and content generation workflows |
Technical Highlights
Multi-Provider Support
| Provider | Models | Status |
|---|---|---|
| Anthropic | Claude Opus 4.5, Sonnet 4.5, Haiku 4.5 | ✅ Active |
| Google | Gemini 3 Pro, Gemini 3 Flash | ✅ Active |
| OpenAI | GPT-5.1, GPT-5 mini | ✅ Active |
| xAI | Grok 4.1 | ✅ Active |
Observability Stack
| Component | Purpose |
|---|---|
| `spine/logging/` | Structured JSON logs with trace hierarchy |
| `spine/api/` | FastAPI REST API with OpenAPI docs |
| `spine/reports/` | Self-contained HTML reports with Chart.js |
| `spine/health/` | Component health monitoring |
CLI Tools
```bash
# Run orchestrator with SubagentExecutor (uses .claude/agents/ personas)
python -m spine.orchestrator run --project /path --executor subagent

# Run with Dynamic Routing (auto-selects executor by task type) [v0.3.26]
python -m spine.orchestrator run --project /path --executor router \
    --route CODE:subagent --route RESEARCH:claude-code

# Run with SmallLLMExecutor (3B-8B models via MCP) [v0.3.27]
python -m spine.orchestrator run --project /path --executor small-llm

# Classify task type without executing [v0.3.26]
python -m spine.orchestrator classify --project /path --task-id TASK-001

# Generate MCP self-description for a server [v0.3.28]
python -m spine.orchestrator describe --project /path --server my-mcp

# Run with context stacks from scenario files
python -m spine.orchestrator run --project /path --executor subagent --scenario scenarios/research.yaml

# Run with LLM evaluation
python -m spine.orchestrator run --project /path --llm-eval

# Generate reports
python -m spine.reports generate --title "Sprint Report" --days 7

# Health checks
python -m spine.health --verbose

# Code review
python -m spine.review . --parallel

# Start API server
python -m spine.api --port 8000
```
Documentation
| Document | Description |
|---|---|
| Blueprint Implementation | How SPINE implements the Multi-Agent Playbook |
| Architecture Overview | System design and components |
| Pattern Guide | Fan-out and Pipeline usage |
| Tiered Protocol | Full enforcement protocol |
| Executor Framework | 7 executor types including SmallLLMExecutor |
| Dynamic Routing | Task classification and executor selection (NEW v0.3.26) |
| SmallLLMExecutor | 3B-8B model orchestration via MCP self-description (NEW v0.3.27) |
| MCP Session Pool | Persistent MCP sessions + self-description generator (v0.3.28) |
| Agent OS 2026 | OODA loop, deep memory hooks, agent processes, task DAGs (v0.3.29-v0.4.0) |
| Memory System | 7-tier memory architecture with MemoryFacade (v0.3.29-v0.4.0) |
| Deep Memory | PostgreSQL+pgvector deep store, graph memory, federation, OODA hooks (NEW v0.4.0) |
| Context Stack Integration | YAML scenario files for prompt building |
| MCP Orchestrator Integration | Optional intelligent tool routing |
| Minna Memory Integration | Persistent cross-session memory |
| Agent Harness Automation | Disable prompts, auto-reload context (Claude Code) |
Reference Materials
| Resource | Description |
|---|---|
| Multi-Agent Playbook (PDF) | Architectural blueprint for production-ready multi-agent systems |
Version History
| Version | Highlights |
|---|---|
| 0.5.0 | Authority Inversion - RunContext as sole runtime truth. 4 modules inverted (FivePointProtocol, VerdictRouter, OODALoop, ContentPipelineExecutor). SkillCompiler cognitive compiler with tool_checker enforcement. PlanValidator (6 checks). Policy tags. 84 new tests. |
| 0.4.0 | Phase 3 Deep Memory - DeepMemoryStore (pgvector Tier 6), GraphMemory (Tier 7), FederatedMemory, MemoryHooks + OODA integration, dashboard health check |
| 0.3.30 | Agent Processes (ProcessManager), Task DAG (dependency resolution, cycle detection) |
| 0.3.29 | Agent OS 2026 - OODA loop, EpisodicMemory, WorldState, Outcome, 7 embedding providers, MemoryFacade |
| 0.3.28 | MCPSessionPool (persistent MCP sessions) + MCP Self-Description Generator (4-layer L0-L3) |
| 0.3.27 | SmallLLMExecutor - orchestrate 3B-8B quantized LLMs via MCP self-description layers |
| 0.3.26 | Dynamic Routing - TaskTypeRouter, classify_task_type, routing callbacks + Pattern C + retry/timeout |
| 0.3.25 | Memory-First Learning Loop - 5 behaviors, gap tracker, capability registry, session consolidation |
| 0.3.24 | Content pipeline, ephemeral session memory, context discovery L1-L4, runtime tier enforcement |
| 0.3.22 | Minna Memory Integration - persistent cross-session memory with graceful fallback |
| 0.3.21 | MCP Orchestrator Integration - optional intelligent tool routing with graceful fallback |
| 0.3.20 | Context Stack Integration - executors use scenarios/*.yaml for prompt building |
| 0.3.19 | Executor Framework - SubagentExecutor, ClaudeCodeExecutor with pluggable design |
| 0.3.18 | Dashboard integration - /api/reviews endpoints for review history |
| 0.3.17 | Inline diff annotations, cost tracking per review |
| 0.3.16 | NEXT.md integration for AgenticLoop |
| 0.3.15 | create_spine_llm_evaluator() factory |
| 0.3.14 | Static HTML report generator |
| 0.3.13 | FastAPI REST API surface |
| 0.3.12 | Health check system, common utilities |
| 0.3.11 | Tier enforcement gate (commit-msg hook) |
| 0.3.10 | Token-optimized MCP execution (57-87% savings) |
| 0.3.9 | ConflictResolver for multi-agent synthesis |
| 0.3.6-8 | AI-powered code review module |
About
SPINE is developed as part of the AdaptiveArts.ai research initiative, focusing on intelligent software development workflows and multi-agent coordination.
The Meta-Goal
"The goal is not to build the application. It is to build the system that builds the application."
SPINE embodies this philosophy: it's a backbone framework that enables building applications through orchestrated multi-agent workflows.
Contact
- GitHub: github.com/fbratten
- Portfolio: View all projects
License
This project is licensed under the MIT License.