Introducing PumpStation: The Runtime Harness for LLM-Driven Workloads

PumpStation ships

Ten Trillion Triangles TPipe has shipped the runtime harness. AI orchestration in 2026 is a ReAct loop. Every framework in production runs the same pattern: LangGraph, CrewAI, AutoGen, the OpenAI Agents SDK, Google ADK, LlamaIndex, Codex CLI, Claude Code, OpenClaw, Hermes Agent. One LLM. A flat tool list. A while loop. The LLM is the brain. The tools are the hands. The conversation is the memory.

Ten Trillion Triangles TPipe is the first framework to ship an orchestration layer that is not a ReAct loop. PumpStation is a runtime agentic harness that treats the LLM as a controller, not a brain. The state machine is the substrate. Three LLMs in three roles. A path system that abstracts turns. A kill switch that caps cost per phase. A separate LLM verifier.

The harness is the eighth container. The loop is the substrate. The LLM is the controller. The rest of this post is the specifics.

The shape of a ReAct loop vs the shape of PumpStation

Here is what every ReAct loop looks like, distilled from the Hermes Agent skill, the Claude Code source leak, the Codex CLI documentation, and the OpenClaw source:

run_conversation():
  build_system_prompt
  loop while iterations < max:
    call_llm(messages, tool_schemas)
    if tool_calls:
      dispatch_each(handle_function_call)
      append_results
    else:
      return

One LLM. One tool list. One loop. The LLM emits text or a tool call. The runtime executes the call. The result goes into the conversation. Repeat.

The LLM is the brain. The tools are the hands. The conversation is the memory. The architecture is the simplest possible shape.
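
For readers who prefer Kotlin to pseudocode, the same loop can be sketched as below. This is a minimal sketch, not code from any of the frameworks named above: LlmReply, Text, ToolCall, Llm, and runConversation are hypothetical names, and the model is a plain function so the loop can run without a provider.

```kotlin
// Hypothetical types for illustration; none of these come from a real framework.
sealed interface LlmReply
data class Text(val content: String) : LlmReply
data class ToolCall(val tool: String, val argument: String) : LlmReply

// The model sees the whole conversation plus the flat tool list on every call.
fun interface Llm {
    fun call(messages: List<String>, toolNames: Set<String>): LlmReply
}

fun runConversation(
    llm: Llm,
    tools: Map<String, (String) -> String>,
    userPrompt: String,
    maxIterations: Int = 10,
): String {
    val messages = mutableListOf("user: $userPrompt")
    var iterations = 0
    while (iterations < maxIterations) {
        when (val reply = llm.call(messages, tools.keys)) {
            is Text -> return reply.content // no tool call: the loop ends
            is ToolCall -> {
                val result = tools[reply.tool]?.invoke(reply.argument)
                    ?: "error: unknown tool ${reply.tool}"
                messages.add("tool(${reply.tool}): $result") // the conversation is the memory
            }
        }
        iterations++
    }
    return "stopped: iteration limit reached"
}
```

Every tool result lands in messages, and messages is passed back whole on the next call. That is the growth pattern the rest of this post is about.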

Here is what PumpStation looks like:

runHarnessLoop (outer):
  runPreInitPhase()
  while turnIndex < maxTurns and status == Running:
    runTurn() (inner) -> Continue | Halt(reason)
      health_check -> judge -> dispatch -> path
        -> foreground_agents -> background_agents
        -> memory_update -> compaction
  runFinalizationPhase()

Three LLMs in three roles. A judge that evaluates completion. A dispatch that picks the next path. A goal that verifies the work. Paths that abstract multi-step turns. Memory agents that run async. A compaction phase with v3 cursor-based pre-emption. A finalization phase that returns the deliverable.

The loop is the substrate. The LLMs are controllers at specific phase boundaries.

That is the difference in one paragraph. The rest of this post is what that difference buys you.

The seven design choices that have no ReAct equivalent

Every ReAct loop hits the same failure modes in production. PumpStation solves each one as a TPipe native primitive. The seven choices compose into a system.

1. Personality is a first-class, auto-injectable variable

PumpStation’s personality is a property of the harness, not a property of the LLM. The four-tier explicit priority is personality > systemTask > userGuidelines > entryUserPrompt. The personality is injected into the judge and dispatch system prompts at content-build time, in this order, before any other instruction. It travels with the harness across LLM swaps.

When a pipe is configured with RolePlay reasoning, the personality is automatically applied. Every pipe in the harness that uses RolePlay embodies the same persona. A small judge model on one provider and a frontier dispatch model on another share the same character. The persona is structural, not cosmetic.
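
The priority order is easy to picture as a prompt builder. The sketch below is an assumption about shape only: PromptTiers and buildSystemPrompt are hypothetical names, not TPipe's API, and the bracketed labels are invented for readability.

```kotlin
// Hypothetical holder for the four prompt tiers named above; not TPipe's actual type.
data class PromptTiers(
    val personality: String? = null,
    val systemTask: String? = null,
    val userGuidelines: String? = null,
    val entryUserPrompt: String? = null,
)

// Emits the tiers in priority order, personality first, skipping any that are unset.
fun buildSystemPrompt(tiers: PromptTiers): String =
    listOf(
        "personality" to tiers.personality,
        "systemTask" to tiers.systemTask,
        "userGuidelines" to tiers.userGuidelines,
        "entryUserPrompt" to tiers.entryUserPrompt,
    ).mapNotNull { (label, text) ->
        text?.takeIf { it.isNotBlank() }?.let { "[$label]\n${it.trim()}" }
    }.joinToString("\n\n")
```

Because the tiers live in one object owned by the harness, the same builder can feed the judge pipe and the dispatch pipe, whichever model sits behind each.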

Hermes Agent’s persona is a one-time system prompt overlay. Codex’s is a flat system prompt. OpenClaw’s is a base persona for the messaging assistant. None of them propagate. None of them enforce priority over system task, user guidelines, and entry prompt. None of them apply to every pipe in the runtime.

Game changer. LangGraph has no concept of cross-model persona. Claude Code’s persona does not propagate across providers. PumpStation’s persona travels with the harness, so a small judge model and a frontier dispatch model share the same character.

2. Paths compress the dispatch agent’s input

A ReAct loop’s tool list is flat. The LLM sees every tool definition in the system prompt, every turn. Hermes ships 60+ tools. Claude Code ships 80+ commands plus skills plus plugins. Codex ships a 32KB AGENTS.md blob plus the built-in tool set. As tools accumulate, the LLM’s input inflates. Selection accuracy degrades. Small models especially struggle with 60+ tool definitions.

A PumpStation path is a single named function. The dispatch agent sees N path names plus descriptions plus schemas, where each path abstracts an entire turn. The LLM picks one path by name. The path can call 100 tools internally, spawn 5 subagents, run async work, and return one MultimodalContent. The LLM never sees the internal complexity.

A 10-turn task on Hermes with 60 tools burns roughly $0.50 in tool-definition overhead. The same 10-turn task on PumpStation with 12 paths burns roughly $0.05 in path-descriptor overhead. That is a 10x saving in this example. In general the input saving is 3x to 10x depending on the path-to-tool ratio.
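
The overhead arithmetic is simple enough to check for your own stack. The sketch below is a back-of-envelope helper, not a measurement: both function names are hypothetical, and the descriptor size and token price you pass in are your own assumptions.

```kotlin
// All numbers passed in are the caller's assumptions; nothing here is measured.
// Descriptors are re-sent on every turn, so overhead scales with count x size x turns.
fun descriptorOverheadTokens(descriptors: Int, tokensPerDescriptor: Int, turns: Int): Long =
    descriptors.toLong() * tokensPerDescriptor * turns

// Converts a token count into dollars at a given input price per million tokens.
fun overheadUsd(tokens: Long, usdPerMillionInputTokens: Double): Double =
    tokens / 1_000_000.0 * usdPerMillionInputTokens
```

With equal descriptor sizes, 60 tools against 12 paths is a 5x difference in overhead tokens over the same number of turns. The ratio moves up or down with how large each descriptor is.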

Game changer. The same 10-turn task on LangGraph burns more on tool definitions per turn. CrewAI’s agent definitions bloat the dispatch prompt as agents multiply. Claude Code’s 80+ commands fill the context window before reasoning starts. PumpStation’s 12 paths deliver the signal without the overhead.

3. Paths curate their own output

The path’s execution function is a Kotlin function. The developer is in the path. The developer can filter, parse, summarize, transform, or stash the path’s output before the LLM sees it.

A shell call returns 5,000 tokens of directory listing. The path’s execution function can return 200 tokens of curated output. “47 files, 3 modified in the last hour, 2 with errors.” The LLM gets the signal, not the noise. The cost stays bounded. The LLM reasons faster and more accurately over the curated result.
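
A minimal sketch of that curation step, assuming the listing has already been parsed into records. FileInfo and summarizeListing are hypothetical names, and the one-hour window is just the example from the paragraph above.

```kotlin
// Hypothetical record for one file in a listing; a real path would parse this from shell output.
data class FileInfo(val name: String, val modifiedEpochSec: Long, val hasError: Boolean)

// Collapses a potentially huge listing into one short line for the LLM.
fun summarizeListing(files: List<FileInfo>, nowEpochSec: Long): String {
    val recentlyModified = files.count { (nowEpochSec - it.modifiedEpochSec) in 0L..3600L }
    val withErrors = files.count { it.hasError }
    return "${files.size} files, $recentlyModified modified in the last hour, $withErrors with errors."
}
```

The full listing never has to reach the prompt. The path can keep it in a local variable, write it to disk, or hand it to a follow-up path.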

The harness provides three independent layers of output curation. Inside the path, the developer’s code shapes the result. At path validation, the pathValidationFunction DITL (developer-in-the-loop) hook can reject the result entirely, causing the original input to flow through. After validation passes, the pathTransformationFunction DITL hook transforms the result. All three layers are in code. All three are deterministic. None of them require an LLM call.
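
The validation and transformation layers can be sketched in a few lines. This is a sketch under stated assumptions: CurationHooks and runCuratedPath are hypothetical names, the hooks are plain functions rather than the harness's real hook types, and strings stand in for MultimodalContent.

```kotlin
// Hypothetical stand-in for the two hooks described above; names and signatures are assumptions.
class CurationHooks(
    val validate: (String) -> Boolean = { true },
    val transform: (String) -> String = { it },
)

// Runs a path, then validation, then transformation. A rejected result is discarded
// and the original input flows through unchanged.
fun runCuratedPath(input: String, path: (String) -> String, hooks: CurationHooks): String {
    val result = path(input)
    if (!hooks.validate(result)) return input
    return hooks.transform(result)
}
```

Nothing in this flow calls a model, which is the point: the same input and the same hooks always produce the same prompt content.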

Game changer. LangGraph’s tool wrappers handle errors but not output size. CrewAI’s agent-to-agent handoff loses signal in prose. Claude Code’s tool outputs flow into the conversation raw. PumpStation’s path function is the developer-in-the-loop. The LLM reasons over curated signal, not raw transcripts.

4. Eighteen DITL hooks at every phase boundary

A ReAct loop has one or two hooks at the LLM call boundary. Hermes has pre_tool_call and pre_llm_call. Claude Code has PreToolUse and PostToolUse. Codex has system prompt injection. The hooks can read state, log, and block a tool call. They cannot intervene mid-loop at every phase.

PumpStation has 18 DITL hooks at every phase boundary of the harness. preInvokeFunction returns false to abort the run with InterventionTerminated. pathValidationFunction returns false to reject a path’s result. postGenerateFunction returns a P2PInterface to chain another agent. preCompactionFunction modifies the input to the summary agent. compactionRolledBackFunction overrides the backup restore. Every hook is a suspend function with the full harness state in scope.

The hooks are not middleware. They are first-class control points. The intervention surface is structural.
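
To make "control point" concrete, here is a minimal sketch of a pre-invoke gate. Only the InterventionTerminated name comes from this post. RunOutcome, Completed, HarnessState, and runWithPreInvoke are hypothetical, and the hook is a plain function here rather than a suspend function.

```kotlin
// Hypothetical run outcome; only InterventionTerminated is a name taken from the text.
sealed interface RunOutcome
data class Completed(val output: String) : RunOutcome
object InterventionTerminated : RunOutcome

// Hypothetical state snapshot handed to the hook.
data class HarnessState(val turnIndex: Int, val input: String)

// The hook sees the state before any work happens. Returning false aborts the run.
fun runWithPreInvoke(
    state: HarnessState,
    preInvoke: (HarnessState) -> Boolean,
    body: (HarnessState) -> String,
): RunOutcome =
    if (preInvoke(state)) Completed(body(state)) else InterventionTerminated
```

The difference from a logging hook is the return value: the hook's decision changes what the harness does next.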

Game changer. Two hook calls cannot intervene at every state transition. Hermes Agent’s pre_tool_call and pre_llm_call are the entire control surface. Claude Code’s PreToolUse and PostToolUse cover one boundary. LangGraph has no phase-aware hooks. PumpStation’s 18 DITL hooks fire at pre-init, at each turn boundary, post-path, pre-compaction, post-compaction, and every other state transition. Production observability is structural.

5. Three-state history with pre-prune transformation

PumpStation maintains three distinct history states. turnSummary is the string at the top of every prompt. turnHistory is the curated middle. rawTurnHistory is the full event log for DITL hooks and the goal agent.

The LLM sees turnSummary plus turnHistory, never rawTurnHistory. The default pre-prune transform drops blank turns, drops stash placeholders, collapses duplicate system messages, drops pure echoes, collapses tool-call and result pairs into one turn, strips excess metadata, normalizes whitespace, and drops turns already in the turnSummary. A custom pre-prune transform can be wired with setPrePruneTransform or appendPrePruneTransform.
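
A pre-prune transform is just a function from a list of turns to a shorter list of turns. The sketch below implements three of the default rules listed above as an illustration. Turn and prePrune are hypothetical names, and TPipe's real history type is not shown in this post.

```kotlin
// Hypothetical turn record; TPipe's real history type is not shown in this post.
data class Turn(val role: String, val text: String)

// Applies three of the default rules described above: normalize whitespace,
// drop blank turns, and collapse duplicate system messages (first occurrence wins).
fun prePrune(turns: List<Turn>): List<Turn> {
    val seenSystem = mutableSetOf<String>()
    return turns
        .map { it.copy(text = it.text.trim().replace(Regex("\\s+"), " ")) }
        .filter { it.text.isNotEmpty() }
        .filter { it.role != "system" || seenSystem.add(it.text) }
}
```

Whitespace is normalized first so that two system messages differing only in spacing count as duplicates.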

The LLM’s input is dense. No terminal vomit. No duplicate system messages. No tool-call sprawl. The LLM reasons over a curated stream, not a growing conversation.

Game changer. Every ReAct loop has one conversation. LangGraph’s MemorySaver persists state across runs but the LLM still sees the full transcript. CrewAI’s memory is a prose summary that drifts. Claude Code’s context grows without bound until truncation kicks in. PumpStation’s three-state history means the LLM sees curated signal. The raw event log survives for DITL hooks and the goal agent’s deep verification.

6. Native memory, compaction optional

PumpStation ships with TPipe’s memory substrate. ContextWindow plus MiniBank plus LoreBook plus TokenBudget plus TruncationSettings. The runtime context algorithm runs at every prompt build. Lorebook entries are selected by priority and weight. Multi-page MiniBank budget allocation is enforced. Token overflow triggers truncation, not compaction.

Three memory management modes. Compaction is the traditional summary-based path. Truncation is TPipe’s TokenBudget plus lorebook selection plus MiniBank allocation with no summarization. Hybrid runs both, auto-promoted from Compaction if a lorebook or summary agent is configured.

In Truncation mode, the developer does not bind a summary agent. The compaction phase returns SkippedNoAgent and does no work. The LLM never sees a compression step. The context management is deterministic. This is impossible in any ReAct loop, where compression is a hand-rolled prompt-trimming hack or a multi-layer cascade.
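
The no-agent behavior can be sketched as follows. Only SkippedNoAgent is a name from this post; CompactionResult, Compacted, runCompactionPhase, and truncateToBudget are hypothetical. The truncation shown is a plain newest-first cut, swapped in as a stand-in for TPipe's TokenBudget, lorebook, and MiniBank logic, which this post does not show.

```kotlin
// SkippedNoAgent follows the text; the other types are hypothetical.
sealed interface CompactionResult
object SkippedNoAgent : CompactionResult
data class Compacted(val summary: String) : CompactionResult

// With no summary agent bound, the phase is a no-op. Otherwise the agent produces the summary.
fun runCompactionPhase(
    history: List<String>,
    summaryAgent: ((List<String>) -> String)?,
): CompactionResult =
    if (summaryAgent == null) SkippedNoAgent else Compacted(summaryAgent(history))

// Stand-in truncation: keep the newest turns that fit the budget, preserving order.
fun truncateToBudget(history: List<String>, budgetTokens: Int, estimate: (String) -> Int): List<String> {
    var used = 0
    return history.asReversed()
        .takeWhile { turn ->
            used += estimate(turn)
            used <= budgetTokens
        }
        .asReversed()
}
```

Neither function calls a model when no agent is bound, so the same history and budget always yield the same context.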

Game changer. LangGraph’s CheckpointSaver does not compress. CrewAI’s memory is a single prose summary. Claude Code’s context management is window-based with no weighted key selection. PumpStation’s three modes (Compaction, Truncation, Hybrid) are configured at build time. The developer picks. The substrate handles it. The LLM never sees a compression step.

7. Stash and retrieve for oversized outputs

When a path’s output would exceed the context window, the harness stashes it. The full content moves to a stash map. A StashEntry is added to the manifest with id, sourcePath, createdTurn, reason (TokenOverflow, BinaryPayload, ErrorLog, UnsafeForPrompt, DeveloperRequested, BackgroundResult), tokenEstimate, byteSize, and preview. A StashCreated event is emitted. A placeholder goes into the turn history.

The LLM sees the reference. A path designed for stashed content can call getStashContent(stashId, station) to retrieve the full content. A follow-up path can parse the content, summarize it, write it to a file, or route it to a subagent. The multi-path pattern handles oversized outputs without inflating the conversation.

No ReAct loop has a stash. A 50K token tool result is a context blowout in any ReAct loop. In PumpStation it is a StashEntry with a reference and a follow-up path ready to handle the content.
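
A minimal sketch of the stash mechanics, assuming a simple in-memory map. The StashEntry fields and reason names follow the list above. The Stash class, its admit and retrieve methods, the id format, and the characters-divided-by-four token estimate are all hypothetical.

```kotlin
// Field and reason names follow the text; the classes themselves are a hypothetical sketch.
enum class StashReason { TokenOverflow, BinaryPayload, ErrorLog, UnsafeForPrompt, DeveloperRequested, BackgroundResult }

data class StashEntry(
    val id: String, val sourcePath: String, val createdTurn: Int, val reason: StashReason,
    val tokenEstimate: Int, val byteSize: Int, val preview: String,
)

class Stash(private val maxInlineTokens: Int) {
    private val contents = mutableMapOf<String, String>()
    val manifest = mutableListOf<StashEntry>()

    // Returns what should go into the turn history: the output itself, or a placeholder.
    fun admit(output: String, sourcePath: String, turn: Int): String {
        val tokenEstimate = output.length / 4 // crude heuristic, an assumption of this sketch
        if (tokenEstimate <= maxInlineTokens) return output
        val id = "stash-${manifest.size + 1}"
        contents[id] = output
        manifest.add(
            StashEntry(id, sourcePath, turn, StashReason.TokenOverflow,
                tokenEstimate, output.toByteArray().size, output.take(80))
        )
        return "[stashed $id from $sourcePath, ~$tokenEstimate tokens]"
    }

    // A follow-up path would call something like this to get the full content back.
    fun retrieve(id: String): String? = contents[id]
}
```

The placeholder costs a handful of tokens however large the original output was, and the manifest gives a follow-up path enough metadata to decide what to do with it.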

Game changer. LangGraph truncates and loses data. CrewAI splits across messages and loses context. Claude Code shows the raw output and burns context on the next turn. PumpStation stashes the content and keeps only a reference in the conversation. A follow-up path can parse it, summarize it, write it to a file, or route it to a subagent. The conversation stays lean. The data survives.

What this unlocks for TPipe

PumpStation is the eighth container, but the architectural surface area is the largest in the system. Three categories of unlock.

Long-horizon agents. PumpStation brings TPipe’s existing long-horizon capabilities to the harness layer. Three-state history (turnSummary, turnHistory, rawTurnHistory) means an agent survives hundreds of turns without context blowout. LoreBook provides weighted key-based recall. MiniBank handles multi-page context. TokenBudget allocates deterministically. The harness sits on TPipe’s memory primitives. An agent that runs for days survives every handoff.

Production safety. The kill switch propagates per phase. The path-safety agent gates medium and high risk paths. 18 DITL hooks give developers first-class intervention points. Compaction can be turned off entirely (Truncation mode). Production is configuration, not research.

Composable orchestration. PumpStation implements P2PInterface. A pump station can nest inside another pump station. A Manifold stage can be a PumpStation. A DistributionGrid node can be a PumpStation. The P2P layer makes the eight containers nestable. An agent on another host is a node in your manifold. The substrate composes.

What PumpStation actually is

PumpStation is a Kotlin class at com.TTT.Pipeline.PumpStation. The runtime lives in eight source files totalling 10,950 lines. The class implements P2PInterface, so a pump station can nest inside another pump station, sit inside a Manifold stage, or live as a node in a DistributionGrid. The class is a P2P agent that drives itself.

The full lifecycle has 14 phases. PreInit runs once at startup. HealthCheck fires proactively before the judge on interval or error ratio. Judge runs (or is skipped in FlagTriggered mode). Dispatch runs. PathSafety gates medium and high risk paths. PathExecution calls the path. PathValidation runs the DITL hook. Intervention fires reactively after a path failure. ForegroundAgents and BackgroundAgents fire harness-level agents. MemoryUpdate queues lorebook and summary work. Compaction runs the v3 orchestrator. GoalValidation runs in runExitFlow when the judge or a path signals completion. Exit emits HarnessCompleted or HarnessFailed.

The two-scope structure is the most-missed concept in the design. Conflating the two scopes is the most common documentation error.

The outer scope is runHarnessLoop. It runs while (turnIndex < maxTurns && status == Running), calling runTurn each iteration. After the while loop exits, it runs runFinalizationPhase once.

The inner scope is runTurn. It runs the per-turn phases and returns Continue or Halt(reason).

The transition between the scopes is runExitFlow. It runs when the judge says complete, or when a path returns passPipeline: true. With no goal agent configured, the harness exits with JudgeComplete or PassSignal respectively. With a goal agent, the goal validates. Pass means deliver. Fail means re-loop with the goal’s critique appended to history, up to maxGoalFailAttempts.

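When a goal agent is configured, the judge’s isComplete: true does not end the run. It triggers a transition into goal validation. Most agent frameworks conflate the two signals: "the model says it is done" and "the work has been verified". PumpStation treats completion and verification as distinct phases.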
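
The two scopes and the exit transition fit in one small class. This is a sketch of the control flow only, with most phases omitted. The names runHarnessLoop, runTurn, runExitFlow, maxTurns, maxGoalFailAttempts, Continue, Halt, and JudgeComplete mirror the post. MiniHarness, GoalVerdict, and the halt reasons "GoalPassed", "GoalFailed", and "MaxTurnsReached" are hypothetical.

```kotlin
sealed interface TurnResult
object Continue : TurnResult
data class Halt(val reason: String) : TurnResult

data class GoalVerdict(val passed: Boolean, val critique: String = "")

class MiniHarness(
    private val maxTurns: Int,
    private val maxGoalFailAttempts: Int,
    private val judgeSaysComplete: (List<String>) -> Boolean,
    private val runPath: (Int) -> String,
    private val goal: ((List<String>) -> GoalVerdict)? = null,
) {
    val history = mutableListOf<String>()
    private var goalFailures = 0

    // Inner scope: one turn, returning Continue or Halt(reason).
    private fun runTurn(turnIndex: Int): TurnResult {
        if (judgeSaysComplete(history)) return runExitFlow()
        history.add(runPath(turnIndex))
        return Continue
    }

    // Transition: completion was signalled, so verify before delivering.
    private fun runExitFlow(): TurnResult {
        val goalAgent = goal ?: return Halt("JudgeComplete")
        val verdict = goalAgent(history)
        if (verdict.passed) return Halt("GoalPassed")
        goalFailures++
        if (goalFailures >= maxGoalFailAttempts) return Halt("GoalFailed")
        history.add("goal critique: ${verdict.critique}") // the re-loop sees the critique
        return Continue
    }

    // Outer scope: loop over turns until a Halt or the turn limit.
    fun runHarnessLoop(): String {
        var turnIndex = 0
        var halt: Halt? = null
        while (turnIndex < maxTurns && halt == null) {
            halt = runTurn(turnIndex) as? Halt
            turnIndex++
        }
        return halt?.reason ?: "MaxTurnsReached"
    }
}
```

Note that a failed verification returns Continue, not Halt. The judge saying "done" and the goal saying "verified" are separate decisions made by separate functions.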

The eight magic contracts

PumpStation exposes eight LLM-facing JSON contracts. Each contract has a parser, a strictness policy, and a fallback. Strict where the harness needs the data to proceed (Dispatch, Path-Safety). Lenient where defaults flow through (Judge, Health, Lorebook, Summary).

The eight contracts: Judge decides completion. Dispatch picks the path (with one repair loop on parse failure). Path returns multimodal output via flags. Goal verifies via the terminatePipeline flag. Path-Safety gates medium and high risk paths with strict boolean JSON. Health monitors runtime state via a HealthReport envelope. Lorebook updates memory keys via a typed envelope. Summary produces the next turnSummary string. (Full JSON schemas in the magic-contracts doc.)
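
The strict versus lenient split can be illustrated with two tiny parsers. This is a sketch, not TPipe's parser: the function names are hypothetical, the "allowed" field name is an assumption since the Path-Safety schema is not given here, and regular expressions stand in for a real JSON library. The isComplete field is the one named in this post.

```kotlin
// Strict: the harness cannot proceed without a clean boolean, so anything else is an error.
// The field name "allowed" is hypothetical; the post does not give the schema.
fun parsePathSafetyStrict(raw: String): Boolean {
    val match = Regex("""\s*\{\s*"allowed"\s*:\s*(true|false)\s*\}\s*""").matchEntire(raw)
        ?: throw IllegalArgumentException("path-safety reply is not strict boolean JSON: $raw")
    return match.groupValues[1] == "true"
}

// Lenient: a malformed judge reply falls back to "not complete" and the loop continues.
fun parseJudgeLenient(raw: String): Boolean =
    Regex("""isComplete"\s*:\s*(true|false)""").find(raw)?.groupValues?.get(1) == "true"
```

The strict parser refuses a reply with chatter around the JSON. The lenient one tolerates it and defaults safely when the field is missing.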

The harness is intelligent about this. It auto-injects the contracts that are required: the Judge, Dispatch, Goal, Path-Safety, Health, Lorebook, and Summary contracts are injected into the relevant pipes at content-build time, and the path descriptor protocol is injected into the dispatch pipe. Contracts the developer does not bind fall back to the flag-driven interface. The developer writes no JSON.

The contracts are documented so developers know they exist, not so developers have to write them. The system is fundamentally designed around flags on the returned MultimodalContent: passPipeline, terminatePipeline, interuptPipeline. Unlike the Manifold, where JSON contracts are mandatory, the PumpStation developer does not own a contract. The cognitive burden of remembering magic contracts is lifted. The developer has vastly more freedom in agent design patterns.

The path system

PathObject is the harness’s atom. A path can run a local function, host an internal Pipeline, build a fresh agent per invocation, or bind a PCP function. A path can be another PumpStation. Risk levels (Low, Medium, High) gate execution through the path-safety agent. Reserve paths hide from the dispatch agent until a predicate fires, keeping the dispatch prompt bounded.
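
Reserve paths and risk gating reduce to two small predicates. The sketch below assumes a simplified descriptor: PathDescriptor, revealWhen, visiblePaths, and mayExecute are hypothetical names, and the state map is a stand-in for whatever the harness actually exposes to a predicate. Only the Low, Medium, and High levels come from the text.

```kotlin
enum class Risk { Low, Medium, High }

// Hypothetical descriptor; PathObject's real shape is not shown in the post.
data class PathDescriptor(
    val name: String,
    val risk: Risk = Risk.Low,
    val revealWhen: ((Map<String, String>) -> Boolean)? = null, // null means always visible
)

// What the dispatch agent is allowed to see this turn.
fun visiblePaths(all: List<PathDescriptor>, state: Map<String, String>): List<String> =
    all.filter { it.revealWhen?.invoke(state) ?: true }.map { it.name }

// Medium and High risk paths go through the safety gate; Low risk paths run directly.
fun mayExecute(path: PathDescriptor, safetyGate: (PathDescriptor) -> Boolean): Boolean =
    path.risk == Risk.Low || safetyGate(path)
```

A rollback path that only appears after a failed deploy never costs prompt tokens on the turns where it is irrelevant.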

When to reach for PumpStation vs Manifold

Reach for Manifold when the steps are known but the order is not, or when the agents live on different machines and you need P2P routing between them. Manifold is a state machine. The state graph is fixed at build time. The manager is deterministic. The workers are typed. You wire the stages, the manifold runs them.

Reach for PumpStation when the problem is not known and neither is the solution, but you have a rough workflow you want to execute to solve whatever the incoming task could be. The judge evaluates completion. The dispatch picks the next path. The goal verifies. The paths are LLM-decided. The harness is a runtime whose next step is chosen by an LLM each turn, not a state graph fixed at build time.

Manifold is for the case where you know the steps. PumpStation is for the case where you do not.

Manifold is also for the case where the agents are not on your machine. PumpStation executes in a single process. A path can be another PumpStation, but a Manifold can route to a DistributionGrid, and a DistributionGrid can route across machines. If the agent is on another host, the right container is Manifold over DistributionGrid, not PumpStation.

A working example

Here is the smallest harness that runs a single-turn task end to end. The PumpStationDefaults.withOpenRouter factory wires judge plus dispatch plus kill switch plus memory defaults plus tracing in one call. The developer adds the paths and calls executeLocal. This is the Ten Trillion Triangles TPipe equivalent of “hello world” for an LLM agent runtime.

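// `config` is assumed to be defined elsewhere (the provider configuration for OpenRouter).
val station = PumpStationDefaults.withOpenRouter(config) {
    // One path is enough here: the dispatch agent can only pick "answer".
    path("answer") {
        description = "Responds with a one-sentence answer and signals pass-pipeline."
        // Only the incoming content is used; the other three parameters are ignored.
        setExecutionFunction { content, _, _, _ ->
            // passPipeline = true signals completion to the harness.
            MultimodalContent(text = "ok: ${content.text}").apply { passPipeline = true }
        }
    }
}
// Runs the harness in-process on a single input.
val result = station.executeLocal(MultimodalContent(text = "Say 'hello' and stop."))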

The full source ships with four working examples covering all three exit mechanisms plus a kill switch trip. See TPipe-Defaults/src/main/kotlin/examples/pumpstation/PumpStationOpenRouterExample.kt.

The closer

Every other LLM agent framework in 2026 is a ReAct loop. The ReAct loop is the prototype. The state machine is the platform. PumpStation is the runtime. The LLM is the controller. The substrate is the brain.

PumpStation ships in the Apache 2.0 Manifold tier. The production harness is the same harness that ships to users. The LLM that runs the judge in production is the same LLM that runs the judge in development. The substrate is what survives the difference.

The KillSwitch: Token Budgets That Actually Kill the Agent — The prior post in this series. How the per-phase cost cap propagates through every PathObject in the station, the 66-line Kotlin file that ends with throw, and the catch-and-rethrow carve-out that defends the termination.
The Open Source Lie: What Every AI Agent Stack Actually Costs in 2026 — The TCO table every TPipe license tier sits in. PumpStation ships in the Apache 2.0 Manifold tier; the post shows the 3-year bill versus the frontier-locked and DIY alternatives.
Building Your First TPipe Pipeline — The pipeline configuration that PumpStation sits on top of. The pump station is a P2PInterface; the pipeline is the substrate beneath it.
Reasoning Pipes Explained: How TPipe Stops Prompting and Starts Programming — Why the eight magic contracts are JSON schemas, not free-form prompts. The dispatch and judge pipes in PumpStation are reasoning pipes; the LLM is treated as a compiler, not a conversation partner.

Frequently Asked Questions

What is PumpStation in one sentence?

PumpStation is a runtime agentic harness that drives a two-scope loop (judge, dispatch, path, memory, goal) over a set of typed PathObjects, with an LLM judge for completion and a separate LLM goal agent for verification. The LLM is the controller. The state machine is the substrate.

How is PumpStation different from a ReAct loop?

A ReAct loop is one LLM with a flat tool list and a while loop. The LLM is the brain. The tools are the hands. The conversation is the memory. PumpStation is a state machine with three LLMs in three roles (judge, dispatch, goal), a typed path system, a three-state history, a kill switch, an 18-hook DITL surface, a v3 compaction substrate, and a stash for oversized outputs. The loop is the substrate. The LLM is the controller.

How is PumpStation different from Manifold?

Manifold is a state machine. The stages are known at build time. The routing is deterministic. The workers are typed. Manifold is for when the steps are known but the order is not, or when the agents live on different machines and you need P2P routing. PumpStation is for when the problem is not known and the LLM has to decide. Manifold is the deterministic container. PumpStation is the LLM-driven container.

Does PumpStation work without compaction?

Yes. Set memoryManagementMode to Truncation. The harness uses TPipe's TokenBudget plus lorebook selection plus MiniBank allocation. No summary agent is bound. The compaction phase returns SkippedNoAgent and does no work. Context management is deterministic. The LLM never sees a compression step.

Can a PumpStation be nested inside a Manifold?

Yes. A pump station is a P2PInterface. A PathObject's internal agent can be another PumpStation. A Manifold stage can be a PumpStation. A DistributionGrid node can be a PumpStation. The P2P layer makes the eight containers nestable.

How is the cost controlled?

The KillSwitch. The harness accumulates token usage after every judge, dispatch, and path phase. When the configured input or output limit is exceeded, the kill switch trips. The default callback throws KillSwitchException. The outer loop catches it and sets lastError=KillSwitchTripped, runFinalizationPhase emits HarnessFailed, and the exception is then re-thrown. The kill switch is propagated to every PathObject in the station. Per-path limits are honored independently.
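
The catch, record, finalize, re-throw sequence can be sketched as below. Only KillSwitchException, KillSwitchTripped, and the overall flow come from this answer. The KillSwitch class shape, LoopStatus, and runGuarded are hypothetical, and the finally block stands in for runFinalizationPhase.

```kotlin
class KillSwitchException(message: String) : RuntimeException(message)

class KillSwitch(private val maxInputTokens: Long, private val maxOutputTokens: Long) {
    private var inputTokens = 0L
    private var outputTokens = 0L

    // Called after each judge, dispatch, or path phase with that phase's usage.
    fun record(phase: String, input: Long, output: Long) {
        inputTokens += input
        outputTokens += output
        if (inputTokens > maxInputTokens || outputTokens > maxOutputTokens) {
            throw KillSwitchException("limit exceeded after $phase: in=$inputTokens out=$outputTokens")
        }
    }
}

class LoopStatus {
    var lastError: String? = null
    var finalized = false
}

// Outer loop: note the error, always finalize, and let the exception propagate.
fun runGuarded(status: LoopStatus, body: () -> Unit) {
    try {
        body()
    } catch (e: KillSwitchException) {
        status.lastError = "KillSwitchTripped"
        throw e
    } finally {
        status.finalized = true // stands in for runFinalizationPhase
    }
}
```

Re-throwing matters. A caller that wraps the harness still sees the exception, so a tripped budget cannot be mistaken for a normal completion.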

What LLM providers does PumpStation work with?

Any provider that exposes an OpenAI-compatible chat completions endpoint or an Anthropic-compatible messages endpoint. The PumpStationDefaults.withOpenRouter factory is the reference. Pipe models wrap the model. The harness wraps the pipe.

How do I get started?

The Apache 2.0 Manifold tier includes the full TPipe feature set. The PumpStation source ships in the main branch. The TPipe-Defaults package has a PumpStationDefaults.withOpenRouter factory and a four-example file. The harness is a class, not a service. It runs as java -jar myapp.jar or as a GraalVM native binary.