Reasoning Pipes
Force any LLM to think using structured JSON control over token prediction. Chain-of-Draft, Role Play Reasoning, and other thinking modes — bypasses internal weights for true behavioral control.
TPipe is an Agent Operating Substrate — production infrastructure for autonomous AI agents that survive long-horizon tasks without memory loss or failure recovery gaps.
Built for agents that run for days, not minutes. Reasoning Pipes use structured JSON control over token prediction to force any LLM to think, regardless of native capability. Persistent ContextBank memory across distributed runs. Pipeline orchestration with pause/resume/jump. P2P agent coordination without dispatcher bottlenecks.
TPipe is infrastructure your agents inhabit — providing the foundation to build the agents of tomorrow, today.
What makes TPipe different from traditional agent frameworks?
ContextBank: Thread-safe persistent memory with weighted lorebook injection, substring-triggered activation, and token-budget-aware retrieval. Custom hooks for write-back and on-read transformations.
Pipeline: Sequential orchestration with pause/resume/jump control. Declarative pause points for developer-in-the-loop validation.
Manifold: Stateful multi-agent orchestration. A manager pipeline dispatches to registered workers and cycles until an explicit pass or terminate, with configurable context truncation and overflow protection.
PCP (Pipe Context Protocol): Secure multi-language function calling. Transport executors for Stdio, HTTP, Python, Kotlin, JavaScript.
Junction: Collaborative discussion, voting, and workflow handoff between pipeline agents, enabling multi-agent consensus.
DistributionGrid: 8,773 LOC of distributed infrastructure covering node routing, P2P discovery, remote pipeline handoff, and cluster orchestration in a single container.
P2P: All TPipe containers implement P2PInterface, which provides registry-based discovery, capability registration, and secure cross-pipe calls via TPipe, HTTP, or STDIO transports.
See TPipe powering real-world AI systems
Multi-agent debugger. Agents test bugs, capture crashes, step through code via DAP/ADB. Long-horizon reliability for debugging complex systems.
Headless game master. 25 rounds and 120+ turns of inference without degradation. Qwen 30B outperforms Claude Opus on complex narrative tasks.
Long-horizon manuscript orchestration. 300-page coherent manuscript with multi-stage refinement, maintaining consistency across thousands of generations.
How does TPipe orchestrate AI agents?
TPipe is an Agent Operating Substrate — not a library you call, an environment your agents inhabit.
| Feature | TPipe | LangChain | CrewAI |
|---|---|---|---|
| Category | Agent Operating Substrate | Agent Framework (library) | Agent Framework (crew) |
| Memory | ContextBank: persistent, global, thread-safe via per-key mutex locks. `emplaceWithMutex` / `getContextFromBank` for thread-safe writes and reads. LoreBook entries activate via substring matching (key + aliasKeys), weighted retrieval, token-budget-aware selection. | ConversationMemory object, scoped to a single run | Task output persistence via SQLite (KickoffTaskOutputsSQLiteStorage). Optional LongTermMemory for long-term storage. Per-crew, not per-run; survives crew restarts. |
| Reasoning | 8 reasoning methods: Structured CoT, Explicit CoT, Process-Focused CoT, Best Idea, Comprehensive Plan, Role Play, Chain of Draft, Semantic Decompression. 5 injectors (system prompt, before/after user prompt, converse history, context). Multi-round Focus Points. | Prompt engineering within chains — LLM thinks however it wants | Prompt engineering within agent roles — LLM thinks however it wants |
| Token Governance | Token counting + truncation (ContextWindow, LoreBook, MiniBank, Dictionary). Tunable per-model tokenizer with TPipe-Tuner. Memory resource management — NOT a termination mechanism. | Token limits are advisory — set per-call with max_tokens | Token limits are advisory — set per-call with max_tokens |
| Long-Horizon Tasks | Autogenesis runs continuously, processing hundreds of millions of tokens with zero drift failures. 120+ turn tasks validated in production. ContextBank + LoreBook keep memory reproducible across long horizons. | Context degrades past 30–50 turns without manual truncation | Context degrades past 30–50 turns without manual truncation |
| Safety / Governance | KillSwitch — uncaught exception, bypasses all retry, propagates through container hierarchy. Manifold loop limit (default 100, throws ManifoldLoopLimitExceededException) — Manifold only. TraceServer is a separate module: REST + WebSocket dashboard with dual auth (agent bearer + client session). | Retry policies — can be caught and ignored | Retry policies — can be caught and ignored |
| DITL Hooks | 18 named hooks across three layers: PumpStation (preInitFunction, preValidationJudgeFunction, preValidationDispatchFunction, preInvokeFunction, postGenerateFunction, pathValidationFunction), Pipe (validatorPipe, validatorFunction, transformationPipe, transformationFunction, branchPipe, onFailure), Pipeline callbacks (preValidationFunction, conditionalPauseFunction, pauseCallback, resumeCallback, pipeCompletionCallback, pipelineCompletionCallback). | Callback hooks (limited) | No native support |
| Flow Control | Pause / Resume / Jump at declarative points. pauseBeforePipes(), pauseAfterPipes(), pauseOnCompletion(), pauseWhen with a predicate, enablePausing(). Pipes jump via validation return. passPipeline / terminatePipeline flags on MultimodalContent. | Conditional edges in graph — explicit programming required | Process/Task hooks — limited to process-level |
| Multi-Agent | Manifold (state-machine), Junction (voting/handoff), DistributionGrid (cluster): three distinct patterns | LangGraph: graph-based orchestration only | CrewAI: role-based crews with a manager hierarchy |
| Agent-to-Agent | P2P pipe-to-pipe. Registry-based discovery via P2PDescriptor. Transports: TPipe, HTTP, Stdio. Built into all P2PInterface containers. Per-agent security boundary. | No native P2P — requires external service mesh | No native P2P — inter-crew via external services |
| Deployment | JVM-native (Kotlin), headless-first. Runs as java -jar TPipe-*.jar on JVM 24. GraalVM Native Image — 50MB binary, no JVM at runtime, sub-128MB footprint, millisecond startup, ARM and mobile targets. | Python runtime required — full interpreter | Python runtime required — full interpreter |
| Tool Calling | PCP — Pipe Context Protocol. 6 transports: Stdio, TPipe, HTTP, Python, Kotlin, JavaScript. Security managers per language. Access control via allowedDirectoryPaths, forbiddenDirectoryPaths, allowedFiles, forbiddenFiles. Output validated through PcPResponseParser. | Standard function/tool calling with LCEL — no structured validation gate between tools and next step | Standard function/tool calling — no structured validation gate between tools and next step |
| Runtime | Linux, macOS, Windows, ARM. JVM 24. GraalVM Native Image — 50MB binary, no JVM at runtime, millisecond startup, ARM and mobile targets. | Python only | Python only |
Agent infrastructure — not a framework you call, an environment your agents inhabit.
TPipe provides: Reasoning Pipes (Chain-of-Draft, 75% token reduction), persistent ContextBank memory across distributed systems, Pipeline orchestration with pause/resume/jump, multi-agent patterns (Manifold, Junction, DistributionGrid), and strict token governance enforced top-down. Agents run inside TPipe — they don't invoke it.
Different category — TPipe is infrastructure, not a library.
LangChain and CrewAI are Python frameworks you call. TPipe is an OS-like substrate your agents live inside. Key practical differences: persistent ContextBank memory across sessions vs conversation objects scoped to a single run; token budgets enforced top-down vs per-call limits; headless-first GraalVM binary vs Python scripts. See the full comparison table for the complete picture.
A pipe that forces any LLM to think — including models with no native thinking mode.
Reasoning Pipes work by using structured JSON to control left-to-right token prediction, forcing the LLM to emit its thinking in a structured form before it produces output. Chain-of-Draft, Role Play Reasoning, and other thinking modes are different control structures applied through this mechanism. The key insight: this sidesteps the model's default behavior patterns, so you control what the model focuses on and when, independent of what the model was fine-tuned to do.
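The ordering idea can be illustrated with a minimal sketch. It assumes the control structure is simply a JSON object whose keys must be emitted in a fixed order, so that "thinking" keys are generated before the answer key. Every name here (`Step`, `buildReasoningPrompt`, `followsOrder`, `chainOfDraft`) is hypothetical and is not part of TPipe's API.

```kotlin
// Hypothetical sketch: none of these names come from TPipe.

/** One JSON key the model must emit, in order, before moving to the next. */
data class Step(val key: String, val instruction: String)

/** Builds a prompt that pins the order of JSON keys in the reply. */
fun buildReasoningPrompt(task: String, steps: List<Step>): String = buildString {
    appendLine("Task: $task")
    appendLine("Reply with one JSON object. Emit the keys in exactly this order:")
    appendLine("{")
    appendLine(steps.joinToString(",\n") { "  \"${it.key}\": \"<${it.instruction}>\"" })
    append("}")
}

/** Returns true when every required key appears in the reply in the required order. */
fun followsOrder(reply: String, steps: List<Step>): Boolean {
    var cursor = 0
    for (step in steps) {
        val at = reply.indexOf("\"${step.key}\"", cursor)
        if (at < 0) return false
        cursor = at + 1
    }
    return true
}

// An assumed Chain-of-Draft style layout: short notes first, answer last.
val chainOfDraft = listOf(
    Step("draft", "terse working notes, a few words per step"),
    Step("check", "one line verifying the draft"),
    Step("answer", "final answer only"),
)
```

Because generation runs left to right, a reply that passes `followsOrder` has necessarily produced its draft and check text before any answer tokens. A rejected reply could be sent back through the same prompt.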
Persistent memory that survives distributed systems — not a conversation object, a shared state layer.
ContextBank persists across sessions and distributed nodes. Weighted lorebook injection with substring-triggered activation. Token-budget-aware retrieval — ContextBank doesn't just store, it selects what to surface based on the current context. Custom hooks for write-back and on-read transformations. Survives 120+ turn conversations without degradation.
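The retrieval behavior described above (substring-triggered activation, weighting, and a token budget) can be sketched roughly as follows. This is an illustration under stated assumptions, not ContextBank's implementation: `LoreEntry`, `estimateTokens`, and `selectLore` are hypothetical names, and the four-characters-per-token estimate stands in for a real tokenizer.

```kotlin
// Hypothetical sketch: illustrative types, not TPipe's ContextBank or LoreBook API.

data class LoreEntry(
    val key: String,
    val aliases: List<String> = emptyList(),
    val weight: Int,
    val text: String,
)

/** Crude estimate of about 4 characters per token. An assumption, not a real tokenizer. */
fun estimateTokens(text: String): Int = (text.length + 3) / 4

/**
 * Picks the entries whose key or alias occurs in the context (case-insensitive),
 * then admits them in descending weight order while they still fit the budget.
 */
fun selectLore(entries: List<LoreEntry>, context: String, tokenBudget: Int): List<LoreEntry> {
    val haystack = context.lowercase()
    val triggered = entries.filter { entry ->
        (listOf(entry.key) + entry.aliases).any { haystack.contains(it.lowercase()) }
    }
    var remaining = tokenBudget
    val chosen = mutableListOf<LoreEntry>()
    for (entry in triggered.sortedByDescending { it.weight }) {
        val cost = estimateTokens(entry.text)
        if (cost <= remaining) {
            chosen.add(entry)
            remaining -= cost
        }
    }
    return chosen
}
```

The greedy pass favors high-weight entries and skips any entry that would overflow the budget, so lower-weight but smaller entries can still be admitted afterwards.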
Three distinct patterns — not one-size-fits-all.
Manifold: state-machine manager-worker orchestration. Manager dispatches to registered workers, cycles until explicit pass or terminate, configurable context truncation and overflow protection. Junction: democratic voting and workflow handoff between pipeline agents. DistributionGrid: cluster-wide P2P routing with 8,773 LOC of distributed infrastructure. Each handles a different collaboration topology — coordinated teams, peer-to-peer handoff, or node-spanning clusters.
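As a rough illustration of the manager-worker cycle described for Manifold, the sketch below loops until the manager returns an explicit pass or terminate, and stops at a loop limit (the default of 100 is taken from the text). The types `Decision`, `Dispatch`, `Pass`, `Terminate`, `LoopLimitExceeded`, and the function `runManifold` are hypothetical stand-ins, not TPipe's classes.

```kotlin
// Hypothetical sketch: illustrative types, not TPipe's Manifold API.

sealed interface Decision
data class Dispatch(val worker: String, val input: String) : Decision
data class Pass(val result: String) : Decision
data class Terminate(val reason: String) : Decision

/** Stand-in for a loop-limit exception. */
class LoopLimitExceeded(limit: Int) : RuntimeException("manager loop exceeded $limit iterations")

/**
 * Asks the manager what to do next, given the worker outputs so far.
 * Returns on Pass, fails on Terminate, and throws once the loop limit is reached.
 */
fun runManifold(
    manager: (history: List<String>) -> Decision,
    workers: Map<String, (String) -> String>,
    loopLimit: Int = 100,
): String {
    val history = mutableListOf<String>()
    repeat(loopLimit) {
        when (val decision = manager(history)) {
            is Pass -> return decision.result
            is Terminate -> error("terminated: ${decision.reason}")
            is Dispatch -> {
                val worker = workers[decision.worker] ?: error("unknown worker ${decision.worker}")
                history.add(worker(decision.input))
            }
        }
    }
    throw LoopLimitExceeded(loopLimit)
}
```

In a real system the manager and workers would be LLM-backed pipelines and the history would be truncated to fit a context window. Plain functions are used here to keep the control flow visible.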
Two steps: configure a Pipe, compose into a Pipeline.
Install via Gradle. Configure a Pipe with your model (Bedrock, Ollama, OpenRouter — or any LLM via transport executors). Set a TokenBudget. Add a Reasoning Pipe if you want Chain-of-Draft. Chain pipes into a Pipeline with pause/resume/jump at declarative validation points. Start simple — one pipe — and compose as your system grows.
Pipes chain sequentially — output of one becomes input of the next. Every transition is a validation point.
Declarative pause points let you insert a blocking state at any pipe boundary. Resume continues with updated context. Jump skips forward or backward to any named pipe. Every pipe has 7 DITL intervention points: Pre-Init, Pre-Validation, Pre-Invoke, Post-Generate, Validator, Transformation, On-Failure. Human validation gates are part of the pipeline declaration — not bolted on.
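The pause, resume, and jump mechanics can be pictured with a small state machine. This is a sketch under assumptions: `Pipe`, `MiniPipeline`, `pauseBefore`, and `jumpTo` are hypothetical names chosen for illustration and do not reflect TPipe's Pipeline API.

```kotlin
// Hypothetical sketch: illustrative types, not TPipe's Pipeline API.

class Pipe(val name: String, val transform: (String) -> String)

class MiniPipeline(
    private val pipes: List<Pipe>,
    private val pauseBefore: Set<String> = emptySet(),
) {
    private var index = 0
    private var resuming = false

    /** Content handed from pipe to pipe; callers may edit it while paused. */
    var content: String = ""

    /**
     * Runs until the next declared pause point or the end.
     * Returns the name of the pipe it paused before, or null when finished.
     */
    fun run(): String? {
        while (index < pipes.size) {
            val pipe = pipes[index]
            if (pipe.name in pauseBefore && !resuming) {
                resuming = true // the next call to run() continues past this point
                return pipe.name
            }
            resuming = false
            content = pipe.transform(content)
            index++
        }
        return null
    }

    /** Moves the cursor forward or backward to a named pipe. */
    fun jumpTo(name: String) {
        val target = pipes.indexOfFirst { it.name == name }
        require(target >= 0) { "no pipe named $name" }
        index = target
        resuming = false
    }
}
```

A caller would invoke `run()`, inspect or replace `content` when a pipe name comes back, and call `run()` again to resume. Calling `jumpTo` before `run()` re-enters the sequence at the named pipe.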
Native code entry points at every phase of pipe execution — not bash hooks, not string manipulators.
Each of the 7 intervention points gives you direct access to the content object and TPipe substrate. Inspect or modify context before the LLM sees it. Validate or transform output before the next pipe receives it. Redirect flow based on conditions you evaluate. Inject logic without touching the core pipe — the substrate handles the mechanics, your code handles the judgment.
KillSwitch is a forced termination that cannot be caught and ignored.
When token limits are exceeded, KillSwitch propagates as an uncaught exception — no retry handler can absorb it, no fallback can silently continue. Loop Limit halts after configured iterations (default: 100) and throws ManifoldLoopLimitExceededException. Both are fail-safe mechanisms: when something goes wrong in a TPipe system, it stops — it doesn't limp.
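One way to get "no retry handler can absorb it" behavior on the JVM, assuming retry handlers catch `Exception`, is to make the kill signal a subclass of `Error`. The sketch below shows only that idea. The `KillSwitch` class here is a stand-in, `withRetry` and `guardedCall` are hypothetical helpers, and nothing in it is claimed to match TPipe's implementation.

```kotlin
// Hypothetical sketch: a stand-in KillSwitch, not TPipe's class.

/** Extends Error, so handlers that catch Exception never see it. */
class KillSwitch(reason: String) : Error(reason)

/** A typical retry helper: it absorbs Exception and tries again. */
fun <T> withRetry(attempts: Int, block: (attempt: Int) -> T): T {
    var last: Exception? = null
    for (attempt in 1..attempts) {
        try {
            return block(attempt)
        } catch (e: Exception) {
            last = e // recoverable failure: remember it and retry
        }
    }
    throw IllegalStateException("all $attempts attempts failed", last)
}

/** Throws the kill signal when usage exceeds the limit. */
fun guardedCall(tokensUsed: Int, tokenLimit: Int): String {
    if (tokensUsed > tokenLimit) {
        throw KillSwitch("token limit $tokenLimit exceeded: $tokensUsed used")
    }
    return "ok"
}
```

An ordinary exception thrown inside `withRetry` is retried, while a `KillSwitch` passes straight through the `catch (e: Exception)` clause on the first attempt. Code that catches `Throwable` would still intercept it, so this pattern relies on handlers not doing that.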
Any LLM accessible via standard transport executors — AWS Bedrock, Ollama, OpenRouter are instances of the pattern.
Transport executors: Stdio, HTTP, Python, Kotlin, JavaScript. If you can send a request and receive a response, you can build a Pipe around it. Configure credentials via environment variables or IAM roles. TPipe's P2P interface means any container can discover and call any other container — model selection is a configuration choice, not an architectural constraint.
Real-time WebSocket streaming to a browser dashboard — every decision captured, indexed, and replayable.
Configure detail levels from Minimal to Debug. Output formats: JSON, HTML, Markdown. Automatic cycle detection for nested pipes. Enable with setTraceConfig(), connect browsers to the TraceServer endpoint. Every trace is a full execution record — what the LLM received, what it produced, what the DITL hooks did, where tokens were spent. Useful for production auditing and for reproducing production failures in a local debug loop.
TokenBudgetSettings enforces strict top-down accounting — not advisory limits.
Configure max tokens per pipe, context window size, and automatic truncation strategies (Top, Bottom, Middle). Token budgets can subtract from input rather than output — meaning you can carve out space for lorebook context before the main prompt hits the window. KillSwitch fires automatically on overrun. The same input reliably produces the same output — which is the requirement for enterprise compliance.
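The truncation strategies and the "subtract from input" budgeting can be sketched as below. The sketch assumes each strategy names the part of the content that is removed (Top drops the beginning, Bottom drops the end, Middle drops the center); the text does not spell this out, so treat it as a guess. `Truncate`, `truncate`, and `promptBudget` are hypothetical names, not TokenBudgetSettings members.

```kotlin
// Hypothetical sketch: illustrative names, not TPipe's TokenBudgetSettings API.

/** Assumed meaning: each strategy names the region that gets removed. */
enum class Truncate { TOP, BOTTOM, MIDDLE }

/** Cuts a token list down to the budget using the chosen strategy. */
fun truncate(tokens: List<String>, budget: Int, strategy: Truncate): List<String> {
    require(budget >= 0) { "budget must not be negative" }
    if (tokens.size <= budget) return tokens
    return when (strategy) {
        Truncate.TOP -> tokens.takeLast(budget)   // drop the oldest tokens
        Truncate.BOTTOM -> tokens.take(budget)    // drop the newest tokens
        Truncate.MIDDLE -> {
            val head = (budget + 1) / 2           // keep both ends, cut the center
            tokens.take(head) + tokens.takeLast(budget - head)
        }
    }
}

/** Input budget left for the prompt after reserving room for output and lorebook context. */
fun promptBudget(contextWindow: Int, maxOutput: Int, loreReserve: Int): Int =
    (contextWindow - maxOutput - loreReserve).coerceAtLeast(0)
```

For example, with an 8192-token window, 1024 tokens reserved for output, and 2048 reserved for lorebook context, the prompt is truncated to 5120 tokens before it reaches the model.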
GraalVM Native Image ships as a 50MB binary — no JVM required, sub-128MB memory footprint.
TPipe is headless-first: no UI, runs as a cluster of headless processes. P2P Registry enables agent discovery across nodes. DistributionGrid handles cluster-wide orchestration. Docker and Kubernetes compatible — the same binary that runs locally runs in a pod. GraalVM Native Image means startup in milliseconds, not minutes. Linux, macOS, Windows, ARM, Android (.so), iOS (.dylib).
Same input, same output — every time. Deterministic execution is the product.
Configurable timeout/retry with Fail, Retry, or CustomLogic strategies. Snapshot-based state restoration on retry — parent pipe failure propagates recursively to child pipes. TraceServer audit trail captures every decision. Token budgets enforced top-down. KillSwitch guarantees forced termination. When something fails, you have the full trace of what happened and why — not just an error log.
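A minimal sketch of snapshot-based restoration combined with the three failure strategies named above might look like this. It assumes state is a simple key-value map and that a snapshot is a copy taken before the first attempt. `FailurePolicy`, `Fail`, `Retry`, `CustomLogic`, and `runWithPolicy` are hypothetical names and not TPipe's API.

```kotlin
// Hypothetical sketch: illustrative types, not TPipe's retry API.

sealed interface FailurePolicy
object Fail : FailurePolicy
data class Retry(val maxAttempts: Int) : FailurePolicy
data class CustomLogic(val handler: (Exception) -> String) : FailurePolicy

/**
 * Runs a step against mutable state. After each failed attempt the state is
 * rolled back to the snapshot taken before the first attempt.
 */
fun runWithPolicy(
    state: MutableMap<String, String>,
    policy: FailurePolicy,
    step: (MutableMap<String, String>) -> String,
): String {
    val snapshot = state.toMap() // copy taken before any attempt
    val attempts = if (policy is Retry) policy.maxAttempts else 1
    var last: Exception? = null
    repeat(attempts) {
        try {
            return step(state)
        } catch (e: Exception) {
            last = e
            state.clear()
            state.putAll(snapshot) // restore, so a retry starts clean
        }
    }
    val failure = last ?: IllegalStateException("no attempts were made")
    return if (policy is CustomLogic) policy.handler(failure) else throw failure
}
```

Restoring before every retry means a half-finished attempt cannot leak partial writes into the next one, which is the property the snapshot approach is meant to provide.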