Head-to-Head

TPipe vs Koog

Two architectures. TPipe is the agent operating substrate for production headless deployments. Koog is JetBrains' JVM-native graph-based agent framework with Spring AI integration.

Category: Agent Operating Substrate vs JVM Graph Framework
Paradigm: 4th-gen substrate (managed environment) vs 2nd-gen graph (directed state machines)
Memory: ContextBank + LoreBook (deterministic) vs AgentMemory + RAG (probabilistic)
Reasoning: Chain-of-Draft (75% token reduction) vs no custom reasoning mechanism
P2P Architecture: DistributionGrid mesh (no coordinator) vs A2A (client-server hub-and-spoke)
Verdict: 10 of 11 dimensions vs 1 draw, 0 outright wins

Why This Comparison Matters

Koog is the only other production-grade Kotlin/JVM-native AI agent framework shipping today. If you're a JVM team evaluating agent infrastructure, Koog is on your shortlist — and so is TPipe. Both launched in the same 12-month window. Both target enterprises fleeing Python frameworks. Both speak the language your team already uses.

The question is not "which one is easier to start with." The question: are you deploying headless agents that need substrate-level enforcement, or are you building graph-based workflows where Spring AI integration and framework-level access patterns are enough?

TPipe is the agent operating substrate. Persistence, resource governance, protocol enforcement, and explicit flow control (pause/resume/jump/terminate) are infrastructure primitives — not application code. ContextBank persists state across distributed runs. Chain-of-Draft compresses internal reasoning by 75%. DistributionGrid runs a P2P mesh with no coordinator, trust-chained discovery, and 16-hop limits. KillSwitch propagates as an uncaught exception through the entire container hierarchy. TPipe runs on JVM bytecode (default) or compiles to a GraalVM Native shared library for iOS, Android, embedded, and edge targets — both runtimes supported.

Koog is a graph-based framework. As of 1.0 (May 27, 2026) it ships four agent types — basic, functional, graph-based, and Planner (beta). It is well-engineered, has JetBrains' distribution channel, and integrates with Spring AI for the Spring ecosystem. What Koog 1.0 added is operational maturity: a stable/beta module split with a one-year breaking-change guarantee, multiplatform OpenTelemetry to Langfuse/Weave/DataDog, Anthropic prompt caching, LiteRT on Android, redesigned Java interop, and a decoupled HTTP transport. None of that closes the architectural gap. Koog remains a framework you call into; TPipe is the substrate your agents inhabit.

Ten of eleven dimensions go to TPipe. One is a Draw — Paradigm, where substrate and graph framework are different design centers with different answers. Koog wins zero outright. Here is the structural breakdown.

Architecture Comparison

For each capability below: TPipe first, then Koog.
Paradigm

What it actually is

Agent Operating Substrate

Managed environment the LLM runs inside. Persistence, governance, flow control, and protocol enforcement are infrastructure primitives.

Graph Framework

Directed state machines with nodes and edges. Basic, functional, graph, and Planner (beta) agent types. Spring AI as the model layer above.

Memory Model

How state persists

ContextBank + LoreBook: a 3-tier architecture. ContextWindow is per-run, ContextBank is a global thread-safe singleton, and LoreBook provides weighted, keyword-triggered recall with substring matching. Deterministic. Auditable. Reproducible. When a production incident occurs, "the keyword didn't match" is debuggable; RAG's "the similarity threshold was calibrated incorrectly" is not. Production-validated in Autogenesis: 120+ turn tasks survived continuous operation across hundreds of millions of tokens without drift.

AgentMemory + RAG — hierarchical organization (subjects and scopes), encrypted storage, Chat/Long-Term split, plus RAG for vector similarity search. History compression in 5 strategies (NoCompression, WholeHistory, ChunkedHistoryCompression, etc.). Memory is probabilistic at retrieval time — vector similarity introduces a calibration step the operator must tune.

Reasoning Optimization

How you compress what the LLM thinks

8 reasoning methods via ReasoningBuilder: StructuredCoT (analyze→plan→execute→validate), ExplicitCoT (transparent step-by-step), processFocusedCoT (methodological justification), BestIdea, ComprehensivePlan, RolePlay, ChainOfDraft, and SemanticDecompression. 5 injectors: SystemPrompt, BeforeUserPrompt, AfterUserPrompt, BeforeUserPromptWithConverse, AsContext. Multi-round Blind and Merge round modes with focus points. Chain-of-Draft compresses verbose Chain-of-Thought into minimal internal drafts (`[factor1, factor2] → conclusion` format, under 20 tokens). Reasoning is never exposed in output; it lives in the model's compressed internal state. 75% token reduction and 78% latency decrease, academically backed (arXiv 2502.18600, Zoom). Production benchmarks: Financial Analysis 76%, Code Review 78%, Document Classification 75%, Customer Support Triage 75%.

Anthropic prompt caching (1.0) — reuses already-billed prompt prefixes. History compression — 5 strategies for managing already-generated content. These are not reasoning optimizations — they are transport and storage optimizations. Prompt caching reduces cost on cached prefix reads (~90% per Anthropic); it does not compress the model's internal reasoning. History compression operates on prior output, not on the reasoning process. Different axis from Chain-of-Draft.

P2P Architecture

How agents discover and route to each other

DistributionGrid: a P2P mesh with no coordinator. One instance = one node. Each node has a router role and a local worker role. Tasks are exchanged over `Transport.Tpipe`, `Transport.Http`, and `Transport.Stdio` interchangeably. P2PRegistry offers SHARED/ISOLATED concurrency modes, and agents advertise capabilities via P2PDescriptor. P2PRequirements enforces security requirements at the node boundary, opaque to callers. Trust-chained discovery. Envelope-based RPC with cycle detection and a 16-hop limit. Lifecycle hooks at every routing stage. KillSwitch propagates through call chains, accumulating costs from root agent down.

A2A (Agent2Agent Protocol) — standardized HTTP/JSON protocol for agent-to-agent communication across platforms and clouds. Agents register with an A2A server, clients discover via the protocol, tasks delegate through it. Hub-and-spoke topology: the A2A server is the coordinator. Solves cross-framework agent communication. Does not solve fault-tolerant mesh scaling without a coordinator.
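The DistributionGrid description above names two routing guards: cycle detection and a 16-hop limit. The following is a minimal Kotlin sketch of how such guards could work, assuming an envelope that carries the ordered list of node ids it has already visited. `Envelope`, `RouteResult`, `route`, and `MAX_HOPS` are hypothetical names chosen for illustration; they are not TPipe's API.

```kotlin
// Hypothetical envelope type; field and function names are illustrative only.
data class Envelope(
    val taskId: String,
    val visited: List<String> = emptyList(), // node ids already traversed, in order
)

// Hop limit quoted in the comparison above.
val MAX_HOPS = 16

sealed interface RouteResult {
    data class Forwarded(val envelope: Envelope) : RouteResult
    data class Rejected(val reason: String) : RouteResult
}

// Decide whether `nodeId` may accept the envelope, and record the hop if it does.
fun route(envelope: Envelope, nodeId: String): RouteResult = when {
    // A node that already appears in the path would create a routing loop.
    nodeId in envelope.visited ->
        RouteResult.Rejected("cycle detected at $nodeId")
    // The envelope has already used its full hop allowance.
    envelope.visited.size >= MAX_HOPS ->
        RouteResult.Rejected("hop limit of $MAX_HOPS reached")
    else ->
        RouteResult.Forwarded(envelope.copy(visited = envelope.visited + nodeId))
}
```

Because the path travels inside the envelope, each node can make this decision locally, which is consistent with a mesh that has no coordinator to consult.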

Multi-Agent

Coordination topologies

Three distinct patterns. Manifold — state-machine manager-worker orchestration, shared ConverseHistory, summary pipelines, init functions, validation hooks. Junction — democratic discussion/voting harness with 6 role-based workflow recipes (VOTE_PLAN_OUTPUT_EXIT, PLAN_VOTE_ADJUST_OUTPUT_EXIT, VOTE_ACT_VERIFY_REPEAT, ACT_VOTE_VERIFY_REPEAT, VOTE_PLAN_ACT_VERIFY_REPEAT, PLAN_VOTE_ACT_VERIFY_REPEAT), 7 binding kinds (MODERATOR, PARTICIPANT, PLANNER, ACTOR, VERIFIER, ADJUSTER, OUTPUT), and 3 DiscussionStrategies (SIMULTANEOUS, ROUND_ROBIN, CONVERSATIONAL). DistributionGrid — P2P grid-harness for distributed node clusters.

Subgraphs — composable agent architectures, more primitive than TPipe's containers. A2A protocol — cross-platform/cross-cloud communication. Agent-as-tool — dynamic agent creation within tool call functions. Planner agents (1.0 beta) — iteratively build and execute a plan until state matches desired conditions. No voting harness — no JVM framework ships Junction's role-based recipe + moderator-intervention democratic decision-making.

Safety / Governance

What happens when something goes wrong

KillSwitch: an emergency halt when token limits are exceeded. `KillSwitchException` propagates as an uncaught exception, bypassing all retry policies and exception handlers. Works at every container level with automatic child propagation. Accumulates costs from root agent down. 7 DITL hook points: Pre-Init, Pre-Validation, Pre-Invoke, Post-Generate, Validator, Transformation, On-Failure. Token Budgeting enforced. Loop Limits configurable.

Checkpoint/restore — state machine save/restore, with a known issue: `ctx.storage` state is NOT restored on checkpoint restore (GitHub issue #1944, open since May 2026, status unverified in 1.0 release notes). Trust Layers — token-based authorization, data encryption, human-in-the-loop validation. Agent Events — tracks lifecycle changes, tool calls, LLM requests. 1.0 stability commitment: "no breaking changes for stable modules for at least one year" — applies to API stability, not functional bug status.

Tool Calling

How functions execute and validate

PCP (Pipe Context Protocol) — multi-language sandbox with internal execution. Kotlin (~10-50ms startup, native JVM speed, shared memory), JavaScript (~50-200ms startup, Node.js isolated process), Python (external process), Native functions. Security managers per language: `KotlinSecurityManager` validates imports, packages, reflection, ClassLoader, system access. `JavaScriptSecurityManager` checks `require()`, `eval`, file/network patterns. AST validation before execution. Whitelist enforcement via `PcpFunctionHandler`.

Type-safe tools with automatic schema generation. MCP (Model Context Protocol) integration for external tool servers. Class-based tools with registries. Annotation-based tools (with known issue #798: code silently stops). Built-in tools. MCP connects to external tool servers — it does not run tools inside the agent's execution environment. PCP and MCP solve different problems.

Observability

How you see what's happening

TraceServer — self-contained Kotlin module, single standalone JAR or embedded mode. REST API for trace submission and retrieval. WebSocket endpoint for live streaming to browser dashboards. Dual authentication: separate bearer tokens for agents and dashboard clients. In-memory storage — no external database, no infrastructure dependencies. Built-in HTML dashboard with search, filtering, live updates. Cost: $0. Included with all TPipe tiers. No subscription, no per-unit fees, no cloud required.

Agent Events — lifecycle, tool calls, LLM requests. Multiplatform OpenTelemetry (1.0) — Langfuse, Weave, DataDog export on every Koog target (JVM, Android, iOS, JS, WasmJS) via a Ktor-based OTLP/JSON exporter. Langfuse integration — Cloud Core $29/mo, Pro $199/mo, Enterprise $2,499/mo. W&B Weave — $2,100/mo base commitment plus usage. DataDog — separate SaaS subscription. Self-hosted Langfuse is MIT but requires PostgreSQL, Redis, S3-compatible storage.

Language/Runtime

JVM-first vs KMP-first

JVM-first, with GraalVM Native as an optional AOT compilation target. Kotlin / JVM (Java 24+) is the default runtime. GraalVM Native Image compiles TPipe to a ~50MB native shared library (.so/.dylib) for iOS, Android, ARM, embedded systems, and edge devices. Both runtimes supported. The substrate also uses JVM-specific features that KMP common code cannot expose: ClassLoader isolation, full java.lang.reflect reflection, JVMTI for runtime inspection, java.lang.invoke Method Handles, StackWalker for stack introspection, JDK Flight Recorder for production telemetry, JNI for native code integration, HotSpot-specific optimizations.

KMP-first, JVM-second. Kotlin Multiplatform compiles common code to JVM, Android, iOS, JS, WasmJS. The cross-platform API surface lives in commonMain and cannot use JVM-specific features (ClassLoader, JVMTI, GraalVM Native, Method Handles, StackWalker, JFR, JNI) — those would only work in jvmMain source set, fragmenting the codebase per target. Koog 1.0 advertises JVM as a target because the market is pointing at JVM, but the architecture is KMP-first. Trade-off: cross-platform code sharing, no JVM-specific power in the common API.

Multiplatform Mobile

Where the agent runs

JVM bytecode (default) or GraalVM Native Image. Default: `java -jar TPipe-*.jar` on Java 24+, which runs anywhere a compliant JVM is available (server, container, dev). Optional: GraalVM Native Image compiles to a ~50MB native shared library (.so/.dylib) for iOS, Android, ARM, embedded systems, and edge devices. Millisecond cold start, closed-world AOT. Trade-off: no JS/WasmJS targets; mobile via native compilation, not JVM bytecode.

KMP compiles a single Kotlin codebase to JVM, Android, iOS, JS, and WasmJS, with cross-platform code sharing. LiteRT on Android (1.0): on-device model inference via LiteRT (formerly TensorFlow Lite). Trade-off: a broader target surface with a different backend per target (Kotlin/JVM on JVM and Android, Kotlin/Native on iOS, Kotlin/JS and Kotlin/Wasm for web), and a shared API limited to what commonMain can express.

Pricing / TCO

Total cost at production scale

Manifold $7,500/yr, all-inclusive. TraceServer, KillSwitch, PCP, all container types, all 8 reasoning methods, all P2P transports, both JVM and GraalVM Native runtimes — all included. No per-seat pricing, no usage-based fees, no SaaS subscriptions required. Pipe and Pipeline tiers available for free / development use.

Framework is free (Apache 2.0). Langfuse Core $29/mo (100k units/mo + $8/100k additional). Langfuse Pro $199/mo (500k units/mo). Langfuse Enterprise $2,499/mo. W&B Weave $2,100/mo base. DataDog separate subscription. At commercial scale — millions of agent traces per month — Langfuse Enterprise alone exceeds the cost of TPipe Manifold. Koog's "free" framework is not free at production scale.
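The annual comparison behind that claim can be checked directly from the list prices quoted above. This is a small Kotlin sketch of the arithmetic only; the variable and function names are illustrative, and it does not model usage-based overage or DataDog, whose price is not stated here.

```kotlin
// List prices quoted in the comparison above, in USD.
val manifoldPerYear = 7_500
val langfuseEnterprisePerMonth = 2_499
val weaveBasePerMonth = 2_100

// Convert a monthly subscription to its annual cost.
fun annual(perMonth: Int): Int = perMonth * 12

val langfuseEnterprisePerYear = annual(langfuseEnterprisePerMonth) // 29,988
val weaveBasePerYear = annual(weaveBasePerMonth)                   // 25,200

// How many times the Manifold tier fits into one year of Langfuse Enterprise.
val enterpriseToManifoldRatio = langfuseEnterprisePerYear.toDouble() / manifoldPerYear
```

On these figures, a year of Langfuse Enterprise costs roughly four times the Manifold tier. The comparison only applies to deployments that actually need the Enterprise tier.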

When to Choose TPipe

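TPipe is the right choice when you are deploying headless-first production agents that need deterministic memory, P2P coordination without a coordinator, and enforced governance.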

When to Choose Koog

Koog is the right choice if you are already locked into the Spring or JetBrains ecosystem and need Spring AI integration, the framework-level access pattern, and JetBrains' distribution channel. The 1.0 stability commitment — no breaking changes for stable modules for at least one year — is a real operational maturity signal for enterprise teams that have been burned by LangChain's release velocity.

Beyond that, the structural limits show. The KMP-first architecture is the binding constraint: the cross-platform API surface in commonMain cannot use JVM-specific features (ClassLoader, full reflection, JVMTI, Method Handles, StackWalker, JFR, JNI, GraalVM Native) without forking the codebase per target. Memory is probabilistic at retrieval, not deterministic. A2A solves cross-framework communication, not fault-tolerant mesh scaling without a coordinator. The production observability stack requires SaaS subscriptions that, at scale, exceed TPipe's all-inclusive Manifold tier. The competitive question is whether you want framework-level composition across multiple targets (Koog's design center) or substrate-level enforcement on the JVM (TPipe's design center). Different categories; different answers.

Adopting TPipe for Production

The shift is architectural, not syntactic. Koog's graph strategies and Spring AI orchestration don't translate to TPipe pipelines line-by-line. TPipe is a substrate; adopting it means inhabiting a different runtime with enforcement at every layer.

1. Adopt ContextBank for persistent distributed state

Koog's AgentMemory organizes by subjects and scopes, with RAG for vector similarity search. ContextBank persists across runs, across distributed nodes, with weighted LoreBook injection and substring-triggered activation. State you were managing in AgentMemory scopes becomes a ContextBank entry with deterministic retrieval. RAG's vector similarity becomes LoreBook's keyword triggers — auditable, debuggable, production-incident-tractable.
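To make "deterministic retrieval" concrete, here is a minimal Kotlin sketch of weighted, substring-triggered recall. It assumes each entry carries a list of trigger keywords and an integer weight. `LoreEntry` and `recall` are hypothetical names for illustration and do not reflect TPipe's actual LoreBook API.

```kotlin
// Hypothetical model of a keyword-triggered lore entry; not TPipe's LoreBook API.
data class LoreEntry(
    val keywords: List<String>, // substrings that activate this entry
    val weight: Int,            // higher weight is injected first
    val text: String,
)

// Return the entries whose keywords occur in the prompt, ordered by weight.
// The same prompt and entries always yield the same list: no embeddings, no threshold.
fun recall(prompt: String, entries: List<LoreEntry>, limit: Int = 3): List<LoreEntry> {
    val haystack = prompt.lowercase()
    return entries
        .filter { entry -> entry.keywords.any { it.lowercase() in haystack } }
        // Ties on weight are broken by text so the order stays stable.
        .sortedWith(compareByDescending<LoreEntry> { it.weight }.thenBy { it.text })
        .take(limit)
}
```

When an entry fails to fire, the cause is visible by inspection: either a keyword is a substring of the prompt or it is not. That is the debuggability property the comparison attributes to keyword triggers.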

2. Adopt Pipeline for declarative flow control

Koog's strategy graphs compose nodes and edges. TPipe Pipelines chain `Pipe` subclasses with declarative pause/resume/jump at validation boundaries. `pauseBeforePipes()`, `pauseAfterPipes()`, `pauseOnCompletion()`, and `pauseWhen` are first-class substrate primitives. The mental model is a state machine with enforcement points, not a graph you wire by hand.
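The idea of declaring a pause point, rather than coding it into each step, can be sketched in a few lines of Kotlin. Everything below is a hypothetical stand-in: the `Pipe`, `Pipeline`, and `Outcome` types, the `process` and `resume` methods, and the signature of `pauseBeforePipes` are assumptions made for illustration, and TPipe's real classes will differ.

```kotlin
// Hypothetical stand-ins for illustration; TPipe's real Pipe and Pipeline differ.
abstract class Pipe(val name: String) {
    abstract fun process(input: String): String
}

class Upper : Pipe("upper") { override fun process(input: String) = input.uppercase() }
class Exclaim : Pipe("exclaim") { override fun process(input: String) = "$input!" }

sealed interface Outcome {
    data class Completed(val output: String) : Outcome
    data class Paused(val nextIndex: Int, val state: String) : Outcome
}

class Pipeline(private val pipes: List<Pipe>) {
    private val pauseBefore = mutableSetOf<String>()

    // Declare pause points by pipe name; the pipes themselves stay unchanged.
    fun pauseBeforePipes(vararg names: String) = apply { pauseBefore.addAll(names) }

    fun run(input: String): Outcome = execute(input, from = 0, resumed = false)

    // Continue from a pause without stopping again at the same pause point.
    fun resume(paused: Outcome.Paused): Outcome =
        execute(paused.state, from = paused.nextIndex, resumed = true)

    private fun execute(input: String, from: Int, resumed: Boolean): Outcome {
        var state = input
        for (i in from until pipes.size) {
            val isResumePoint = resumed && i == from
            if (!isResumePoint && pipes[i].name in pauseBefore) return Outcome.Paused(i, state)
            state = pipes[i].process(state)
        }
        return Outcome.Completed(state)
    }
}
```

In this sketch, `Pipeline(listOf(Upper(), Exclaim())).pauseBeforePipes("exclaim").run("ok")` stops before the second pipe with the intermediate state preserved, and `resume` carries it to completion. The flow-control decision lives in the pipeline declaration, not in either pipe.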

3. Adopt Chain-of-Draft for token-cost discipline

Koog's Anthropic prompt caching is a transport optimization for repeated prompt prefixes. Chain-of-Draft is a reasoning optimization that compresses internal reasoning steps into minimal drafts. They operate on different axes. At commercial scale — where 10 million LLM calls per day is the unit of measurement — Chain-of-Draft's 75% token reduction is a structural cost advantage that prompt caching does not address.
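A back-of-the-envelope calculation shows what a 75% reduction means at that volume. In the Kotlin sketch below, only the 10 million calls per day and the 75% figure come from the text; the reasoning tokens per call and the price per million tokens are assumptions to be replaced with measured values. `DailyCost` and `dailyReasoningCost` are illustrative names.

```kotlin
// Daily reasoning-token volume and its cost in USD.
data class DailyCost(val tokens: Long, val usd: Double)

fun dailyReasoningCost(
    callsPerDay: Long,
    reasoningTokensPerCall: Long, // assumed average; measure your own workload
    usdPerMillionTokens: Double,  // assumed output-token price
    reduction: Double = 0.0,      // fraction of reasoning tokens removed
): DailyCost {
    require(reduction in 0.0..1.0) { "reduction must be a fraction between 0 and 1" }
    val tokens = (callsPerDay * reasoningTokensPerCall * (1.0 - reduction)).toLong()
    return DailyCost(tokens, tokens / 1_000_000.0 * usdPerMillionTokens)
}
```

With an assumed 200 reasoning tokens per call and an assumed $10 per million output tokens, 10 million calls per day come to 2 billion tokens ($20,000) uncompressed and 500 million tokens ($5,000) at a 75% reduction. The saving scales linearly with whatever per-call token count and price apply to your workload.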

4. Adopt Manifold, Junction, and DistributionGrid for multi-agent coordination

Koog has subgraphs, A2A protocol, Agent-as-tool, and Planner agents (beta). TPipe provides three distinct multi-agent patterns: Manifold for state-machine manager-worker orchestration, Junction for democratic voting workflows, DistributionGrid for cluster-wide P2P. A2A solves cross-framework communication; DistributionGrid solves fault-tolerant mesh scaling without a coordinator. Junction's role-based workflow recipes + moderator intervention have no JVM equivalent.

5. Adopt TraceServer for self-hosted observability

Koog's observability stack exports to Langfuse, W&B Weave, or DataDog. Their hosted tiers carry subscription or per-unit billing that scales with production volume, and self-hosted Langfuse requires PostgreSQL, Redis, and S3-compatible storage. TraceServer is self-hosted observability built into TPipe: WebSocket streaming, replayable traces, dual authentication, no subscription, no per-unit fees, no data leaves your infrastructure. At commercial scale, TPipe Manifold is cheaper than Langfuse Enterprise alone.

6. Adopt KillSwitch for forced termination

Koog's checkpoint/restore has an open issue (#1944, status unverified in the 1.0 release notes) where `ctx.storage` is not restored. KillSwitch fires as an uncaught exception when accumulated tokens exceed a configured cap, so retry policies and exception handlers cannot absorb it. Set KillSwitch on a container and it propagates down through the entire container hierarchy. This is the governance model enterprise deployments require.
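One way to picture a halt that retry logic cannot absorb is a retry wrapper that retries ordinary failures but always rethrows the kill switch. The Kotlin sketch below is an assumption about how such behavior could be built, not TPipe's implementation: `TokenBudget` and `withRetry` are hypothetical, and the `KillSwitchException` class defined here only borrows the name used in the text.

```kotlin
// Hypothetical sketch; these classes are illustrative, not TPipe's implementation.
class KillSwitchException(val tokensUsed: Long, val cap: Long) :
    RuntimeException("token cap $cap exceeded: $tokensUsed used")

// One budget shared by a container and all of its children.
class TokenBudget(private val cap: Long) {
    var used = 0L
        private set

    // Accumulate usage and halt as soon as the cap is crossed.
    fun charge(tokens: Long) {
        used += tokens
        if (used > cap) throw KillSwitchException(used, cap)
    }
}

// A retry policy that retries ordinary failures but never absorbs the kill switch.
fun <T> withRetry(attempts: Int, block: () -> T): T {
    var last: Exception? = null
    repeat(attempts) {
        try {
            return block()
        } catch (e: KillSwitchException) {
            throw e // rethrow immediately so the halt reaches the root container
        } catch (e: Exception) {
            last = e // ordinary failure: remember it and try again
        }
    }
    throw last ?: IllegalStateException("attempts must be positive")
}
```

Because the budget object is shared, tokens spent on failed attempts still count toward the cap. A flaky step that keeps retrying will eventually trip the switch instead of looping at cost.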

Frequently Asked Questions

Is Koog easier to learn than TPipe?

Different learning curve, not easier. Koog is a graph framework; if you have built with LangGraph or Spring AI, the strategy graph pattern is familiar. TPipe is infrastructure your agents inhabit — if you arrive expecting to translate graph nodes to TPipe pipes 1:1, the mental model will resist. If you arrive with a clear picture of what production agent infrastructure needs (headless operation, deterministic memory, P2P coordination, enforced governance), the substrate concepts click fast. The documentation assumes you have built with graph frameworks and want to understand what TPipe provides beyond them.

Can I use Koog and TPipe together?

No. They are architecturally different — substrate versus framework, JVM bytecode (or GraalVM Native for native targets) versus KMP common code, enforced governance versus advisory policies, DistributionGrid P2P mesh versus A2A hub-and-spoke. Composing them at the integration boundary creates accidental complexity. Pick the one that fits your deployment target. Headless-first production agents: TPipe. Kotlin workflows with Spring AI integration: Koog.

Does Koog 1.0's Anthropic prompt caching match Chain-of-Draft?

No. They solve different problems. Anthropic prompt caching reuses already-billed prompt prefixes — same number of output tokens, applies only to Anthropic models with caching enabled, reduces cost on cached prefix reads. Chain-of-Draft compresses internal reasoning steps into minimal drafts — fewer output tokens per reasoning call, applies to every model, 75% token reduction. Prompt caching is a transport optimization; Chain-of-Draft is a reasoning optimization. Different axes.

What about Koog's Spring AI integration?

It is a real production-readiness signal. Spring AI is the model layer, with Koog as orchestration above it. For Spring ecosystem teams, this is a low-friction on-ramp to JVM agent development. TPipe does not target Spring AI integration; it targets headless-first production infrastructure. Different deployment targets. If your team is Spring-first and your use case is framework-level agent composition, Koog fits. If your team is infrastructure-first and your use case is headless server or container deployment, TPipe fits.

How does TPipe handle Koog's GraphStrategy limitations?

Graph strategies model workflows as directed state machines with nodes and edges. TPipe Pipelines model workflows as chains of `Pipe` subclasses with declarative pause/resume/jump at validation boundaries. The architectural difference: graph strategies require explicit programming of conditional edges; TPipe pipelines provide infrastructure primitives for flow control that you declare, not implement. Junction's role-based recipes and DistributionGrid's P2P routing have no graph-strategy equivalent.

Does TPipe require GraalVM Native Image for production?

No. TPipe supports both JVM bytecode (default, Java 24+) and GraalVM Native Image (optional AOT target for iOS/Android/embedded/edge). Run `java -jar TPipe-*.jar` on any compliant JVM, or compile to a ~50MB native shared library for native targets. Both runtimes are production-supported. Koog's Kotlin Multiplatform reaches mobile and web targets through shared common code compiled per platform, not through a GraalVM native shared library.

What does KMP-first mean for Koog's JVM capabilities?

It means Koog is a Kotlin Multiplatform framework with JVM as one of several compile targets (JVM, Android, iOS, JS, WasmJS). The cross-platform API surface lives in commonMain and cannot use JVM-specific features — ClassLoader, full java.lang.reflect reflection, JVMTI, java.lang.invoke Method Handles, StackWalker, JDK Flight Recorder, JNI, GraalVM Native Image. Those would only work in jvmMain, fragmenting the codebase per target. Koog 1.0 advertises JVM as a target because the market is pointing at JVM, but the architecture is KMP-first. TPipe is JVM-first: the substrate uses JVM-specific features directly because JVM is the design center, not a target among many. On the JVM target, TPipe has access to power Koog cannot replicate in commonMain.

See Also