Ollama Pipe Class API

Overview

The OllamaPipe class provides a TPipe abstraction for interacting with local Ollama instances. It supports multimodal inputs, streaming, reasoning extraction (specifically for models like DeepSeek-R1), and native tool calling.

`class OllamaPipe : Pipe()`

Public Functions

Server Configuration

`setIP(ip: String): OllamaPipe`

Sets the IP address of the Ollama server. Defaults to 127.0.0.1.

`setPort(port: Int): OllamaPipe`

Sets the port number of the Ollama server. Defaults to 11434.

`init(): Pipe`

Initializes the pipe. Checks whether the Ollama server is running; if it is not, attempts to start it in the background with `ollama serve`.
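
The check-then-start flow described for `init()` can be approximated with plain JVM APIs. The sketch below is not TPipe's implementation: `isPortOpen`, `startOllamaInBackground`, and `ensureOllamaRunning` are hypothetical helpers, and it assumes that an open TCP port is a good enough signal that the server is up and that the `ollama` binary is on the PATH.

```kotlin
import java.io.IOException
import java.net.InetSocketAddress
import java.net.Socket

// Hypothetical helper: true if something accepts TCP connections on host:port.
fun isPortOpen(host: String = "127.0.0.1", port: Int = 11434, timeoutMs: Int = 500): Boolean =
    try {
        Socket().use { socket ->
            socket.connect(InetSocketAddress(host, port), timeoutMs)
            true
        }
    } catch (e: IOException) {
        // Connection refused, timeout, or unreachable host all count as "not running".
        false
    }

// Hypothetical helper: launches `ollama serve` as a child process and discards its output.
// Assumes the `ollama` binary is on the PATH; throws IOException otherwise.
fun startOllamaInBackground(): Process =
    ProcessBuilder("ollama", "serve")
        .redirectErrorStream(true)
        .redirectOutput(ProcessBuilder.Redirect.DISCARD)
        .start()

// Mirrors the documented flow: probe first, start only when nothing is listening.
fun ensureOllamaRunning(host: String = "127.0.0.1", port: Int = 11434) {
    if (!isPortOpen(host, port)) startOllamaInBackground()
}
```

A real implementation would likely also wait for the server to become ready after starting it; that step is left out here.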

API Mode

`useChatApi(): OllamaPipe`

Switches to the modern `/api/chat` endpoint. This is the default and enables conversation history and native tool calling.

`useLegacyApi(): OllamaPipe`

Switches to the legacy `/api/generate` endpoint.
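
To make the two modes concrete, the sketch below builds the request URL each mode would target. The enum and `endpointUrl` are hypothetical names for illustration; only the two endpoint paths and the default address come from this reference, and plain `http` is an assumption.

```kotlin
// Hypothetical model of the two API modes; the paths are the ones named in this reference.
enum class OllamaApiMode(val path: String) {
    CHAT("/api/chat"),
    LEGACY("/api/generate")
}

// Builds the request URL for a server address and mode (plain http is assumed).
fun endpointUrl(
    ip: String = "127.0.0.1",
    port: Int = 11434,
    mode: OllamaApiMode = OllamaApiMode.CHAT
): String = "http://$ip:$port${mode.path}"
```

With the defaults, `endpointUrl()` yields `http://127.0.0.1:11434/api/chat`.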

Inference Settings

`setKeepAlive(duration: String): OllamaPipe`

Sets how long the model stays loaded in memory after a request.

- `5m`: 5 minutes (default)
- `1h`: 1 hour
- `0`: Unload immediately
- `-1`: Keep loaded indefinitely
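
The values above can be read as "seconds to stay loaded". The helper below is a hypothetical illustration of that reading, not part of TPipe; how `OllamaPipe` actually serializes the string is not shown in this reference. It handles only a single number with an optional `s`, `m`, or `h` suffix.

```kotlin
// Hypothetical helper: converts a keep-alive string into seconds.
// Returns -1 for "keep loaded indefinitely" and 0 for "unload immediately".
fun keepAliveSeconds(duration: String): Long {
    val match = Regex("""^(-?\d+)([smh]?)$""").matchEntire(duration.trim())
        ?: throw IllegalArgumentException("Unsupported keep-alive value: $duration")
    val (amount, unit) = match.destructured
    val value = amount.toLong()
    if (value < 0) return -1            // any negative value is treated as "never unload"
    return when (unit) {
        "m" -> value * 60
        "h" -> value * 3600
        else -> value                   // "s" or no suffix: plain seconds
    }
}
```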

`enableThink(): OllamaPipe`

Enables reasoning extraction for models that wrap their reasoning in `<think>` tags (e.g., DeepSeek-R1). The extracted reasoning is stored in `MultimodalContent.modelReasoning`.
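
As a rough illustration of what reasoning extraction involves, the sketch below separates `<think>` blocks from the visible answer using a regular expression. `ThinkSplit` and `splitThink` are hypothetical names; TPipe's own parsing may differ, for example in how it handles streamed or unterminated tags.

```kotlin
// Hypothetical result type; TPipe itself stores reasoning in MultimodalContent.modelReasoning.
data class ThinkSplit(val reasoning: String, val answer: String)

// Separates <think>...</think> blocks from the rest of a model response.
fun splitThink(raw: String): ThinkSplit {
    // DOT_MATCHES_ALL lets the reasoning span several lines; .*? keeps each match minimal.
    val thinkBlock = Regex("<think>(.*?)</think>", RegexOption.DOT_MATCHES_ALL)
    val reasoning = thinkBlock.findAll(raw)
        .joinToString("\n") { it.groupValues[1].trim() }
    val answer = thinkBlock.replace(raw, "").trim()
    return ThinkSplit(reasoning, answer)
}
```

A response without any `<think>` block comes back with an empty `reasoning` string and the original text as `answer`.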

`setMinP(minP: Float): OllamaPipe`

Sets the min-p sampling threshold: the minimum probability a token must have, relative to the most likely token, to remain a sampling candidate.

`setTypicalP(typicalP: Float): OllamaPipe`

Sets the typical-p value used for locally typical sampling.

`setMirostat(mode: Int, eta: Float? = null, tau: Float? = null): OllamaPipe`

Enables and configures Mirostat sampling.

- `mode`: `0` (off), `1` (Mirostat), `2` (Mirostat 2.0)

`setRepeatLastN(n: Int): OllamaPipe`

Sets how many of the most recent tokens are checked when applying the repeat penalty.

`setPenalizeNewline(penalize: Boolean): OllamaPipe`

Controls whether the model is penalized for generating newlines.
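
The sampler setters in this section plausibly end up as entries in the `options` object of an Ollama request. The sketch below shows one way that could look. `samplerOptions` is a hypothetical helper; the snake_case keys are the option names Ollama's REST API has used for these settings, and whether `OllamaPipe` emits exactly these keys is an assumption.

```kotlin
// Hypothetical helper: collects sampler settings under Ollama-style option names.
// Unset (null) values are left out so the server defaults apply.
fun samplerOptions(
    minP: Float? = null,
    typicalP: Float? = null,
    mirostat: Int? = null,
    mirostatEta: Float? = null,
    mirostatTau: Float? = null,
    repeatLastN: Int? = null,
    penalizeNewline: Boolean? = null
): Map<String, Any> {
    // The reference documents only modes 0, 1, and 2.
    require(mirostat == null || mirostat in 0..2) { "mirostat mode must be 0, 1, or 2" }
    val options = linkedMapOf<String, Any>()
    minP?.let { options["min_p"] = it }
    typicalP?.let { options["typical_p"] = it }
    mirostat?.let { options["mirostat"] = it }
    mirostatEta?.let { options["mirostat_eta"] = it }
    mirostatTau?.let { options["mirostat_tau"] = it }
    repeatLastN?.let { options["repeat_last_n"] = it }
    penalizeNewline?.let { options["penalize_newline"] = it }
    return options
}
```

Leaving `eta` and `tau` as `null`, as the defaults in `setMirostat` allow, simply omits them from the map.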

Resource Management

`setGpuSettings(numGpu: Int, mainGpu: Int? = null): OllamaPipe`

Configures GPU offloading.

- `numGpu`: Number of layers to offload to the GPU.
- `mainGpu`: The ID of the main GPU to use.

`setNumThread(numThread: Int): OllamaPipe`

Sets the number of CPU threads to use for generation.

`setBatchSize(batchSize: Int): OllamaPipe`

Sets the prompt processing batch size.

`setNumCtx(numCtx: Int): OllamaPipe`

Sets the model’s context window size (tokens).

`setLowVram(lowVram: Boolean): OllamaPipe`

Enables low VRAM mode for limited hardware.

`setNuma(useNuma: Boolean): OllamaPipe`

Enables NUMA optimization on supported systems.

Streaming

`enableStreaming(callback: (suspend (String) -> Unit)? = null, showReasoning: Boolean = false, streamReasoning: Boolean = true): OllamaPipe`

Enables real-time streaming of responses.

- `callback`: Suspendable function receiving text chunks.
- `showReasoning`: Propagates streaming to reasoning pipes.
- `streamReasoning`: Whether to emit reasoning chunks to the callback.

`setStreamingCallback(callback: suspend (String) -> Unit): OllamaPipe`

Registers a single streaming callback and enables streaming.

`streamingCallbacks(builder: StreamingCallbackBuilder.() -> Unit): OllamaPipe`

Configures multiple streaming callbacks using a DSL.
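
A minimal sketch of how several `suspend (String) -> Unit` callbacks might receive the same stream of chunks. It does not use TPipe's `StreamingCallbackBuilder`, whose DSL is not shown in this reference; `emitChunks` and `runNonSuspending` are hypothetical helpers, and the coroutine is driven with the standard library's `startCoroutine` so that no extra dependency is needed.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.startCoroutine

// Hypothetical stand-in for a streaming source: feeds each chunk to every callback in order.
suspend fun emitChunks(chunks: List<String>, callbacks: List<suspend (String) -> Unit>) {
    for (chunk in chunks) {
        for (callback in callbacks) callback(chunk)
    }
}

// Runs a suspend block on the calling thread and rethrows any failure.
// Only suitable for demos where the block never actually suspends.
fun runNonSuspending(block: suspend () -> Unit) {
    var failure: Throwable? = null
    block.startCoroutine(Continuation(EmptyCoroutineContext) { result ->
        failure = result.exceptionOrNull()
    })
    failure?.let { throw it }
}
```

In application code the callbacks would normally run inside an existing coroutine scope rather than through a helper like `runNonSuspending`.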

Ollama-Specific Options

`OllamaPipe` automatically maps the standard TPipe parameters (`temperature`, `topP`, `topK`, `maxTokens`, `stopSequences`, `repetitionPenalty`) to the equivalent Ollama options. The advanced options listed above give finer control over the local runner environment.
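
The automatic mapping can be pictured as a simple key rename. The table below is an assumption based on Ollama's option names (`num_predict` for the token limit, `stop` for stop sequences, `repeat_penalty` for the repetition penalty); `standardToOllama` and `toOllamaOptions` are hypothetical and not taken from TPipe's source.

```kotlin
// Assumed correspondence between TPipe parameter names and Ollama option keys.
val standardToOllama: Map<String, String> = mapOf(
    "temperature" to "temperature",
    "topP" to "top_p",
    "topK" to "top_k",
    "maxTokens" to "num_predict",
    "stopSequences" to "stop",
    "repetitionPenalty" to "repeat_penalty"
)

// Renames the keys of a settings map; unknown names are rejected rather than passed through.
fun toOllamaOptions(settings: Map<String, Any>): Map<String, Any> =
    settings.mapKeys { (name, _) ->
        standardToOllama[name]
            ?: throw IllegalArgumentException("No Ollama equivalent known for: $name")
    }
```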
