| --- |
| summary: "Streaming + chunking behavior (block replies, channel preview streaming, mode mapping)" |
| read_when: |
| - Explaining how streaming or chunking works on channels |
| - Changing block streaming or channel chunking behavior |
| - Debugging duplicate/early block replies or channel preview streaming |
| title: "Streaming and Chunking" |
| --- |
| |
| # Streaming + chunking |
|
|
| OpenClaw has two separate streaming layers: |
|
|
| - **Block streaming (channels):** emit completed **blocks** as the assistant writes. These are normal channel messages (not token deltas). |
| - **Preview streaming (Telegram/Discord/Slack):** update a temporary **preview message** while generating. |
|
|
| There is **no true token-delta streaming** to channel messages today. Preview streaming is message-based (send + edits/appends). |
|
|
| ## Block streaming (channel messages) |
|
|
| Block streaming sends assistant output in coarse chunks as it becomes available. |
|
|
| ``` |
| Model output |
| ββ text_delta/events |
| ββ (blockStreamingBreak=text_end) |
| β ββ chunker emits blocks as buffer grows |
| ββ (blockStreamingBreak=message_end) |
| ββ chunker flushes at message_end |
| ββ channel send (block replies) |
| ``` |
|
|
| Legend: |
|
|
| - `text_delta/events`: model stream events (may be sparse for non-streaming models). |
| - `chunker`: `EmbeddedBlockChunker` applying min/max bounds + break preference. |
| - `channel send`: actual outbound messages (block replies). |
|
|
| **Controls:** |
|
|
| - `agents.defaults.blockStreamingDefault`: `"on"`/`"off"` (default off). |
| - Channel overrides: `*.blockStreaming` (and per-account variants) to force `"on"`/`"off"` per channel. |
| - `agents.defaults.blockStreamingBreak`: `"text_end"` or `"message_end"`. |
| - `agents.defaults.blockStreamingChunk`: `{ minChars, maxChars, breakPreference? }`. |
| - `agents.defaults.blockStreamingCoalesce`: `{ minChars?, maxChars?, idleMs? }` (merge streamed blocks before send). |
| - Channel hard cap: `*.textChunkLimit` (e.g., `channels.whatsapp.textChunkLimit`). |
| - Channel chunk mode: `*.chunkMode` (`length` default, `newline` splits on blank lines (paragraph boundaries) before length chunking). |
| - Discord soft cap: `channels.discord.maxLinesPerMessage` (default 17) splits tall replies to avoid UI clipping. |
|
|
| **Boundary semantics:** |
|
|
| - `text_end`: stream blocks as soon as chunker emits; flush on each `text_end`. |
| - `message_end`: wait until assistant message finishes, then flush buffered output. |
|
|
| `message_end` still uses the chunker if the buffered text exceeds `maxChars`, so it can emit multiple chunks at the end. |
|
|
| ## Chunking algorithm (low/high bounds) |
|
|
| Block chunking is implemented by `EmbeddedBlockChunker`: |
|
|
| - **Low bound:** donβt emit until buffer >= `minChars` (unless forced). |
| - **High bound:** prefer splits before `maxChars`; if forced, split at `maxChars`. |
| - **Break preference:** `paragraph` β `newline` β `sentence` β `whitespace` β hard break. |
| - **Code fences:** never split inside fences; when forced at `maxChars`, close + reopen the fence to keep Markdown valid. |
|
|
| `maxChars` is clamped to the channel `textChunkLimit`, so you canβt exceed per-channel caps. |
|
|
| ## Coalescing (merge streamed blocks) |
|
|
| When block streaming is enabled, OpenClaw can **merge consecutive block chunks** |
| before sending them out. This reduces βsingle-line spamβ while still providing |
| progressive output. |
|
|
| - Coalescing waits for **idle gaps** (`idleMs`) before flushing. |
| - Buffers are capped by `maxChars` and will flush if they exceed it. |
| - `minChars` prevents tiny fragments from sending until enough text accumulates |
| (final flush always sends remaining text). |
| - Joiner is derived from `blockStreamingChunk.breakPreference` |
| (`paragraph` β `\n\n`, `newline` β `\n`, `sentence` β space). |
| - Channel overrides are available via `*.blockStreamingCoalesce` (including per-account configs). |
| - Default coalesce `minChars` is bumped to 1500 for Signal/Slack/Discord unless overridden. |
|
|
| ## Human-like pacing between blocks |
|
|
| When block streaming is enabled, you can add a **randomized pause** between |
| block replies (after the first block). This makes multi-bubble responses feel |
| more natural. |
|
|
| - Config: `agents.defaults.humanDelay` (override per agent via `agents.list[].humanDelay`). |
| - Modes: `off` (default), `natural` (800β2500ms), `custom` (`minMs`/`maxMs`). |
| - Applies only to **block replies**, not final replies or tool summaries. |
|
|
| ## βStream chunks or everythingβ |
|
|
| This maps to: |
|
|
| - **Stream chunks:** `blockStreamingDefault: "on"` + `blockStreamingBreak: "text_end"` (emit as you go). Non-Telegram channels also need `*.blockStreaming: true`. |
| - **Stream everything at end:** `blockStreamingBreak: "message_end"` (flush once, possibly multiple chunks if very long). |
| - **No block streaming:** `blockStreamingDefault: "off"` (only final reply). |
|
|
| **Channel note:** Block streaming is **off unless** |
| `*.blockStreaming` is explicitly set to `true`. Channels can stream a live preview |
| (`channels.<channel>.streaming`) without block replies. |
|
|
| Config location reminder: the `blockStreaming*` defaults live under |
| `agents.defaults`, not the root config. |
|
|
| ## Preview streaming modes |
|
|
| Canonical key: `channels.<channel>.streaming` |
|
|
| Modes: |
|
|
| - `off`: disable preview streaming. |
| - `partial`: single preview that is replaced with latest text. |
| - `block`: preview updates in chunked/appended steps. |
| - `progress`: progress/status preview during generation, final answer at completion. |
|
|
| ### Channel mapping |
|
|
| | Channel | `off` | `partial` | `block` | `progress` | |
| | -------- | ----- | --------- | ------- | ----------------- | |
| | Telegram | β
| β
| β
| maps to `partial` | |
| | Discord | β
| β
| β
| maps to `partial` | |
| | Slack | β
| β
| β
| β
| |
|
|
| Slack-only: |
|
|
| - `channels.slack.nativeStreaming` toggles Slack native streaming API calls when `streaming=partial` (default: `true`). |
|
|
| Legacy key migration: |
|
|
| - Telegram: `streamMode` + boolean `streaming` auto-migrate to `streaming` enum. |
| - Discord: `streamMode` + boolean `streaming` auto-migrate to `streaming` enum. |
| - Slack: `streamMode` auto-migrates to `streaming` enum; boolean `streaming` auto-migrates to `nativeStreaming`. |
|
|
| ### Runtime behavior |
|
|
| Telegram: |
|
|
| - Uses `sendMessage` + `editMessageText` preview updates across DMs and group/topics. |
| - Preview streaming is skipped when Telegram block streaming is explicitly enabled (to avoid double-streaming). |
| - `/reasoning stream` can write reasoning to preview. |
|
|
| Discord: |
|
|
| - Uses send + edit preview messages. |
| - `block` mode uses draft chunking (`draftChunk`). |
| - Preview streaming is skipped when Discord block streaming is explicitly enabled. |
|
|
| Slack: |
|
|
| - `partial` can use Slack native streaming (`chat.startStream`/`append`/`stop`) when available. |
| - `block` uses append-style draft previews. |
| - `progress` uses status preview text, then final answer. |
|
|