Mirror OpenSkyNet workspace snapshot from Git HEAD

fc93158 verified 9 days ago

4.32 kB

	---
	summary: "Command queue design that serializes inbound auto-reply runs"
	read_when:
	- Changing auto-reply execution or concurrency
	title: "Command Queue"
	---

	# Command Queue (2026-01-16)

	We serialize inbound auto-reply runs (all channels) through a tiny in-process queue to prevent multiple agent runs from colliding, while still allowing safe parallelism across sessions.

	## Why

	- Auto-reply runs can be expensive (LLM calls) and can collide when multiple inbound messages arrive close together.
	- Serializing avoids competing for shared resources (session files, logs, CLI stdin) and reduces the chance of upstream rate limits.

	## How it works

	- A lane-aware FIFO queue drains each lane with a configurable concurrency cap (default 1 for unconfigured lanes; main defaults to 4, subagent to 8).
	- `runEmbeddedPiAgent` enqueues by session key (lane `session:<key>`) to guarantee only one active run per session.
	- Each session run is then queued into a global lane (`main` by default) so overall parallelism is capped by `agents.defaults.maxConcurrent`.
	- When verbose logging is enabled, queued runs emit a short notice if they waited more than ~2s before starting.
	- Typing indicators still fire immediately on enqueue (when supported by the channel) so user experience is unchanged while we wait our turn.

	## Queue modes (per channel)

	Inbound messages can steer the current run, wait for a followup turn, or do both:

	- `steer`: inject immediately into the current run (cancels pending tool calls after the next tool boundary). If not streaming, falls back to followup.
	- `followup`: enqueue for the next agent turn after the current run ends.
	- `collect`: coalesce all queued messages into a single followup turn (default). If messages target different channels/threads, they drain individually to preserve routing.
	- `steer-backlog` (aka `steer+backlog`): steer now and preserve the message for a followup turn.
	- `interrupt` (legacy): abort the active run for that session, then run the newest message.
	- `queue` (legacy alias): same as `steer`.

	Steer-backlog means you can get a followup response after the steered run, so
	streaming surfaces can look like duplicates. Prefer `collect`/`steer` if you want
	one response per inbound message.
	Send `/queue collect` as a standalone command (per-session) or set `messages.queue.byChannel.discord: "collect"`.

	Defaults (when unset in config):

	- All surfaces → `collect`

	Configure globally or per channel via `messages.queue`:

	```json5
	{
	messages: {
	queue: {
	mode: "collect",
	debounceMs: 1000,
	cap: 20,
	drop: "summarize",
	byChannel: { discord: "collect" },
	},
	},
	}
	```

	## Queue options

	Options apply to `followup`, `collect`, and `steer-backlog` (and to `steer` when it falls back to followup):

	- `debounceMs`: wait for quiet before starting a followup turn (prevents “continue, continue”).
	- `cap`: max queued messages per session.
	- `drop`: overflow policy (`old`, `new`, `summarize`).

	Summarize keeps a short bullet list of dropped messages and injects it as a synthetic followup prompt.
	Defaults: `debounceMs: 1000`, `cap: 20`, `drop: summarize`.

	## Per-session overrides

	- Send `/queue <mode>` as a standalone command to store the mode for the current session.
	- Options can be combined: `/queue collect debounce:2s cap:25 drop:summarize`
	- `/queue default` or `/queue reset` clears the session override.

	## Scope and guarantees

	- Applies to auto-reply agent runs across all inbound channels that use the gateway reply pipeline (WhatsApp web, Telegram, Slack, Discord, Signal, iMessage, webchat, etc.).
	- Default lane (`main`) is process-wide for inbound + main heartbeats; set `agents.defaults.maxConcurrent` to allow multiple sessions in parallel.
	- Additional lanes may exist (e.g. `cron`, `subagent`) so background jobs can run in parallel without blocking inbound replies.
	- Per-session lanes guarantee that only one agent run touches a given session at a time.
	- No external dependencies or background worker threads; pure TypeScript + promises.

	## Troubleshooting

	- If commands seem stuck, enable verbose logs and look for “queued for …ms” lines to confirm the queue is draining.
	- If you need queue depth, enable verbose logs and watch for queue timing lines.