---
sidebar_position: 9
title: "Context Engine Plugins"
description: "How to build a context engine plugin that replaces the built-in ContextCompressor"
---
# Building a Context Engine Plugin
Context engine plugins replace the built-in `ContextCompressor` with an alternative strategy for managing conversation context. For example, a Lossless Context Management (LCM) engine could build a knowledge DAG instead of relying on lossy summarization.
## How it works
The agent's context management is built on the `ContextEngine` ABC (`agent/context_engine.py`). The built-in `ContextCompressor` is the default implementation. Plugin engines must implement the same interface.
Only **one** context engine can be active at a time. Selection is config-driven:
```yaml
# config.yaml
context:
  engine: "compressor"  # default built-in
  # engine: "lcm"       # alternative: activates a plugin engine named "lcm"
```
Plugin engines are **never auto-activated** — the user must explicitly set `context.engine` to the plugin's name.
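As a rough illustration of config-driven selection, the sketch below resolves the engine name from an already-parsed `config.yaml` dict and uses the built-in when the key is absent. `resolve_engine_name` and the `available` set are hypothetical names, and the fallback for unknown names is an assumption; the agent's actual lookup may behave differently.

```python
DEFAULT_ENGINE = "compressor"  # built-in engine name used in the config example


def resolve_engine_name(config: dict, available: set) -> str:
    """Return the engine name to activate (hypothetical helper)."""
    requested = (config.get("context") or {}).get("engine", DEFAULT_ENGINE)
    # Assumption: an unknown plugin name falls back to the built-in engine
    # instead of raising. The real agent may handle this differently.
    if requested != DEFAULT_ENGINE and requested not in available:
        return DEFAULT_ENGINE
    return requested
```

With this sketch, an empty config resolves to `"compressor"`, and `{"context": {"engine": "lcm"}}` resolves to `"lcm"` only when `"lcm"` is among the discovered plugins.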
## Directory structure
Each context engine lives in `plugins/context_engine/<name>/`:
```
plugins/context_engine/lcm/
├── __init__.py # exports the ContextEngine subclass
├── plugin.yaml # metadata (name, description, version)
└── ... # any other modules your engine needs
```
## The ContextEngine ABC
Your engine must implement these **required** methods:
```python
from agent.context_engine import ContextEngine


class LCMEngine(ContextEngine):
    @property
    def name(self) -> str:
        """Short identifier, e.g. 'lcm'. Must match config.yaml value."""
        return "lcm"

    def update_from_response(self, usage: dict) -> None:
        """Called after every LLM call with the usage dict.

        Update self.last_prompt_tokens, self.last_completion_tokens,
        self.last_total_tokens from the response.
        """

    def should_compress(self, prompt_tokens: int = None) -> bool:
        """Return True if compaction should fire this turn."""

    def compress(self, messages: list, current_tokens: int = None) -> list:
        """Compact the message list and return a new (possibly shorter) list.

        The returned list must be a valid OpenAI-format message sequence.
        """
```
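To make the contract concrete, here is a minimal sketch of a toy engine that fills in the required methods. It is not the built-in compressor or a real LCM engine. The `ContextEngine` class below is a local stand-in so the snippet runs on its own; `TailEngine`, `keep_last`, the 80% threshold ratio, and the OpenAI-style `usage` keys are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Optional


class ContextEngine(ABC):
    """Local stand-in for agent.context_engine.ContextEngine (not the real ABC)."""

    last_prompt_tokens: int = 0
    last_completion_tokens: int = 0
    last_total_tokens: int = 0
    threshold_tokens: int = 0
    context_length: int = 0
    compression_count: int = 0

    @property
    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def update_from_response(self, usage: dict) -> None: ...

    @abstractmethod
    def should_compress(self, prompt_tokens: Optional[int] = None) -> bool: ...

    @abstractmethod
    def compress(self, messages: list, current_tokens: Optional[int] = None) -> list: ...


class TailEngine(ContextEngine):
    """Toy engine: keeps system messages plus the most recent turns."""

    def __init__(self, context_length: int, keep_last: int = 4):
        self.context_length = context_length
        # Assumption: compact at 80% of the window; the ratio is illustrative.
        self.threshold_tokens = int(context_length * 0.8)
        self.keep_last = keep_last

    @property
    def name(self) -> str:
        return "tail"

    def update_from_response(self, usage: dict) -> None:
        # Assumption: the usage dict carries OpenAI-style keys.
        self.last_prompt_tokens = usage.get("prompt_tokens", 0)
        self.last_completion_tokens = usage.get("completion_tokens", 0)
        self.last_total_tokens = usage.get("total_tokens", 0)

    def should_compress(self, prompt_tokens: Optional[int] = None) -> bool:
        tokens = self.last_prompt_tokens if prompt_tokens is None else prompt_tokens
        return tokens >= self.threshold_tokens

    def compress(self, messages: list, current_tokens: Optional[int] = None) -> list:
        system = [m for m in messages if m.get("role") == "system"]
        rest = [m for m in messages if m.get("role") != "system"]
        self.compression_count += 1
        # Returns a new list and leaves the caller's list untouched.
        return system + rest[max(0, len(rest) - self.keep_last):]
```

Note that blindly truncating the tail can separate a tool result from the assistant message that requested it. A real engine would need to keep such pairs together so the output stays a valid message sequence.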
### Class attributes your engine must maintain
The agent reads these directly for display and logging:
```python
last_prompt_tokens: int = 0
last_completion_tokens: int = 0
last_total_tokens: int = 0
threshold_tokens: int = 0 # when compression triggers
context_length: int = 0 # model's full context window
compression_count: int = 0 # how many times compress() has run
```
### Optional methods
These have sensible defaults in the ABC. Override as needed:
| Method | Default | Override when |
|--------|---------|--------------|
| `on_session_start(session_id, **kwargs)` | No-op | You need to load persisted state (DAG, DB) |
| `on_session_end(session_id, messages)` | No-op | You need to flush state, close connections |
| `on_session_reset()` | Resets token counters | You have per-session state to clear |
| `update_model(model, context_length, ...)` | Updates context_length + threshold | You need to recalculate budgets on model switch |
| `get_tool_schemas()` | Returns `[]` | Your engine provides agent-callable tools (e.g., `lcm_grep`) |
| `handle_tool_call(name, args, **kwargs)` | Returns error JSON | You implement tool handlers |
| `should_compress_preflight(messages)` | Returns `False` | You can do a cheap pre-API-call estimate |
| `get_status()` | Standard token/threshold dict | You have custom metrics to expose |
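As one example of an optional override, `should_compress_preflight` could use a cheap character-count heuristic instead of a real tokenizer. The sketch below assumes roughly four characters per token, which is only a rule of thumb; `estimate_tokens` and `PreflightMixin` are hypothetical names.

```python
def estimate_tokens(messages: list) -> int:
    """Rough estimate assuming ~4 characters per token (heuristic, not a tokenizer)."""
    total_chars = 0
    for message in messages:
        content = message.get("content")
        # Only plain-string content is counted; other content shapes are ignored.
        if isinstance(content, str):
            total_chars += len(content)
    return total_chars // 4


class PreflightMixin:
    """Hypothetical mixin providing a cheap pre-API-call check."""

    threshold_tokens: int = 0

    def should_compress_preflight(self, messages: list) -> bool:
        return estimate_tokens(messages) >= self.threshold_tokens
```

Because `threshold_tokens` defaults to `0`, this check always returns `True` until your engine sets a real threshold.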
## Engine tools
Context engines can expose tools the agent calls directly. Return schemas from `get_tool_schemas()` and handle calls in `handle_tool_call()`:
```python
import json


# Both functions are methods on your ContextEngine subclass.
def get_tool_schemas(self):
    return [{
        "name": "lcm_grep",
        "description": "Search the context knowledge graph",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"],
        },
    }]


def handle_tool_call(self, name, args, **kwargs):
    if name == "lcm_grep":
        results = self._search_dag(args["query"])
        return json.dumps({"results": results})
    # Unknown tools get a JSON error rather than an exception.
    return json.dumps({"error": f"Unknown tool: {name}"})
```
Engine tools are injected into the agent's tool list at startup and dispatched automatically — no registry registration needed.
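Since tool arguments are generated by the model, a handler may want to validate them before use. The following self-contained sketch shows one way to do that with a toy in-memory store. `ToyToolEngine` and its `notes` list are hypothetical stand-ins for a real engine and its DAG, and the error messages are illustrative.

```python
import json


class ToyToolEngine:
    """Toy stand-in showing the schema/handler pairing; not a real ContextEngine."""

    def __init__(self, notes):
        self._notes = list(notes)  # hypothetical store standing in for the DAG

    def get_tool_schemas(self):
        return [{
            "name": "lcm_grep",
            "description": "Search the context knowledge graph",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        }]

    def handle_tool_call(self, name, args, **kwargs):
        known = {schema["name"] for schema in self.get_tool_schemas()}
        if name not in known:
            return json.dumps({"error": f"Unknown tool: {name}"})
        query = (args or {}).get("query")
        # Return a JSON error for bad arguments instead of raising.
        if not isinstance(query, str) or not query:
            return json.dumps({"error": "Missing required argument: query"})
        hits = [n for n in self._notes if query.lower() in n.lower()]
        return json.dumps({"results": hits})
```

Returning a JSON error string for bad input keeps the handler's return type consistent, so the caller can always parse the result.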
## Registration
### Via directory (recommended)
Place your engine in `plugins/context_engine/<name>/`. The `__init__.py` must export a `ContextEngine` subclass. The discovery system finds and instantiates it automatically.
### Via general plugin system
A general plugin can also register a context engine:
```python
def register(ctx):
    engine = LCMEngine(context_length=200000)
    ctx.register_context_engine(engine)
```
Only one engine can be registered. A second plugin attempting to register is rejected with a warning.
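The single-registration rule can be pictured with a small sketch. `PluginContext` below is a hypothetical stand-in for the `ctx` object, and the boolean return value and log message are assumptions; only the "first wins, later ones are rejected with a warning" behavior comes from the description above.

```python
import logging

logger = logging.getLogger(__name__)


class PluginContext:
    """Hypothetical stand-in for the ctx object passed to register()."""

    def __init__(self):
        self.context_engine = None

    def register_context_engine(self, engine) -> bool:
        """Keep the first engine; reject later ones with a warning."""
        if self.context_engine is not None:
            logger.warning(
                "Context engine %r already registered; rejecting %r",
                getattr(self.context_engine, "name", self.context_engine),
                getattr(engine, "name", engine),
            )
            return False
        self.context_engine = engine
        return True
```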
## Lifecycle
```
1. Engine instantiated (plugin load or directory discovery)
2. on_session_start() — conversation begins
3. update_from_response() — after each API call
4. should_compress() — checked each turn
5. compress() — called when should_compress() returns True
6. on_session_end() — session boundary (CLI exit, /reset, gateway expiry)
```
`on_session_reset()` is called on `/new` or `/reset` to clear per-session state without a full shutdown.
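To check your understanding of the call order, the sketch below drives a recording engine through a short fake session. `RecordingEngine` and `run_session` are hypothetical; the driver only mimics the order listed in the lifecycle and is not the agent's actual loop.

```python
class RecordingEngine:
    """Toy engine that records which lifecycle hooks were called."""

    def __init__(self, threshold_tokens):
        self.threshold_tokens = threshold_tokens
        self.last_prompt_tokens = 0
        self.calls = []

    def on_session_start(self, session_id, **kwargs):
        self.calls.append("on_session_start")

    def update_from_response(self, usage):
        self.calls.append("update_from_response")
        self.last_prompt_tokens = usage.get("prompt_tokens", 0)

    def should_compress(self, prompt_tokens=None):
        self.calls.append("should_compress")
        tokens = self.last_prompt_tokens if prompt_tokens is None else prompt_tokens
        return tokens >= self.threshold_tokens

    def compress(self, messages, current_tokens=None):
        self.calls.append("compress")
        return messages[-1:]

    def on_session_end(self, session_id, messages):
        self.calls.append("on_session_end")


def run_session(engine, usages, session_id="demo"):
    """Hypothetical driver: one update and one check per simulated API call."""
    messages = [{"role": "user", "content": "hello"}]
    engine.on_session_start(session_id)
    for usage in usages:
        engine.update_from_response(usage)
        if engine.should_compress():
            messages = engine.compress(messages)
    engine.on_session_end(session_id, messages)
    return engine.calls
```

Running it with a threshold of 100 and two usage reports (10, then 150 prompt tokens) triggers `compress()` only after the second report.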
## Configuration
Users select your engine via `hermes plugins` → Provider Plugins → Context Engine, or by editing `config.yaml`:
```yaml
context:
  engine: "lcm"  # must match your engine's name property
```
The `compression` config block (`compression.threshold`, `compression.protect_last_n`, etc.) is specific to the built-in `ContextCompressor`. Your engine should define its own config format if needed, reading from `config.yaml` during initialization.
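One possible shape for an engine-specific config is sketched below. It assumes the caller has already parsed `config.yaml` into a dict and that the engine keeps its options under a `context.lcm` block; that location, `LCMSettings`, and both field names are hypothetical and not defined by the plugin system.

```python
from dataclasses import dataclass


@dataclass
class LCMSettings:
    """Hypothetical settings for an 'lcm' engine; field names are illustrative."""

    threshold_ratio: float = 0.8
    max_nodes: int = 1000


def load_lcm_settings(config: dict) -> LCMSettings:
    """Read engine options from an already-parsed config.yaml dict."""
    # Assumption: options live under context.lcm; missing keys use defaults.
    block = (config.get("context") or {}).get("lcm") or {}
    defaults = LCMSettings()
    ratio = float(block.get("threshold_ratio", defaults.threshold_ratio))
    if not 0.0 < ratio <= 1.0:
        raise ValueError(f"threshold_ratio must be in (0, 1], got {ratio}")
    max_nodes = int(block.get("max_nodes", defaults.max_nodes))
    return LCMSettings(threshold_ratio=ratio, max_nodes=max_nodes)
```

Validating values at initialization surfaces a bad config when the engine loads rather than in the middle of a conversation.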
## Testing
```python
from agent.context_engine import ContextEngine

# Replace YourEngine and "your-name" with your engine class and its name.


def test_engine_satisfies_abc():
    engine = YourEngine(context_length=200000)
    assert isinstance(engine, ContextEngine)
    assert engine.name == "your-name"


def test_compress_returns_valid_messages():
    engine = YourEngine(context_length=200000)
    msgs = [{"role": "user", "content": "hello"}]
    result = engine.compress(msgs)
    assert isinstance(result, list)
    assert all("role" in m for m in result)
```
See `tests/agent/test_context_engine.py` for the full ABC contract test suite.
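Checking only for a `role` key is a weak test of "valid OpenAI-format message sequence". A slightly stricter helper for your own tests might look like the sketch below. `find_message_errors` is a hypothetical name, the role set is an assumption about which roles your agent emits, and the check covers only roles and tool-call pairing, not the full format.

```python
# Assumption: only these roles appear in the agent's message lists.
VALID_ROLES = {"system", "user", "assistant", "tool"}


def find_message_errors(messages: list) -> list:
    """Return a list of problems; an empty list means these basic checks passed."""
    errors = []
    open_call_ids = set()
    for index, message in enumerate(messages):
        role = message.get("role")
        if role not in VALID_ROLES:
            errors.append(f"message {index}: unexpected role {role!r}")
        elif role == "assistant":
            # Remember tool calls so later tool results can be matched to them.
            for call in message.get("tool_calls") or []:
                open_call_ids.add(call.get("id"))
        elif role == "tool":
            call_id = message.get("tool_call_id")
            if call_id not in open_call_ids:
                errors.append(f"message {index}: tool result without matching tool call")
            else:
                open_call_ids.discard(call_id)
    return errors
```

In a test, `assert find_message_errors(engine.compress(msgs)) == []` would catch an engine that drops an assistant tool call but keeps its tool result.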
## See also
- [Context Compression and Caching](/docs/developer-guide/context-compression-and-caching) — how the built-in compressor works
- [Memory Provider Plugins](/docs/developer-guide/memory-provider-plugin) — analogous single-select plugin system for memory
- [Plugins](/docs/user-guide/features/plugins) — general plugin system overview