---
license: mit
language:
- en
tags:
- agents
- llm
- mcp
- reliability
- agent-stack
- npm
- pypi
- typescript
- python
- anthropic
- openai
- tool-use
library_name: agent-stack
---

# agent-stack

Six small, single-concern reliability libraries for production LLM agents, published independently to **npm**, **PyPI**, and the **Model Context Protocol** registry. Each library is zero-dependency, under 500 LOC, and addresses one specific failure mode that production agent teams have to handle.

## Paper

Backed by a peer-reviewable artifact paper with a DataCite DOI:

- **DOI:** [10.5281/zenodo.20074702](https://doi.org/10.5281/zenodo.20074702)
- **Title:** _Six Reliability Primitives for LLM Agents: An Artifact Pattern for Stackable, Single-Concern Libraries_
- **Status:** Under review at ASE 2026 Tools track.

## The six primitives

| Library | Concern | Failure mode it addresses |
| --- | --- | --- |
| **AgentFit** | Context-window fitting | Token-aware truncation with multiple strategies. Pluggable tokenizers for OpenAI / Anthropic / open models. |
| **AgentGuard** | Network egress allowlisting | Blocks the "agent suddenly POSTs PHI / secrets to attacker.com" failure mode. |
| **AgentSnap** | Snapshot tests for tool-call traces | Catches silent regressions when a model's tool-call shape changes between deploys. |
| **AgentVet** | Tool-arg validation | Throws a `ToolArgError` carrying an LLM-friendly retry hint, so the next turn can self-correct. |
| **AgentCast** | Structured-output validate-and-retry | Bring-your-own-LLM JSON validator + retry loop. |
| **AgentBudget** | Per-run token + dollar caps | Hard cap with hook for early termination. Prevents runaway loops billing $1000 on a single query. |

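To make the AgentVet row concrete, here is a minimal sketch of tool-arg validation that fails with a retry hint the model can read. It does not use the `agentvet` package or reproduce its API: the local `ToolArgError` class, its `retry_hint` attribute, the `validate_args` helper, and the `SEARCH_SCHEMA` format are all assumptions made for illustration.

```python
class ToolArgError(Exception):
    """Local stand-in for illustration; the published class may differ."""

    def __init__(self, message: str, retry_hint: str) -> None:
        super().__init__(message)
        self.retry_hint = retry_hint


# Hypothetical schema format: argument name -> required Python type.
SEARCH_SCHEMA = {"query": str, "limit": int}


def validate_args(args: dict, schema: dict) -> dict:
    """Return args unchanged if they match the schema, else raise ToolArgError."""
    problems = []
    for name, expected in schema.items():
        if name not in args:
            problems.append(f"missing required argument '{name}'")
        elif not isinstance(args[name], expected):
            problems.append(
                f"'{name}' must be {expected.__name__}, "
                f"got {type(args[name]).__name__}"
            )
    for name in args:
        if name not in schema:
            problems.append(f"unexpected argument '{name}'")
    if problems:
        # The hint is phrased as an instruction so it can be sent back to the model.
        hint = "Fix the tool call and try again: " + "; ".join(problems) + "."
        raise ToolArgError("invalid tool arguments", retry_hint=hint)
    return args


# A model passed "limit" as a string instead of an integer.
try:
    validate_args({"query": "mcp servers", "limit": "10"}, SEARCH_SCHEMA)
except ToolArgError as err:
    print(err.retry_hint)
```

In an agent loop, the caught `retry_hint` would typically be appended to the conversation as the tool result, giving the next turn enough information to correct the call.
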
Each ships in three runtime forms: **TypeScript on npm**, **Python on PyPI**, and an **MCP-server variant** callable from Claude Desktop, Cursor, Continue, or any MCP client.

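The per-run cap described in the AgentBudget row of the table can be sketched in a few lines. This is an illustration of the idea only, not the `agentbudget` package's API: `RunBudget`, `BudgetExceeded`, `charge`, and the `on_exceeded` hook are hypothetical names, and the per-call token and dollar figures are made up.

```python
from typing import Callable, Optional


class BudgetExceeded(Exception):
    """Raised when a run crosses its token or dollar cap."""


class RunBudget:
    """Tracks cumulative usage for one agent run (hypothetical helper)."""

    def __init__(
        self,
        max_tokens: int,
        max_usd: float,
        on_exceeded: Optional[Callable[["RunBudget"], None]] = None,
    ) -> None:
        self.max_tokens = max_tokens
        self.max_usd = max_usd
        self.on_exceeded = on_exceeded
        self.tokens = 0
        self.usd = 0.0

    def charge(self, tokens: int, usd: float) -> None:
        """Record one model call and raise once either cap is crossed."""
        self.tokens += tokens
        self.usd += usd
        if self.tokens > self.max_tokens or self.usd > self.max_usd:
            if self.on_exceeded is not None:
                # Hook for early termination work, e.g. logging or saving state.
                self.on_exceeded(self)
            raise BudgetExceeded(
                f"spent {self.tokens} tokens / ${self.usd:.2f}, "
                f"cap is {self.max_tokens} tokens / ${self.max_usd:.2f}"
            )


budget = RunBudget(max_tokens=1_000, max_usd=0.05)
calls_made = 0
try:
    while True:  # stands in for an agent loop that never converges
        budget.charge(tokens=400, usd=0.01)  # made-up usage per call
        calls_made += 1
except BudgetExceeded as err:
    print(f"stopped after {calls_made} calls: {err}")
```

Raising an exception rather than returning a flag means a runaway loop cannot ignore the cap by accident; the hook runs first so the caller can record why the run ended.
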
## Install

### TypeScript (npm)

```bash
npm i @mukundakatta/agentvet @mukundakatta/agentguard @mukundakatta/agentbudget
```

### Python (PyPI)

```bash
pip install agentvet agentguard agentbudget
```

### MCP server (Claude Desktop config)

```json
{
  "mcpServers": {
    "agentvet": { "command": "npx", "args": ["-y", "@mukundakatta/agentvet-mcp"] },
    "agentguard": { "command": "npx", "args": ["-y", "@mukundakatta/agentguard-mcp"] }
  }
}
```

## Source

Umbrella repo: [github.com/MukundaKatta/agent-stack](https://github.com/MukundaKatta/agent-stack)

Per-library repositories (TS + Python + MCP variants): search the GitHub topic [`agent-stack`](https://github.com/search?q=user%3AMukundaKatta+topic%3Aagent-stack) for the full list.

## Why it exists

Reliability concerns for LLM agents are typically bundled into one heavy framework that asks you to adopt prompting, tool routing, and runtime governance as a single dependency. agent-stack inverts that: each concern is a separate library you can adopt à la carte without buying into a programming model.

The artifact paper documents the six primitives, the cross-cutting invariants the design enforces, the trade-offs of single-concern packaging, and the operational questions that emerge when reliability is split across many small dependencies.

## Citation

```bibtex
@misc{katta2026agentstack,
  author    = {Katta, Mukunda Rao},
  title     = {Six Reliability Primitives for LLM Agents:
               An Artifact Pattern for Stackable, Single-Concern Libraries},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.20074702},
  url       = {https://doi.org/10.5281/zenodo.20074702}
}
```

## License

MIT.