hermes / website /docs /guides /python-library.md
lenson78's picture
initial upload: v2026.3.23 with HF Spaces deployment
9aa5185 verified
---
sidebar_position: 4
title: "Using Hermes as a Python Library"
description: "Embed AIAgent in your own Python scripts, web apps, or automation pipelines β€” no CLI required"
---
# Using Hermes as a Python Library
Hermes isn't just a CLI tool. You can import `AIAgent` directly and use it programmatically in your own Python scripts, web applications, or automation pipelines. This guide shows you how.
---
## Installation
Install Hermes directly from the repository:
```bash
pip install git+https://github.com/NousResearch/hermes-agent.git
```
Or with [uv](https://docs.astral.sh/uv/):
```bash
uv pip install git+https://github.com/NousResearch/hermes-agent.git
```
You can also pin it in your `requirements.txt`:
```text
hermes-agent @ git+https://github.com/NousResearch/hermes-agent.git
```
:::tip
The same environment variables used by the CLI are required when using Hermes as a library. At minimum, set `OPENROUTER_API_KEY` (or `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` if using direct provider access).
:::
---
## Basic Usage
The simplest way to use Hermes is the `chat()` method β€” pass a message, get a string back:
```python
from run_agent import AIAgent
agent = AIAgent(
model="anthropic/claude-sonnet-4",
quiet_mode=True,
)
response = agent.chat("What is the capital of France?")
print(response)
```
`chat()` handles the full conversation loop internally β€” tool calls, retries, everything β€” and returns just the final text response.
:::warning
Always set `quiet_mode=True` when embedding Hermes in your own code. Without it, the agent prints CLI spinners, progress indicators, and other terminal output that will clutter your application's output.
:::
---
## Full Conversation Control
For more control over the conversation, use `run_conversation()` directly. It returns a dictionary with the full response, message history, and metadata:
```python
agent = AIAgent(
model="anthropic/claude-sonnet-4",
quiet_mode=True,
)
result = agent.run_conversation(
user_message="Search for recent Python 3.13 features",
task_id="my-task-1",
)
print(result["final_response"])
print(f"Messages exchanged: {len(result['messages'])}")
```
The returned dictionary contains:
- **`final_response`** β€” The agent's final text reply
- **`messages`** β€” The complete message history (system, user, assistant, tool calls)
- **`task_id`** β€” The task identifier used for VM isolation
You can also pass a custom system message that overrides the ephemeral system prompt for that call:
```python
result = agent.run_conversation(
user_message="Explain quicksort",
system_message="You are a computer science tutor. Use simple analogies.",
)
```
---
## Configuring Tools
Control which toolsets the agent has access to using `enabled_toolsets` or `disabled_toolsets`:
```python
# Only enable web tools (browsing, search)
agent = AIAgent(
model="anthropic/claude-sonnet-4",
enabled_toolsets=["web"],
quiet_mode=True,
)
# Enable everything except terminal access
agent = AIAgent(
model="anthropic/claude-sonnet-4",
disabled_toolsets=["terminal"],
quiet_mode=True,
)
```
:::tip
Use `enabled_toolsets` when you want a minimal, locked-down agent (e.g., only web search for a research bot). Use `disabled_toolsets` when you want most capabilities but need to restrict specific ones (e.g., no terminal access in a shared environment).
:::
---
## Multi-turn Conversations
Maintain conversation state across multiple turns by passing the message history back in:
```python
agent = AIAgent(
model="anthropic/claude-sonnet-4",
quiet_mode=True,
)
# First turn
result1 = agent.run_conversation("My name is Alice")
history = result1["messages"]
# Second turn β€” agent remembers the context
result2 = agent.run_conversation(
"What's my name?",
conversation_history=history,
)
print(result2["final_response"]) # "Your name is Alice."
```
The `conversation_history` parameter accepts the `messages` list from a previous result. The agent copies it internally, so your original list is never mutated.
---
## Saving Trajectories
Enable trajectory saving to capture conversations in ShareGPT format β€” useful for generating training data or debugging:
```python
agent = AIAgent(
model="anthropic/claude-sonnet-4",
save_trajectories=True,
quiet_mode=True,
)
agent.chat("Write a Python function to sort a list")
# Saves to trajectory_samples.jsonl in ShareGPT format
```
Each conversation is appended as a single JSONL line, making it easy to collect datasets from automated runs.
---
## Custom System Prompts
Use `ephemeral_system_prompt` to set a custom system prompt that guides the agent's behavior but is **not** saved to trajectory files (keeping your training data clean):
```python
agent = AIAgent(
model="anthropic/claude-sonnet-4",
ephemeral_system_prompt="You are a SQL expert. Only answer database questions.",
quiet_mode=True,
)
response = agent.chat("How do I write a JOIN query?")
print(response)
```
This is ideal for building specialized agents β€” a code reviewer, a documentation writer, a SQL assistant β€” all using the same underlying tooling.
---
## Batch Processing
For running many prompts in parallel, Hermes includes `batch_runner.py`. It manages concurrent `AIAgent` instances with proper resource isolation:
```bash
python batch_runner.py --input prompts.jsonl --output results.jsonl
```
Each prompt gets its own `task_id` and isolated environment. If you need custom batch logic, you can build your own using `AIAgent` directly:
```python
import concurrent.futures
from run_agent import AIAgent
prompts = [
"Explain recursion",
"What is a hash table?",
"How does garbage collection work?",
]
def process_prompt(prompt):
# Create a fresh agent per task for thread safety
agent = AIAgent(
model="anthropic/claude-sonnet-4",
quiet_mode=True,
skip_memory=True,
)
return agent.chat(prompt)
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
results = list(executor.map(process_prompt, prompts))
for prompt, result in zip(prompts, results):
print(f"Q: {prompt}\nA: {result}\n")
```
:::warning
Always create a **new `AIAgent` instance per thread or task**. The agent maintains internal state (conversation history, tool sessions, iteration counters) that is not thread-safe to share.
:::
---
## Integration Examples
### FastAPI Endpoint
```python
from fastapi import FastAPI
from pydantic import BaseModel
from run_agent import AIAgent
app = FastAPI()
class ChatRequest(BaseModel):
message: str
model: str = "anthropic/claude-sonnet-4"
@app.post("/chat")
async def chat(request: ChatRequest):
agent = AIAgent(
model=request.model,
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
)
response = agent.chat(request.message)
return {"response": response}
```
### Discord Bot
```python
import discord
from run_agent import AIAgent
client = discord.Client(intents=discord.Intents.default())
@client.event
async def on_message(message):
if message.author == client.user:
return
if message.content.startswith("!hermes "):
query = message.content[8:]
agent = AIAgent(
model="anthropic/claude-sonnet-4",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
platform="discord",
)
response = agent.chat(query)
await message.channel.send(response[:2000])
client.run("YOUR_DISCORD_TOKEN")
```
### CI/CD Pipeline Step
```python
#!/usr/bin/env python3
"""CI step: auto-review a PR diff."""
import subprocess
from run_agent import AIAgent
diff = subprocess.check_output(["git", "diff", "main...HEAD"]).decode()
agent = AIAgent(
model="anthropic/claude-sonnet-4",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
disabled_toolsets=["terminal", "browser"],
)
review = agent.chat(
f"Review this PR diff for bugs, security issues, and style problems:\n\n{diff}"
)
print(review)
```
---
## Key Constructor Parameters
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model` | `str` | `"anthropic/claude-opus-4.6"` | Model in OpenRouter format |
| `quiet_mode` | `bool` | `False` | Suppress CLI output |
| `enabled_toolsets` | `List[str]` | `None` | Whitelist specific toolsets |
| `disabled_toolsets` | `List[str]` | `None` | Blacklist specific toolsets |
| `save_trajectories` | `bool` | `False` | Save conversations to JSONL |
| `ephemeral_system_prompt` | `str` | `None` | Custom system prompt (not saved to trajectories) |
| `max_iterations` | `int` | `90` | Max tool-calling iterations per conversation |
| `skip_context_files` | `bool` | `False` | Skip loading AGENTS.md files |
| `skip_memory` | `bool` | `False` | Disable persistent memory read/write |
| `api_key` | `str` | `None` | API key (falls back to env vars) |
| `base_url` | `str` | `None` | Custom API endpoint URL |
| `platform` | `str` | `None` | Platform hint (`"discord"`, `"telegram"`, etc.) |
---
## Important Notes
:::tip
- Set **`skip_context_files=True`** if you don't want `AGENTS.md` files from the working directory loaded into the system prompt.
- Set **`skip_memory=True`** to prevent the agent from reading or writing persistent memory β€” recommended for stateless API endpoints.
- The `platform` parameter (e.g., `"discord"`, `"telegram"`) injects platform-specific formatting hints so the agent adapts its output style.
:::
:::warning
- **Thread safety**: Create one `AIAgent` per thread or task. Never share an instance across concurrent calls.
- **Resource cleanup**: The agent automatically cleans up resources (terminal sessions, browser instances) when a conversation ends. If you're running in a long-lived process, ensure each conversation completes normally.
- **Iteration limits**: The default `max_iterations=90` is generous. For simple Q&A use cases, consider lowering it (e.g., `max_iterations=10`) to prevent runaway tool-calling loops and control costs.
:::