kevindong-intuit committed on
Commit a7aef48 · verified · 1 Parent(s): 3dddcac

Update README.md

Files changed (1):
  1. README.md +145 -4

README.md CHANGED
@@ -6,14 +6,42 @@ datasets:
  - intuit/tool-optimizer-dataset
  base_model:
  - Qwen/Qwen3-4B-Instruct-2507
+ pipeline_tag: text-generation
+ library_name: transformers
+ tags:
+ - agents
+ - tool-use
+ - sft
+ - documentation
+ - text-generation
  ---

- Use this model to improve API tool descriptions for LLM Agents.
-
- For information on how to do inference or training on this model, go to the [Agent Tool Interface Optimizer](https://github.com/intuit-ai-research/tool-optimizer).
-
- SFT prompt (Trace-free), you can use for inference without execution traces.
+ # Agent Tool Optimizer (`intuit/agent-tool-optimizer`)
+
+ `intuit/agent-tool-optimizer` is a **supervised fine-tuned (SFT)** model that rewrites **tool / API descriptions** to be more usable by **LLM agents**. Given a tool name, a parameter schema, and a baseline (often human-written) description, the model produces an improved description that helps an agent:
+
+ - decide **when to use vs. not use** the tool
+ - generate **valid parameters** (required vs. optional, constraints, defaults)
+ - avoid common mistakes and likely validation failures
+
+ This model is trained to work in a **trace-free** setting at inference time (i.e., **no tool execution traces are required**).
+
+ For the accompanying codebase (inference + training), see: [Agent Tool Interface Optimizer](https://github.com/intuit-ai-research/tool-optimizer).
+
+ ---
+
+ ## What problem does this solve?
+
+ Tool interfaces (descriptions + parameter schemas) are the “contract” between agents and tools, but they are typically written for humans. When descriptions under-specify **required parameters**, omit **constraints**, or fail to explain **tool boundaries**, agent performance can plateau, and it can even degrade as the number of available tools grows.
+
+ We study tool interface improvement as a scalable complement to agent fine-tuning and propose **Trace-Free+**: a curriculum-learning approach that transfers knowledge learned from trace-rich training to trace-free inference on unseen tools.
+
+ ---
+
+ ## Recommended prompt (trace-free)
+
+ This is the **canonical inference prompt** used for trace-free tool description generation (also available as `tool_prompt.txt` in the `tool-optimizer` repo).

  ```
  You are an API documentation specialist.

@@ -49,10 +77,123 @@ Output ONLY valid JSON (no markdown, no code blocks):
  {{"description": "<your improved API description here>"}}
  ```
+ ### Inputs
+
+ - **`tool_name`**: the tool/API name
+ - **`parameter_json`**: a JSON string describing the parameter schema (treat this as authoritative)
+ - **`original_description`**: the baseline description you want to improve
+
+ ### Output
+
+ The model is trained to output **only valid JSON** with a single field:
+
+ - **`description`**: the improved tool description (string)
+
+ ---
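The input slots and output contract above can be wired up with a few lines of plain Python. This is a minimal sketch, not part of the `tool-optimizer` library: `PROMPT_TEMPLATE` is an abbreviated stand-in for the full prompt (in practice, load the exact text of `tool_prompt.txt`), and `build_prompt` / `parse_output` are illustrative helpers.

```python
import json

# Abbreviated stand-in for the canonical prompt; load tool_prompt.txt in
# practice. The doubled braces {{...}} survive str.format() as literal
# JSON braces in the output-contract line.
PROMPT_TEMPLATE = (
    "You are an API documentation specialist.\n"
    "Tool name: {tool_name}\n"
    "Parameter schema: {parameter_json}\n"
    "Original description: {original_description}\n"
    "Output ONLY valid JSON (no markdown, no code blocks):\n"
    '{{"description": "<your improved API description here>"}}'
)

def build_prompt(tool_name: str, parameter_json: str, original_description: str) -> str:
    """Fill the three input slots the model was trained on."""
    return PROMPT_TEMPLATE.format(
        tool_name=tool_name,
        parameter_json=parameter_json,
        original_description=original_description,
    )

def parse_output(raw: str) -> str:
    """Enforce the output contract: one JSON object, one string field."""
    obj = json.loads(raw)
    if set(obj) != {"description"} or not isinstance(obj["description"], str):
        raise ValueError(f"unexpected model output: {raw!r}")
    return obj["description"]
```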
+ ## Prompt variation guidance (SFT-sensitive)
+
+ Because this model is supervised fine-tuned to follow a specific prompt and output contract, it can be sensitive to prompt changes. The safest strategy is to treat the prompt as a template and apply only **minimal, well-scoped edits**.
+
+ ### Prompt invariants (do not change)
+
+ - Keep the three input slots exactly: `{tool_name}`, `{parameter_json}`, `{original_description}`
+ - Keep: **“Output ONLY valid JSON (no markdown, no code blocks)”**
+ - Keep the output schema exactly: `{"description": "..."}` (same key name; no extra keys)
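The invariants above can be checked mechanically before deploying a prompt variant. A minimal sketch (`check_invariants` is a hypothetical helper, not part of the `tool-optimizer` library; it only does substring checks, so it catches deletions and renames, not reorderings):

```python
# Snippets that must survive any prompt variation, per the invariants above.
REQUIRED_SNIPPETS = (
    "{tool_name}",
    "{parameter_json}",
    "{original_description}",
    "Output ONLY valid JSON (no markdown, no code blocks)",
)

def check_invariants(prompt_template: str) -> list[str]:
    """Return the invariants a variant violates (empty list = passes)."""
    return [s for s in REQUIRED_SNIPPETS if s not in prompt_template]
```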
+ ### Safe, minimal edits (usually OK)
+
+ - Add 1–3 bullets under **“Infer (do not output)”** to clarify what to pay attention to
+ - Add constraints under **“Write a clear API description that:”** as additional bullets
+ - Add brief reminders about schema authority, parameter-name exactness, or concision
+
+ ### Risky edits (often break JSON / behavior)
+
+ - Reordering or removing the output-format lines
+ - Asking for examples, multi-part outputs, markdown, or extra keys
+ - Changing placeholder names or introducing new “inputs” not present during training
+
+ ### Concrete example: minimal diff that still tends to work
+
+ The prompt below is a conservative variation. It adds clarifications without changing the core structure or output contract:
+
+ ```diff
+ Infer (do not output):
+ - Preserve key lexical tokens from the baseline description that may match user queries
+ - Clarify boundaries if this API could be confused with similar tools
+
+ Write a clear API description that:
+ - Treats the parameter schema as authoritative and does not introduce fields, types, or requirements not defined in it
+ - Explains each parameter's meaning ... while keeping parameter names exactly as defined in the schema
+ - Lists REQUIRED parameters before optional ones
+ - Uses enumerated or candidate values exactly as defined in the schema when applicable
+ - Describes likely validation failures strictly based on schema-defined constraints ...
+ - Keeps the description concise and avoids unnecessary verbosity
+ ```
+ ---
+
+ ## Inference
+
+ ### Option A: Use the `tool-optimizer` library (recommended)
+
+ The open-source repo includes a working CLI that runs this model with either **vLLM** or **Hugging Face Transformers**:
+
+ ```bash
+ git clone https://github.com/intuit-ai-research/tool-optimizer
+ cd tool-optimizer
+
+ # Install (one option)
+ python -m pip install -e .
+
+ # Run inference (vLLM default)
+ python src/agent_tool_optimizer/inference_main.py \
+     --model_name intuit/agent-tool-optimizer \
+     --dataset_id intuit/tool-optimizer-dataset
+ ```
+
+ Notes:
+
+ - `--inference_engine vllm` (default) or `--inference_engine hf`
+ - The dataset is expected to have a `test` split with a `prompt` field.
+ ### Option B: Transformers (direct)
+
+ ```python
+ import json
+
+ import torch
+ from transformers import pipeline
+
+ model_id = "intuit/agent-tool-optimizer"
+ gen = pipeline(
+     "text-generation",
+     model=model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+     trust_remote_code=True,
+ )
+
+ prompt = """<prompt above>"""
+
+ out = gen(
+     [{"role": "user", "content": prompt}],
+     max_new_tokens=512,
+     do_sample=True,
+     temperature=0.6,
+     top_p=0.95,
+     top_k=40,
+     return_full_text=False,
+ )
+ result = out[0]["generated_text"]
+ print(result)
+
+ # Optional: validate that the output satisfies the JSON contract
+ json.loads(result)
+ ```
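The model is trained to emit bare JSON, but sampled generations can occasionally wrap the object in a markdown code fence or stray whitespace despite the instructions. A small defensive parser can make downstream use more robust; `extract_description` below is an illustrative helper under that assumption, not part of any released API:

```python
import json
import re

def extract_description(raw: str) -> str:
    """Best-effort recovery of the `description` field from model output.

    Tries a strict parse first, then falls back to the first JSON
    object found in the text (e.g. if the model wrapped the object in
    a code fence despite the prompt's instructions).
    """
    text = raw.strip()
    try:
        return json.loads(text)["description"]
    except (json.JSONDecodeError, KeyError, TypeError):
        pass
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError(f"no JSON object in model output: {raw!r}")
    return json.loads(match.group(0))["description"]
```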
+
+ ---

- Example (Before vs After)
+ ## Example (Before vs After)

  ![Screenshot 2026-02-20 at 5.23.36 PM](https://cdn-uploads.huggingface.co/production/uploads/65dcb410bda21d181b38321b/dFj0XgXancXD51iyGxC83.png)