Zhu Jiajun (jz28583)
# `agents/`: third-party harness integrations
Wraps external agent harnesses so they can be pointed at a GraphTestbed task
and produce a `submission.csv` the scoring API understands. LLM traffic is
routed through one local [CLIProxyAPI](https://github.com/router-for-me/CLIProxyAPI)
instance via the [`agents.cliproxyapi`](cliproxyapi/README.md) shim.
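As a rough illustration of what routing an SDK through the proxy involves, the sketch below builds the environment variables that the official Anthropic and OpenAI Python SDKs read for their base URL and API key. The helper name `proxy_env_sketch`, the default address `http://127.0.0.1:8317`, and the `/v1` suffix are assumptions made for this example; the actual helpers live in `agents.cliproxyapi` and may behave differently.

```python
import os


def proxy_env_sketch(sdk: str, base_url: str = "http://127.0.0.1:8317") -> dict[str, str]:
    """Return env vars that point an SDK at a local proxy.

    Hypothetical helper for illustration only; the address and port are
    assumptions, so check your own proxy config for the real values.
    """
    key = os.environ.get("CLIPROXYAPI_KEY", "")
    if not key:
        raise RuntimeError("CLIPROXYAPI_KEY is not set")
    base = base_url.rstrip("/")
    if sdk == "anthropic":
        # The Anthropic SDK reads ANTHROPIC_BASE_URL / ANTHROPIC_API_KEY.
        return {"ANTHROPIC_BASE_URL": base, "ANTHROPIC_API_KEY": key}
    if sdk == "openai":
        # The OpenAI SDK reads OPENAI_BASE_URL / OPENAI_API_KEY.
        return {"OPENAI_BASE_URL": f"{base}/v1", "OPENAI_API_KEY": key}
    raise ValueError(f"unsupported sdk: {sdk!r}")
```

A runner could merge the returned mapping into `os.environ` before launching the harness as a subprocess, so the harness itself needs no proxy-specific changes.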
## Layout
```
agents/
├── cliproxyapi/   # generic Anthropic/OpenAI/Gemini → proxy shim (reusable)
├── common/        # workspace + task-instruction + submit helpers
├── ai_build_ai/   # AI-Build-AI integration (default: claude-sonnet-4-6)
└── mlevolve/      # MLEvolve integration (default: gpt-5.3-codex-spark)
```
`agents/<agent>/_vendor/` (gitignored) holds the upstream binary or git
clone for that agent.
## End-to-end (figraph example)
```bash
# 0. One-time setup of the proxy (see agents/cliproxyapi/README.md)
export CLIPROXYAPI_KEY="<from your config.yaml>"   # replace the placeholder with your key
# 1. Fetch the task data once
gtb fetch figraph
# 2. Install whichever agent you want
bash agents/ai_build_ai/install.sh # downloads upstream tarball
# or
bash agents/mlevolve/install.sh # git clone + pip install
# 3. Run the agent you installed; the runner prints the produced submission.csv path
python -m agents.ai_build_ai.runner --task figraph
# or
python -m agents.mlevolve.runner --task figraph
# 4. Submit when ready (the runner's default is print-and-stop)
gtb submit figraph --file "<printed-path>" --agent "<my-agent-id>"
# or pass --submit <name> to the runner to combine steps 3 and 4
```
## Adding another agent
1. Create `agents/<new_agent>/{__init__.py,runner.py,install.sh,README.md}`.
2. In `runner.py` import from `agents.cliproxyapi` (one of `anthropic_env`,
`openai_env`, or `openai_yaml_block` per the agent's SDK).
3. Use `agents.common.workspace.make_workspace()` for the run dir,
`agents.common.tasks.task_instruction()` for the task prompt, and
`agents.common.submit.finalize()` to validate the result and optionally submit it.
No changes to `agents/cliproxyapi/` or `agents/common/` are required for new
agents that fit one of the three supported SDK shapes.
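The steps above can be sketched as a minimal `runner.py` skeleton. The `--task` and `--submit` flags mirror the commands shown earlier; everything else is an assumption: the helper signatures, the `launch` callable, and the location of `submission.csv` inside the workspace are hypothetical. The helpers are passed in as arguments so the sketch runs on its own; a real runner would import them from `agents.common` and `agents.cliproxyapi` instead.

```python
import argparse
from pathlib import Path


def parse_args(argv=None) -> argparse.Namespace:
    """CLI with the two flags used in this README: --task and --submit."""
    p = argparse.ArgumentParser(prog="agents.new_agent.runner")
    p.add_argument("--task", required=True, help="task name, e.g. figraph")
    p.add_argument("--submit", default=None, metavar="NAME",
                   help="submit under this name; omit to print the path and stop")
    return p.parse_args(argv)


def run(args, *, env, make_workspace, task_instruction, launch, finalize) -> Path:
    """Wire the shared helpers together (all signatures here are assumed)."""
    workspace = Path(make_workspace(args.task))    # run directory
    prompt = task_instruction(args.task)           # task prompt text
    launch(workspace, prompt, env)                 # agent-specific harness call
    # Assumption: the harness writes its output into the workspace.
    submission = workspace / "submission.csv"
    finalize(args.task, submission, args.submit)   # validate, optionally submit
    print(submission)
    return submission
```

Only `launch` is specific to the new agent in this sketch, which is consistent with the note that the shared packages need no changes.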