Zhu Jiajun (jz28583)

agents/: third-party harness integrations

Wraps external agent harnesses so they can be pointed at a GraphTestbed task and produce a submission.csv the scoring API understands. LLM traffic is routed through one local CLIProxyAPI instance via the agents.cliproxyapi shim.

Layout

agents/
├── cliproxyapi/    # generic Anthropic/OpenAI/Gemini → proxy shim (reusable)
├── common/         # workspace + task-instruction + submit helpers
├── ai_build_ai/    # AI-Build-AI integration   (default: claude-sonnet-4-6)
└── mlevolve/       # MLEvolve integration      (default: gpt-5.3-codex-spark)

agents/<agent>/_vendor/ (gitignored) holds the upstream binary or git clone for that agent.
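
Since each agent keeps its upstream under _vendor/, the presence of that directory is a reasonable proxy for "install.sh has been run". A minimal sketch, assuming the layout above; installed_agents is a hypothetical helper for illustration, not something the repo provides:

```python
from pathlib import Path


def installed_agents(agents_root):
    """Return sorted names of agent directories that contain a _vendor/ directory.

    Hypothetical helper: it only assumes the agents/<agent>/_vendor/ layout
    described above and does not check that the vendored copy is complete.
    """
    root = Path(agents_root)
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and (child / "_vendor").is_dir()
    )
```

Calling installed_agents("agents") from the repo root would list the agents whose upstream has been fetched. An empty _vendor/ directory also counts here, so treat the result as a hint rather than a guarantee.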

End-to-end (figraph example)

# 0. One-time setup of the proxy (see agents/cliproxyapi/README.md)
export CLIPROXYAPI_KEY=<from your config.yaml>

# 1. Fetch the task data once
gtb fetch figraph

# 2. Install whichever agent you want
bash agents/ai_build_ai/install.sh        # downloads upstream tarball
# or
bash agents/mlevolve/install.sh           # git clone + pip install

# 3. Run; the runner prints the produced submission.csv path
python -m agents.ai_build_ai.runner --task figraph
python -m agents.mlevolve.runner    --task figraph

# 4. Submit when ready (default is print-and-stop)
gtb submit figraph --file <printed-path> --agent <my-agent-id>
# or pass --submit <name> to the runner to combine 3+4
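
When scripting steps 3 and 4 from Python rather than the shell, the runner invocation can be assembled as an argument list. The sketch below uses only the flags shown above (--task, --submit). The helper names runner_command and check_proxy_key are hypothetical, and the early check assumes the runners read CLIPROXYAPI_KEY from the environment:

```python
import os
import sys


def runner_command(agent, task, submit=None):
    """Build the argv for `python -m agents.<agent>.runner --task <task>`."""
    cmd = [sys.executable, "-m", f"agents.{agent}.runner", "--task", task]
    if submit is not None:
        # --submit <name> combines the run and submit steps.
        cmd += ["--submit", submit]
    return cmd


def check_proxy_key(env=None):
    """Raise early if CLIPROXYAPI_KEY is unset or empty (assumed requirement)."""
    env = os.environ if env is None else env
    if not env.get("CLIPROXYAPI_KEY"):
        raise RuntimeError(
            "CLIPROXYAPI_KEY is not set; see agents/cliproxyapi/README.md"
        )
```

A typical use would be check_proxy_key() followed by subprocess.run(runner_command("mlevolve", "figraph"), check=True), run from the repo root so that the agents package is importable.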

Adding another agent

  1. Create agents/<new_agent>/{__init__.py,runner.py,install.sh,README.md}.
  2. In runner.py, import one of anthropic_env, openai_env, or openai_yaml_block from agents.cliproxyapi, depending on the agent's SDK.
  3. Use agents.common.workspace.make_workspace() for the run directory, agents.common.tasks.task_instruction() for the task prompt, and agents.common.submit.finalize() to validate and optionally submit.

No changes to agents/cliproxyapi/ or agents/common/ are required for new agents that fit one of the three supported SDK shapes.
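
The order in which a new runner.py would call these helpers can be sketched as follows. This is an illustration only: the helpers are passed in as parameters because their real signatures are not documented here, so every argument list below is an assumption, and launch stands in for whatever agent-specific call starts the harness.

```python
def run_agent(task, *, proxy_env, make_workspace, task_instruction,
              launch, finalize, submit=None):
    """Wire the shared helpers together in the order the steps above describe.

    All call signatures here are assumed for illustration; check the real
    helpers in agents/cliproxyapi/ and agents/common/ before copying this.
    """
    env = proxy_env()                # e.g. agents.cliproxyapi.anthropic_env
    workdir = make_workspace(task)   # e.g. agents.common.workspace.make_workspace
    prompt = task_instruction(task)  # e.g. agents.common.tasks.task_instruction
    # Hypothetical agent-specific step: run the harness, get a submission path.
    submission_csv = launch(workdir, prompt, env)
    # e.g. agents.common.submit.finalize: validate, then optionally submit.
    return finalize(task, submission_csv, submit)
```

Injecting the helpers keeps the sketch independent of the actual package, and it also makes the call order easy to exercise with stubs before wiring in a real harness.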