agents/ - third-party harness integrations
This package wraps external agent harnesses so that they can be pointed at a
GraphTestbed task and produce a submission.csv the scoring API understands.
All LLM traffic is routed through a single local CLIProxyAPI instance via the
agents.cliproxyapi shim.
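As a rough illustration of what routing through a single local proxy can look like, the sketch below builds environment overrides that point an Anthropic-style or OpenAI-style SDK at one local endpoint. It is not the actual agents.cliproxyapi code: the function names, the proxy_url parameter, the "/v1" suffix, and the example address are assumptions made for illustration. The environment variable names are the ones the official Anthropic and OpenAI Python SDKs read by default.

```python
import os
from typing import Dict, Mapping, Optional


def anthropic_env_sketch(proxy_url: str, api_key: str) -> Dict[str, str]:
    """Hypothetical helper: env overrides for an Anthropic-style SDK."""
    return {
        "ANTHROPIC_BASE_URL": proxy_url.rstrip("/"),
        "ANTHROPIC_API_KEY": api_key,
    }


def openai_env_sketch(proxy_url: str, api_key: str) -> Dict[str, str]:
    """Hypothetical helper: env overrides for an OpenAI-style SDK.

    Assumes the proxy serves OpenAI-compatible routes under /v1.
    """
    return {
        "OPENAI_BASE_URL": proxy_url.rstrip("/") + "/v1",
        "OPENAI_API_KEY": api_key,
    }


def child_env(overrides: Mapping[str, str],
              base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy the base environment (default: os.environ) and apply overrides."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


if __name__ == "__main__":
    # Placeholder address; take the real host and port from your proxy config.
    key = os.environ.get("CLIPROXYAPI_KEY", "placeholder-key")
    env = child_env(anthropic_env_sketch("http://127.0.0.1:8317", key))
    print(env["ANTHROPIC_BASE_URL"])
```

The resulting dict can be passed as the env argument of subprocess.run when launching a harness, so the harness itself needs no code changes to talk to the proxy.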
Layout
agents/
├── cliproxyapi/ # generic Anthropic/OpenAI/Gemini → proxy shim (reusable)
├── common/ # workspace + task-instruction + submit helpers
├── ai_build_ai/ # AI-Build-AI integration (default: claude-sonnet-4-6)
└── mlevolve/ # MLEvolve integration (default: gpt-5.3-codex-spark)
agents/<agent>/_vendor/ (gitignored) holds the upstream binary or git
clone for that agent.
End-to-end (figraph example)
# 0. One-time setup of the proxy (see agents/cliproxyapi/README.md)
export CLIPROXYAPI_KEY=<from your config.yaml>
# 1. Fetch the task data once
gtb fetch figraph
# 2. Install whichever agent you want
bash agents/ai_build_ai/install.sh # downloads upstream tarball
# or
bash agents/mlevolve/install.sh # git clone + pip install
# 3. Run; the runner prints the produced submission.csv path
python -m agents.ai_build_ai.runner --task figraph
python -m agents.mlevolve.runner --task figraph
# 4. Submit when ready (default is print-and-stop)
gtb submit figraph --file <printed-path> --agent <my-agent-id>
# or pass --submit <name> to the runner to combine 3+4
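Steps 3 and 4 above can also be driven from Python. The following is a minimal sketch that only assembles the argument lists for the commands shown; the helper names runner_cmd and submit_cmd are hypothetical, and the flags are copied from the shell example rather than from the tools' own documentation.

```python
import sys
from typing import List, Optional


def runner_cmd(agent: str, task: str,
               submit_as: Optional[str] = None) -> List[str]:
    """Build `python -m agents.<agent>.runner --task <task> [--submit <name>]`."""
    cmd = [sys.executable, "-m", f"agents.{agent}.runner", "--task", task]
    if submit_as is not None:
        # Combines the run and submit steps, as described in the shell example.
        cmd += ["--submit", submit_as]
    return cmd


def submit_cmd(task: str, csv_path: str, agent_id: str) -> List[str]:
    """Build `gtb submit <task> --file <csv_path> --agent <agent_id>`."""
    return ["gtb", "submit", task, "--file", csv_path, "--agent", agent_id]


if __name__ == "__main__":
    print(" ".join(runner_cmd("mlevolve", "figraph")))
    print(" ".join(submit_cmd("figraph", "<printed-path>", "<my-agent-id>")))
```

Passing these lists to subprocess.run(..., check=True) avoids shell quoting issues when the submission path contains spaces.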
Adding another agent
- Create agents/<new_agent>/{__init__.py,runner.py,install.sh,README.md}.
- In runner.py, import from agents.cliproxyapi (one of anthropic_env, openai_env, or openai_yaml_block, per the agent's SDK).
- Use agents.common.workspace.make_workspace() for the run dir, agents.common.tasks.task_instruction() for the task prompt, agents.common.submit.finalize() for validate + optional submit.
No changes to agents/cliproxyapi/ or agents/common/ are required for new
agents that fit one of the three supported SDK shapes.
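The overall shape of a new runner.py might look like the sketch below. To stay self-contained it defines local stand-ins for make_workspace(), task_instruction(), and finalize(); the real helpers in agents.common may take different arguments and return different types, so every signature here is an assumption. run_harness() and the dummy CSV columns are likewise placeholders for whatever the external harness actually does.

```python
import csv
import tempfile
from pathlib import Path
from typing import Optional


# Stand-ins for the agents.common helpers; real signatures may differ.
def make_workspace(agent: str, task: str) -> Path:
    """Create a fresh run directory for one agent/task pair."""
    return Path(tempfile.mkdtemp(prefix=f"{agent}-{task}-"))


def task_instruction(task: str) -> str:
    """Return the prompt handed to the harness."""
    return f"Solve task '{task}' and write submission.csv in the working directory."


def finalize(csv_path: Path, task: str,
             submit_as: Optional[str] = None) -> Path:
    """Check that the CSV exists and is non-empty; optionally 'submit'."""
    if not csv_path.is_file():
        raise FileNotFoundError(csv_path)
    with csv_path.open(newline="") as fh:
        if next(csv.reader(fh), None) is None:
            raise ValueError("submission.csv is empty")
    if submit_as is not None:
        print(f"(sketch) would submit {csv_path} for {task} as {submit_as}")
    return csv_path


def run_harness(workspace: Path, prompt: str) -> Path:
    """Placeholder for launching the external harness in the workspace."""
    out = workspace / "submission.csv"
    out.write_text("id,prediction\n0,0\n")  # dummy columns, not the real schema
    return out


def main(task: str, submit_as: Optional[str] = None) -> Path:
    workspace = make_workspace("new_agent", task)
    prompt = task_instruction(task)
    produced = run_harness(workspace, prompt)
    final = finalize(produced, task, submit_as)
    print(final)  # print-and-stop unless submit_as is given
    return final


if __name__ == "__main__":
    main("figraph")
```

Keeping the harness launch in its own function means the workspace, prompt, and finalize steps stay identical across agents, which matches the intent that new agents need no changes to the shared packages.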