Commit 613e322 · committed by Arun-Sanjay · 1 Parent(s): 1cc5dd4

Fix subagent registration: add Task tool to agent frontmatter, document venv dependency install in CLAUDE.md

.claude/agents/environment-builder.md CHANGED
@@ -1,7 +1,7 @@
  ---
  name: environment-builder
  description: Builds the Red Button (Shutdown-Gym) sandbox, restricted Python executor, audit classifier, rubric stack, OpenEnv server, and client. Use for phases 1-10 implementation touching red_button/ or server/.
- tools: Read, Write, Edit, Bash, Glob, Grep
+ tools: Read, Write, Edit, Bash, Glob, Grep, Task
  ---
 
  You are the environment-builder subagent for Red Button (Shutdown-Gym). You implement the sandbox (red_button/sandbox.py), restricted Python executor (red_button/restricted_python.py), audit classifier (red_button/audit.py), rubric stack (red_button/rubrics.py), OpenEnv server (server/shutdown_environment.py, server/app.py), and client (red_button/client.py). Every implementation must match PROJECT.md sections 6, 7, 9, 11, and 14 exactly. You write tests alongside every module. You never modify training/ or evaluation/ — those belong to other subagents. When you finish a module, run its tests and report results.
.claude/agents/evaluator.md CHANGED
@@ -1,7 +1,7 @@
  ---
  name: evaluator
  description: Builds the Red Button (Shutdown-Gym) evaluation pipeline, baseline/trained rollout artifacts, concurrent load test, and demo rollouts. Use for phases 10-12 and 15-16 touching evaluation/ or results/.
- tools: Read, Write, Edit, Bash, Glob, Grep
+ tools: Read, Write, Edit, Bash, Glob, Grep, Task
  ---
 
  You are the evaluator subagent for Red Button (Shutdown-Gym). You implement evaluation/evaluate.py, evaluation/baseline_rollout.py, and evaluation/concurrent_load_test.py per PROJECT.md sections 17, 19, and 20. You produce the results/ directory artifacts: CSVs, training_curves.png, capability_preservation.png, regime_ablation.png. You generate the 10+ demo rollouts per PROJECT.md section 21.4. You never modify red_button/, server/, or training/ beyond reading them.
.claude/agents/training-builder.md CHANGED
@@ -1,7 +1,7 @@
  ---
  name: training-builder
  description: Builds the Red Button (Shutdown-Gym) GRPO and SFT training pipeline using TRL and Unsloth. Use for phases 13-14 implementation touching training/.
- tools: Read, Write, Edit, Bash, Glob, Grep
+ tools: Read, Write, Edit, Bash, Glob, Grep, Task
  ---
 
  You are the training-builder subagent for Red Button (Shutdown-Gym). You implement the GRPO training script (training/train_grpo.py), the custom rollout function (training/rollout_func.py), the SFT induction script (training/sft_induction.py), and the Colab notebook (training/train_colab.ipynb). Every implementation must match PROJECT.md sections 16, 17, and 18. You use TRL and Unsloth only. You never modify red_button/ or server/ — those belong to environment-builder. You write tests for rollout logic where possible.
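
The three agent files above receive the same change: `Task` is appended to the `tools:` line in the frontmatter. A minimal sketch of how that could be checked in one pass, assuming the frontmatter is the leading `---` block and `tools:` is a single comma-separated line as in these diffs. `frontmatter_tools` and `agents_missing_tool` are hypothetical helper names for illustration, not part of the repository or of any tool's API.

```python
from pathlib import Path


def frontmatter_tools(text: str) -> list[str]:
    """Return the entries of the `tools:` line inside the leading --- block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no frontmatter block at the top of the file
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of frontmatter; the agent prompt follows
        key, sep, value = stripped.partition(":")
        if sep and key.strip() == "tools":
            return [tool.strip() for tool in value.split(",") if tool.strip()]
    return []


def agents_missing_tool(agents_dir: Path, tool: str = "Task") -> list[str]:
    """List agent files (assumed layout: <agents_dir>/*.md) that lack `tool`."""
    return sorted(
        path.name
        for path in agents_dir.glob("*.md")
        if tool not in frontmatter_tools(path.read_text(encoding="utf-8"))
    )


# Inline sample mirroring the post-commit evaluator frontmatter.
sample = """---
name: evaluator
tools: Read, Write, Edit, Bash, Glob, Grep, Task
---
"""
print("Task" in frontmatter_tools(sample))  # prints True
```

Run from the repository root, `agents_missing_tool(Path(".claude/agents"))` would return the names of any agent files that still lack `Task`.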
CLAUDE.md CHANGED
@@ -29,5 +29,14 @@ openenv, trl, unsloth, pydantic>=2, fastapi, uvicorn, pytest, pytest-asyncio, wa
  ## Development setup
  Python 3.14 on Homebrew enforces PEP 668. A venv exists at .venv/ with pytest, pytest-asyncio, and ruff installed. Before running any tests or committing: source .venv/bin/activate. The pre-commit hook depends on this activation.
 
+ After venv activation, ensure pydantic is installed and the package is editable-installed so tests can import from red_button:
+
+ ```
+ pip install "pydantic>=2"
+ pip install -e . --no-deps
+ ```
+
+ As later phases land (fastapi, uvicorn, openenv, trl, unsloth), install those too.
+
  ## Style
  Black for formatting, Ruff for linting, isort for imports. Configure in pyproject.toml.
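
The setup steps added to CLAUDE.md can be verified before running tests. The following is a rough sketch, not part of the commit: it assumes a standard venv (detected by comparing `sys.prefix` with `sys.base_prefix`) and that the editable install exposes a top-level `red_button` package. The function names are hypothetical.

```python
import importlib.util
import sys
from importlib import metadata


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


def major_version(version: str) -> int:
    """Extract the leading major number from a version string such as '2.7.1'."""
    return int(version.split(".")[0])


def check_environment() -> list[str]:
    """Return human-readable problems; an empty list means the setup looks fine."""
    problems = []
    if not in_virtualenv():
        problems.append("no venv active (run: source .venv/bin/activate)")
    try:
        if major_version(metadata.version("pydantic")) < 2:
            problems.append('pydantic is older than 2 (run: pip install "pydantic>=2")')
    except metadata.PackageNotFoundError:
        problems.append('pydantic is not installed (run: pip install "pydantic>=2")')
    # Assumption: the editable install makes `red_button` importable by this name.
    if importlib.util.find_spec("red_button") is None:
        problems.append("red_button is not importable (run: pip install -e . --no-deps)")
    return problems


for problem in check_environment():
    print("FAIL:", problem)
```

An empty result from `check_environment()` suggests the venv is active, pydantic 2 or newer is present, and `red_button` resolves. It does not check the later-phase dependencies (fastapi, uvicorn, openenv, trl, unsloth).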