# Docker Space: Claude Code Evolution Containers

## Overview

Each subdirectory here is a workspace mounted into a Docker container where Claude Code autonomously evolves C++ solutions for Frontier-CS problems. The judge service runs in a separate container (`Competitive-Programming`) and is shared across all evolution containers.

## Prerequisites

- `claude-docker` image built locally
- `algorithmic-lightcpverifier` judge container running (`Competitive-Programming` on port 8081)
- Docker network `algorithmic_default` (created by the judge's docker-compose)

## How to Create a New Evolution Container

### Step 1: Create the workspace directory

```bash
PROBLEM_ID=0
NAME="frontier_cs_${PROBLEM_ID}_myname"
mkdir -p "docker_space/${NAME}"
```

### Step 2: Copy problem files (WITHOUT testdata)

From the Frontier-CS problem directory, copy only these files into the workspace root:

```bash
SRC="tasks/Frontier-CS/algorithmic/problems/${PROBLEM_ID}"
cp "${SRC}/statement.txt" "docker_space/${NAME}/"
cp "${SRC}/chk.cc" "docker_space/${NAME}/"
cp "${SRC}/config.yaml" "docker_space/${NAME}/"
cp -r "${SRC}/examples" "docker_space/${NAME}/"   # reference solutions
```

For most problems, `testdata/` contains only 3 sample test cases and can be included as examples for the agent. For problem 0 (polyomino), however, `testdata/` contains all 70 test cases. Do NOT copy it; leaving it out avoids reward hacking.
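Steps 1 and 2 can also be wrapped in a small helper. The following is a minimal sketch, not part of the repository: the function name `make_workspace` is hypothetical, and it assumes the four entries listed above are the only ones wanted. Because it copies from an explicit allowlist, `testdata/` is never copied, even if it exists in the source directory.

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper): create a workspace and copy only the
# allowlisted problem files. testdata/ is not on the list, so it stays behind.
# Usage: make_workspace <problem_src_dir> <workspace_dir>

make_workspace() {
  local src="$1" dest="$2"
  local item

  mkdir -p "${dest}" || return 1

  for item in statement.txt chk.cc config.yaml examples; do
    # Fail early instead of producing a half-populated workspace.
    if [ ! -e "${src}/${item}" ]; then
      echo "missing: ${src}/${item}" >&2
      return 1
    fi
    cp -r "${src}/${item}" "${dest}/" || return 1
  done
}
```

Example call, using the variables from Steps 1 and 2: `make_workspace "${SRC}" "docker_space/${NAME}"`.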
### Step 3: Add instruction files

Copy from an existing workspace or create new ones:

```bash
cp docker_space/frontier_cs_0_polyomino_ev2/INSTRUCTION.md "docker_space/${NAME}/"
cp docker_space/frontier_cs_0_polyomino_ev2/evaluate.md "docker_space/${NAME}/"
```

The workspace should now contain:

```
docker_space/${NAME}/
├── INSTRUCTION.md          # Agent instructions (evolutionary loop, logging, rules)
├── evaluate.md             # How to call the judge API
├── statement.txt           # Problem description
├── chk.cc                  # Checker source (for reference only, judge uses its own copy)
├── config.yaml             # Problem config (time/memory limits)
└── examples/               # Reference solutions to bootstrap from
    ├── reference.cpp
    └── gpt5.cpp
```

### Step 4: Launch the container

The entire workspace directory is mounted to `/workspace` inside the container, so all agent output (solutions, logs) is visible on the host.

```bash
docker run -d \
  --name "${NAME}" \
  --privileged \
  --shm-size=4g \
  -v "$(pwd)/docker_space/${NAME}:/workspace" \
  claude-docker \
  sleep infinity
```

### Step 5: Connect to the judge network

```bash
docker network connect algorithmic_default "${NAME}"
```

### Step 6: Verify judge connectivity

```bash
docker exec "${NAME}" curl -s http://Competitive-Programming:8081/problems | head -c 100
```

### Step 7: Enter the container and start Claude Code

```bash
docker exec -it "${NAME}" bash
```

Inside the container, Claude Code reads `/workspace/INSTRUCTION.md` (the workspace root is mounted at `/workspace`) and begins evolving.
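Steps 4 to 6 can be chained into one helper so that a failed launch or network connection stops the sequence. This is a sketch under assumptions: the names `run` and `launch_evolution_container` and the `DRY_RUN` variable are hypothetical, and the connectivity check uses `curl -sf` (the added `-f` makes curl return a non-zero status on HTTP errors) rather than printing the response as in Step 6. With `DRY_RUN=1` the commands are only printed, which is useful for checking them before touching Docker.

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper): run Steps 4-6 for one workspace.
# Set DRY_RUN=1 to print the docker commands instead of executing them.

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

launch_evolution_container() {
  local name="$1"
  local workspace
  workspace="$(pwd)/docker_space/${name}"

  # Step 4: start the container with the workspace mounted at /workspace.
  run docker run -d --name "${name}" --privileged --shm-size=4g \
    -v "${workspace}:/workspace" claude-docker sleep infinity || return 1

  # Step 5: attach it to the judge's network.
  run docker network connect algorithmic_default "${name}" || return 1

  # Step 6: fail if the judge does not answer from inside the container.
  run docker exec "${name}" curl -sf -o /dev/null \
    http://Competitive-Programming:8081/problems
}
```

Example: `DRY_RUN=1 launch_evolution_container "${NAME}"` prints the three commands; dropping `DRY_RUN=1` runs them.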
## File Descriptions

| File | Purpose |
|------|---------|
| `INSTRUCTION.md` | Main agent directive: defines the evolutionary loop, mutation strategies, logging format, and rules |
| `evaluate.md` | API reference for submitting solutions to the judge and polling results |
| `statement.txt` | Full problem statement (input/output format, scoring, constraints) |
| `chk.cc` | Checker source code, for the agent's reference only; not executed locally |
| `config.yaml` | Problem config (time limit, memory limit, number of test cases) |
| `examples/` | Baseline solutions to initialize evolution from |

## Path Mapping

The host directory is mounted directly to `/workspace` in the container:

```
Host:      docker_space/${NAME}/
Container: /workspace/
```

Everything the agent writes under `/workspace/` is visible on the host in `docker_space/${NAME}/`.

## Directory Layout (after agent runs)

```
docker_space/${NAME}/          ← host path (= /workspace in container)
├── INSTRUCTION.md             # Agent instructions
├── evaluate.md                # Judge API reference
├── statement.txt              # Problem description
├── chk.cc                     # Checker source (reference only)
├── config.yaml                # Problem config
├── examples/                  # Baseline solutions
│   ├── reference.cpp
│   └── gpt5.cpp
├── solution.cpp               # (created by agent) Current working solution
├── best.cpp                   # (created by agent) Best solution so far
└── logs/                      # (created by agent) Evolution history
    ├── evolution.log
    ├── gen_0.cpp
    ├── gen_1.cpp
    └── ...
```

## Quick Reference

```bash
# List all evolution containers
docker ps --filter "ancestor=claude-docker" --format "table {{.Names}}\t{{.Status}}"

# Check judge is running
docker ps --filter "name=Competitive-Programming" --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"

# Stop and remove a container
docker stop "${NAME}" && docker rm "${NAME}"

# Clean up workspace (":?" aborts if NAME is unset or empty, so docker_space/ itself is never removed)
rm -rf "docker_space/${NAME:?}"
```

## Prompt to start Claude Code

Follow INSTRUCTION.md, please use iterative refinement to improve the scores you achieve, the higher the better.
You can log different generations under logs/. Keep your best solution and scores under best/. I believe you can do it. IMPORTANT: you can evolve your own evaluation process as well to find some insightful perspectives on how to escape local optima and to create better solutions.