Docker Space: Claude Code Evolution Containers
Overview
Each subdirectory here is a workspace mounted into a Docker container where Claude Code autonomously evolves C++ solutions for Frontier-CS problems. The judge service runs in a separate container (Competitive-Programming) and is shared across all evolution containers.
Prerequisites
- claude-docker image built locally
- algorithmic-lightcpverifier judge container running (Competitive-Programming on port 8081)
- Docker network algorithmic_default (created by the judge's docker-compose)
How to Create a New Evolution Container
Step 1: Create the workspace directory
PROBLEM_ID=0
NAME="frontier_cs_${PROBLEM_ID}_myname"
mkdir -p docker_space/${NAME}
Step 2: Copy problem files (WITHOUT testdata)
From the Frontier-CS problem directory, copy only these files into the workspace root:
SRC="tasks/Frontier-CS/algorithmic/problems/${PROBLEM_ID}"
cp ${SRC}/statement.txt docker_space/${NAME}/
cp ${SRC}/chk.cc docker_space/${NAME}/
cp ${SRC}/config.yaml docker_space/${NAME}/
cp -r ${SRC}/examples docker_space/${NAME}/ # reference solutions
For most problems, testdata/ contains only 3 sample test cases and can be included as examples for the agent. However, for problem 0 (polyomino), testdata/ contains all 70 test cases. Do NOT copy it; giving the agent the full test set invites reward hacking.
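The copy step above can be wrapped in a small helper so that testdata/ is never picked up by accident. This is a minimal sketch, not part of the repository: the function name copy_problem_files is hypothetical, and it takes the conservative route of skipping testdata/ for every problem, not only problem 0.

```bash
# Hypothetical helper: copy only the agent-visible files for one problem.
# The file list mirrors the cp commands in Step 2; testdata/ is never copied.
copy_problem_files() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in statement.txt chk.cc config.yaml; do
        cp "$src/$f" "$dest/" || return 1
    done
    # Reference solutions the agent bootstraps from.
    cp -r "$src/examples" "$dest/" || return 1
}
```

Usage, with the variables defined in Steps 1 and 2: `copy_problem_files "${SRC}" "docker_space/${NAME}"`.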
Step 3: Add instruction files
Copy from an existing workspace or create new ones:
cp docker_space/frontier_cs_0_polyomino_ev2/INSTRUCTION.md docker_space/${NAME}/
cp docker_space/frontier_cs_0_polyomino_ev2/evaluate.md docker_space/${NAME}/
The workspace should now contain:
docker_space/${NAME}/
├── INSTRUCTION.md    # Agent instructions (evolutionary loop, logging, rules)
├── evaluate.md       # How to call the judge API
├── statement.txt     # Problem description
├── chk.cc            # Checker source (for reference only, judge uses its own copy)
├── config.yaml       # Problem config (time/memory limits)
└── examples/         # Reference solutions to bootstrap from
    ├── reference.cpp
    └── gpt5.cpp
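Before launching, it can help to confirm that the workspace matches the listing above. The following is a sketch only; check_workspace is a hypothetical name, and the list of entries is taken directly from the tree above.

```bash
# Hypothetical helper: report every expected workspace entry that is missing.
# Returns 0 when the workspace is complete, 1 otherwise.
check_workspace() {
    ws=$1
    missing=0
    for entry in INSTRUCTION.md evaluate.md statement.txt chk.cc config.yaml examples; do
        if [ ! -e "$ws/$entry" ]; then
            echo "missing: $entry"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: `check_workspace "docker_space/${NAME}" && echo "workspace ready"`.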
Step 4: Launch the container
The entire workspace directory is mounted to /workspace inside the container, so all agent output (solutions, logs) is visible on the host.
docker run -d \
  --name "${NAME}" \
  --privileged \
  --shm-size=4g \
  -v "$(pwd)/docker_space/${NAME}":/workspace \
  claude-docker \
  sleep infinity
Step 5: Connect to the judge network
docker network connect algorithmic_default ${NAME}
Step 6: Verify judge connectivity
docker exec ${NAME} curl -s http://Competitive-Programming:8081/problems | head -c 100
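If the judge has only just started, the check above may fail on the first attempt. One option is to retry it a few times. This is a sketch under assumptions: retry and judge_reachable are hypothetical helper names, and the -f flag (make curl return a non-zero status on HTTP errors) is an addition to the original command.

```bash
# Hypothetical helper: run a command until it succeeds or attempts run out.
# usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt is still to come.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Hypothetical probe: same URL as Step 6, queried from inside the container.
judge_reachable() {
    docker exec "$1" curl -sf -o /dev/null http://Competitive-Programming:8081/problems
}
```

Usage: `retry 10 3 judge_reachable "${NAME}" && echo "judge reachable"`.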
Step 7: Enter the container and start Claude Code
docker exec -it ${NAME} bash
Inside the container, Claude Code reads /workspace/INSTRUCTION.md and begins evolving.
File Descriptions
| File | Purpose |
|---|---|
| INSTRUCTION.md | Main agent directive: defines the evolutionary loop, mutation strategies, logging format, and rules |
| evaluate.md | API reference for submitting solutions to the judge and polling results |
| statement.txt | Full problem statement (input/output format, scoring, constraints) |
| chk.cc | Checker source code, for the agent's reference only; not executed locally |
| config.yaml | Problem config (time limit, memory limit, number of test cases) |
| examples/ | Baseline solutions to initialize evolution from |
Path Mapping
The host directory is mounted directly to /workspace in the container:
Host: docker_space/${NAME}/
Container: /workspace/
Everything the agent writes under /workspace/ is visible on the host in docker_space/${NAME}/.
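When a script needs to refer to the same file from both sides of the mount, the mapping can be written down explicitly. A minimal sketch, assuming paths are given relative to the repository root; to_container_path is a hypothetical name.

```bash
# Hypothetical helper: translate a host path under docker_space/<name>/
# into the matching path under /workspace inside the container.
to_container_path() {
    name=$1
    host_path=$2
    prefix="docker_space/${name}/"
    case "$host_path" in
        "$prefix"*)
            printf '/workspace/%s\n' "${host_path#"$prefix"}"
            ;;
        *)
            echo "error: $host_path is not under $prefix" >&2
            return 1
            ;;
    esac
}
```

For example, `to_container_path "${NAME}" "docker_space/${NAME}/logs/gen_0.cpp"` prints `/workspace/logs/gen_0.cpp`.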
Directory Layout (after agent runs)
docker_space/${NAME}/     ← host path (= /workspace in container)
├── INSTRUCTION.md        # Agent instructions
├── evaluate.md           # Judge API reference
├── statement.txt         # Problem description
├── chk.cc                # Checker source (reference only)
├── config.yaml           # Problem config
├── examples/             # Baseline solutions
│   ├── reference.cpp
│   └── gpt5.cpp
├── solution.cpp          # (created by agent) Current working solution
├── best.cpp              # (created by agent) Best solution so far
└── logs/                 # (created by agent) Evolution history
    ├── evolution.log
    ├── gen_0.cpp
    ├── gen_1.cpp
    └── ...
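To see from the host how far a run has progressed, the gen_N.cpp files under logs/ can be inspected. The sketch below assumes the agent follows the naming shown in the layout above; latest_generation is a hypothetical name. It compares the numbers numerically, since a plain alphabetical sort would place gen_10.cpp before gen_2.cpp.

```bash
# Hypothetical helper: print the highest N for which logs/gen_N.cpp exists.
# Returns 1 (and prints nothing) if no generation file is found.
latest_generation() {
    logs_dir=$1/logs
    latest=-1
    for f in "$logs_dir"/gen_*.cpp; do
        [ -e "$f" ] || continue        # glob matched nothing
        n=${f##*/gen_}                 # strip directory and "gen_" prefix
        n=${n%.cpp}                    # strip extension
        case "$n" in
            ''|*[!0-9]*) continue ;;   # skip names that are not gen_<digits>.cpp
        esac
        if [ "$n" -gt "$latest" ]; then
            latest=$n
        fi
    done
    if [ "$latest" -lt 0 ]; then
        return 1
    fi
    echo "$latest"
}
```

Usage: `latest_generation "docker_space/${NAME}"`.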
Quick Reference
# List all evolution containers
docker ps --filter "ancestor=claude-docker" --format "table {{.Names}}\t{{.Status}}"
# Check judge is running
docker ps --filter "name=Competitive-Programming" --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
# Stop and remove a container
docker stop ${NAME} && docker rm ${NAME}
# Clean up workspace
rm -rf docker_space/${NAME}
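Note that the cleanup command above removes the whole docker_space/ directory if NAME happens to be empty or unset. A guarded variant might look like the following sketch; remove_workspace is a hypothetical name, and it is meant to be run from the directory that contains docker_space/.

```bash
# Hypothetical helper: remove one workspace, refusing names that would
# widen the rm -rf target (empty, ".", "..", or anything containing "/").
remove_workspace() {
    name=$1
    case "$name" in
        ''|.|..|*/*)
            echo "error: refusing to remove workspace '$name'" >&2
            return 1
            ;;
    esac
    rm -rf "docker_space/${name}"
}
```

Usage: `docker stop "${NAME}" && docker rm "${NAME}" && remove_workspace "${NAME}"`.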
Prompt to start Claude Code
Follow INSTRUCTION.md, please use iterative refinement to improve the scores you achieve, the higher the better. You can log different generations under logs/. Keep your best solution and scores under best/. I believe you can do it. IMPORTANT: you can evolve your own evaluation process as well to find some insightful perspectives on how to escape local optima and to create better solutions.