# Docker Space: Claude Code Evolution Containers
## Overview
Each subdirectory here is a workspace mounted into a Docker container where Claude Code autonomously evolves C++ solutions for Frontier-CS problems. The judge service runs in a separate container (`Competitive-Programming`) and is shared across all evolution containers.
## Prerequisites
- `claude-docker` image built locally
- `algorithmic-lightcpverifier` judge container running (`Competitive-Programming` on port 8081)
- Docker network `algorithmic_default` (created by the judge's docker-compose)
## How to Create a New Evolution Container
### Step 1: Create the workspace directory
```bash
PROBLEM_ID=0
NAME="frontier_cs_${PROBLEM_ID}_myname"
mkdir -p docker_space/${NAME}
```
### Step 2: Copy problem files (WITHOUT testdata)
From the Frontier-CS problem directory, copy only these files into the workspace root:
```bash
SRC="tasks/Frontier-CS/algorithmic/problems/${PROBLEM_ID}"
cp ${SRC}/statement.txt docker_space/${NAME}/
cp ${SRC}/chk.cc docker_space/${NAME}/
cp ${SRC}/config.yaml docker_space/${NAME}/
cp -r ${SRC}/examples docker_space/${NAME}/ # reference solutions
```
For most problems, `testdata/` contains only 3 sample test cases and can be included as examples for the agent. For problem 0 (polyomino), however, `testdata/` contains all 70 test cases: do NOT copy it, because giving the agent the full test set would enable reward hacking.
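The copy step can be wrapped in a small helper so that `testdata/` is never copied by accident. This is a minimal sketch, not part of the repository: the function name `copy_problem_files` and its two-argument interface are hypothetical. It copies only the four items listed in Step 2 and fails if any of them is missing.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copy only the files listed in Step 2 into a workspace.
# Usage: copy_problem_files <problem_src_dir> <workspace_dir>
copy_problem_files() {
  local src="$1" dst="$2" item
  mkdir -p "$dst" || return 1
  for item in statement.txt chk.cc config.yaml examples; do
    if [ ! -e "$src/$item" ]; then
      echo "missing: $src/$item" >&2
      return 1
    fi
    # -r is needed for examples/ and is harmless for regular files.
    cp -r "$src/$item" "$dst/" || return 1
  done
  # testdata/ is deliberately never copied by this helper.
}
```

With the variables from Steps 1 and 2 set, it could be called as `copy_problem_files "${SRC}" "docker_space/${NAME}"`.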
### Step 3: Add instruction files
Copy from an existing workspace or create new ones:
```bash
cp docker_space/frontier_cs_0_polyomino_ev2/INSTRUCTION.md docker_space/${NAME}/
cp docker_space/frontier_cs_0_polyomino_ev2/evaluate.md docker_space/${NAME}/
```
The workspace should now contain:
```
docker_space/${NAME}/
├── INSTRUCTION.md    # Agent instructions (evolutionary loop, logging, rules)
├── evaluate.md       # How to call the judge API
├── statement.txt     # Problem description
├── chk.cc            # Checker source (for reference only, judge uses its own copy)
├── config.yaml       # Problem config (time/memory limits)
└── examples/         # Reference solutions to bootstrap from
    ├── reference.cpp
    └── gpt5.cpp
```
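Before launching the container, the layout above can be checked mechanically. The following is a sketch under stated assumptions: `check_workspace` is a hypothetical name, and the list of required entries is taken directly from the tree shown above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report any expected workspace entry that is missing.
# Returns 0 when the workspace is complete, 1 otherwise.
# Usage: check_workspace <workspace_dir>
check_workspace() {
  local ws="$1" missing=0 f
  for f in INSTRUCTION.md evaluate.md statement.txt chk.cc config.yaml; do
    [ -f "$ws/$f" ] || { echo "missing file: $f" >&2; missing=1; }
  done
  [ -d "$ws/examples" ] || { echo "missing directory: examples/" >&2; missing=1; }
  return "$missing"
}
```

For example, `check_workspace "docker_space/${NAME}" && echo "workspace ready"` prints the confirmation only when every entry is present.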
### Step 4: Launch the container
The entire workspace directory is mounted to `/workspace` inside the container, so all agent output (solutions, logs) is visible on the host.
```bash
docker run -d \
  --name "${NAME}" \
  --privileged \
  --shm-size=4g \
  -v "$(pwd)/docker_space/${NAME}:/workspace" \
  claude-docker \
  sleep infinity
```
### Step 5: Connect to the judge network
```bash
docker network connect algorithmic_default ${NAME}
```
### Step 6: Verify judge connectivity
```bash
docker exec ${NAME} curl -s http://Competitive-Programming:8081/problems | head -c 100
```
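If the judge has only just started, the check above may fail on the first try. A generic retry wrapper is one way to handle that; this is a sketch, and the `retry` function, the attempt count, and the delay are all hypothetical choices rather than anything the repository provides.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    # Sleep between attempts, but not after the final failure.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Example (not executed here): poll the judge up to 10 times, 3 seconds apart.
# retry 10 3 docker exec "${NAME}" curl -sf -o /dev/null http://Competitive-Programming:8081/problems
```

The `-f` flag makes `curl` exit non-zero on HTTP error responses, so the loop keeps retrying until the endpoint actually answers successfully.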
### Step 7: Enter the container and start Claude Code
```bash
docker exec -it ${NAME} bash
```
Inside the container, Claude Code reads `/workspace/INSTRUCTION.md` (the workspace root is mounted at `/workspace`) and begins evolving.
## File Descriptions
| File | Purpose |
|------|---------|
| `INSTRUCTION.md` | Main agent directive: defines the evolutionary loop, mutation strategies, logging format, and rules |
| `evaluate.md` | API reference for submitting solutions to the judge and polling results |
| `statement.txt` | Full problem statement (input/output format, scoring, constraints) |
| `chk.cc` | Checker source code, for the agent's reference only; not executed locally |
| `config.yaml` | Problem config (time limit, memory limit, number of test cases) |
| `examples/` | Baseline solutions to initialize evolution from |
## Path Mapping
The host directory is mounted directly to `/workspace` in the container:
```
Host: docker_space/${NAME}/
Container: /workspace/
```
Everything the agent writes under `/workspace/` is visible on the host in `docker_space/${NAME}/`.
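When reading agent logs that mention container paths, it can help to translate them back to host paths. The function below is only an illustration of the mapping described above; `to_host_path` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: translate a container path under /workspace
# into the matching host path under docker_space/<name>/.
# Usage: to_host_path <name> <container_path>
to_host_path() {
  local name="$1" path="$2"
  case "$path" in
    /workspace)   printf '%s\n' "docker_space/${name}" ;;
    /workspace/*) printf '%s\n' "docker_space/${name}/${path#/workspace/}" ;;
    *) echo "not under /workspace: $path" >&2; return 1 ;;
  esac
}
```

For instance, `to_host_path "${NAME}" /workspace/logs/evolution.log` prints the corresponding path under `docker_space/${NAME}/logs/`.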
## Directory Layout (after agent runs)
```
docker_space/${NAME}/          ← host path (= /workspace in container)
├── INSTRUCTION.md             # Agent instructions
├── evaluate.md                # Judge API reference
├── statement.txt              # Problem description
├── chk.cc                     # Checker source (reference only)
├── config.yaml                # Problem config
├── examples/                  # Baseline solutions
│   ├── reference.cpp
│   └── gpt5.cpp
├── solution.cpp               # (created by agent) Current working solution
├── best.cpp                   # (created by agent) Best solution so far
└── logs/                      # (created by agent) Evolution history
    ├── evolution.log
    ├── gen_0.cpp
    ├── gen_1.cpp
    └── ...
```
## Quick Reference
```bash
# List all evolution containers
docker ps --filter "ancestor=claude-docker" --format "table {{.Names}}\t{{.Status}}"
# Check judge is running
docker ps --filter "name=Competitive-Programming" --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
# Stop and remove a container
docker stop ${NAME} && docker rm ${NAME}
# Clean up workspace
rm -rf docker_space/${NAME}
```
## Prompt to Start Claude Code
Follow INSTRUCTION.md, please use iterative refinement to improve the scores you achieve, the higher the better. You can log different generations under logs/. Keep your best solution and scores under best/. I believe you can do it.
IMPORTANT: you can evolve your own evaluation process as well to find some insightful perspectives on how to escape local optima and to create better solutions.