Docker All Workspace

Each subdirectory is an experiment workspace mounted into a Docker container where Claude Code autonomously evolves C++ solutions for Frontier-CS problems.

Prerequisites

  • claude-docker image built locally
  • Competitive-Programming judge container running on network algorithmic_default
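Both prerequisites can be checked from the host before launching anything. The following is a minimal sketch, not part of the original tooling: `check_prereqs` is a hypothetical helper name, and it only confirms that the image and network exist and that a container matching the judge name is running. It does not verify that the judge is actually attached to that network.

```shell
#!/usr/bin/env bash
# Sketch of a pre-flight check for the prerequisites listed above.
# check_prereqs is a hypothetical helper name (an assumption, not from the repo).

check_prereqs() {
    local failed=0

    # The claude-docker image must be present in the local image store.
    if ! docker image inspect claude-docker >/dev/null 2>&1; then
        echo "missing image: claude-docker" >&2
        failed=1
    fi

    # The judge network must exist.
    if ! docker network inspect algorithmic_default >/dev/null 2>&1; then
        echo "missing network: algorithmic_default" >&2
        failed=1
    fi

    # A container whose name matches Competitive-Programming must be running.
    if [ -z "$(docker ps --quiet --filter "name=Competitive-Programming")" ]; then
        echo "judge container not running: Competitive-Programming" >&2
        failed=1
    fi

    return "$failed"
}
```

Calling `check_prereqs && echo "ready"` prints `ready` only when all three checks pass; otherwise each failed check is reported on stderr.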

Workspace Structure

docker_all_workspace/
├── ev2_skill_0409/                    # Experiment: ev2 skill evaluation
│   ├── frontier_cs_1/                 # Problem 1 workspace
│   │   ├── INSTRUCTION.md             # Agent instructions (includes ev2 skill reference)
│   │   ├── ev2_skill.md               # Evolve-evaluation skill document
│   │   ├── statement.txt              # Problem description
│   │   ├── chk.cc                     # Checker (reference only)
│   │   ├── config.yaml                # Time/memory limits
│   │   ├── examples/                  # Baseline solutions
│   │   │   ├── gpt5.cpp
│   │   │   └── gemini3pro.cpp
│   │   ├── logs/                      # Evolution history (created by agent)
│   │   └── best/                      # Best solution (created by agent)
│   ├── frontier_cs_2/
│   ├── ...
│   └── frontier_cs_10/

How to Launch

1. Create and start the container

EXPERIMENT="ev2_skill_0409"

docker run -d \
  --name ${EXPERIMENT} \
  --privileged \
  --shm-size=4g \
  -v "$(pwd)/docker_all_workspace/${EXPERIMENT}:/workspace" \
  claude-docker \
  sleep infinity

2. Connect to the judge network

docker network connect algorithmic_default ${EXPERIMENT}

3. Verify judge connectivity

docker exec ${EXPERIMENT} curl -s http://Competitive-Programming:8081/problems | head -c 100
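If the judge has only just started, a single probe may fail before the service is ready. The sketch below wraps the same request in a small retry loop. `retry` and `probe_judge` are hypothetical helper names introduced for illustration, and the attempt count and delay are arbitrary choices.

```shell
#!/usr/bin/env bash
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts. Hypothetical helper.
retry() {
    local max="$1" delay="$2" attempt=1
    shift 2
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Probe the judge from inside the experiment container.
# --fail makes curl exit non-zero on HTTP error responses.
probe_judge() {
    docker exec "${EXPERIMENT}" curl --silent --fail --output /dev/null \
        http://Competitive-Programming:8081/problems
}
```

For example, `retry 10 3 probe_judge` tries up to ten times with three seconds between attempts and returns non-zero if the judge never answers.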

4. Enter the container

docker exec -it ${EXPERIMENT} bash

Inside the container, /workspace/ contains all problem workspaces. Navigate to a problem workspace:

cd /workspace/frontier_cs_1

5. Start Claude Code

Start Claude Code from the problem directory and give it the following prompt:

Follow INSTRUCTION.md, please use iterative refinement to improve the scores you achieve, the higher the better. You can log different generations under logs/. Keep your best solution and scores under best/. I believe you can do it. 
IMPORTANT: you can evolve your own evaluation process as well to find some insightful perspectives on how to escape local optima and to create better solutions.

How to Create a New Experiment

EXPERIMENT="my_experiment_YYYYMMDD"
mkdir -p docker_all_workspace/${EXPERIMENT}

# For each problem:
for PID in $(seq 1 10); do
    DIR="docker_all_workspace/${EXPERIMENT}/frontier_cs_${PID}"
    mkdir -p ${DIR}/examples ${DIR}/logs ${DIR}/best

    SRC="tasks/Frontier-CS/algorithmic/problems/${PID}"
    SOL="tasks/Frontier-CS/algorithmic/solutions/${PID}"

    cp ${SRC}/statement.txt ${DIR}/
    cp ${SRC}/chk.cc ${DIR}/ 2>/dev/null
    cp ${SRC}/config.yaml ${DIR}/
    cp ${SOL}/gpt5.cpp ${DIR}/examples/ 2>/dev/null
    cp ${SOL}/gemini3pro.cpp ${DIR}/examples/ 2>/dev/null

    # Copy skill files and create INSTRUCTION.md as needed
done

Then launch the container following steps 1-5 above.
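The creation loop silences `cp` errors for chk.cc and the example solutions, and INSTRUCTION.md is added by hand, so a workspace can end up incomplete without any visible error. A rough sketch of a check to run before launching follows. `check_workspace` is a hypothetical helper, and treating statement.txt, config.yaml, and INSTRUCTION.md as the required set is an assumption based on the layout shown earlier.

```shell
#!/usr/bin/env bash
# check_workspace DIR: report files missing from one problem workspace.
# Hypothetical helper; the required/optional split below is an assumption.
check_workspace() {
    local dir="$1" missing=0 f

    # Files assumed to be required for the agent to start.
    for f in statement.txt config.yaml INSTRUCTION.md; do
        if [ ! -f "${dir}/${f}" ]; then
            echo "${dir}: missing ${f}" >&2
            missing=1
        fi
    done

    # Files the creation loop copies with errors suppressed: warn only.
    for f in chk.cc examples/gpt5.cpp examples/gemini3pro.cpp; do
        [ -f "${dir}/${f}" ] || echo "${dir}: optional ${f} not present" >&2
    done

    return "$missing"
}
```

Running `check_workspace "docker_all_workspace/${EXPERIMENT}/frontier_cs_${PID}"` inside the same `for PID in $(seq 1 10)` loop flags every incomplete workspace on stderr.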

Quick Reference

# List all experiment containers
docker ps --filter "ancestor=claude-docker" --format "table {{.Names}}\t{{.Status}}"

# Stop and remove
docker stop ${EXPERIMENT} && docker rm ${EXPERIMENT}

# Check judge
docker ps --filter "name=Competitive-Programming"