KernelBench Integration with SkyDiscover

GPU kernel optimization tasks using the KernelBench dataset and evaluation protocol.

Overview

The KernelBench integration allows you to run SkyDiscover on any problem from the KernelBench dataset. The framework automatically:

Fetches the reference implementation of the target kernel from KernelBench
Creates an initial_program.py with EVOLVE-BLOCK markers
Configures the evaluator with problem-specific parameters
Runs the optimization using either a containerized or native Python evaluator
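The second step can be pictured with a small sketch. The README only says that initial_program.py carries EVOLVE-BLOCK markers; the exact marker comments used below (`# EVOLVE-BLOCK-START` and `# EVOLVE-BLOCK-END`) and both helper names are assumptions made for illustration, not the resolver's actual code.

```python
# Assumed marker syntax; the README does not spell out the exact comments.
START_MARKER = "# EVOLVE-BLOCK-START"
END_MARKER = "# EVOLVE-BLOCK-END"


def wrap_in_evolve_block(reference_source: str) -> str:
    """Surround the reference kernel source with the evolve markers."""
    body = reference_source.strip("\n")
    return f"{START_MARKER}\n{body}\n{END_MARKER}\n"


def extract_evolve_block(program: str) -> str:
    """Return the text between the markers (the region a search may rewrite)."""
    start = program.index(START_MARKER) + len(START_MARKER)
    end = program.index(END_MARKER)
    return program[start:end].strip("\n")


if __name__ == "__main__":
    reference = "import torch\n\nclass Model(torch.nn.Module):\n    pass\n"
    print(wrap_in_evolve_block(reference))
```

Only the region between the markers is meant to change during search; anything outside it would stay fixed across iterations.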

The evaluator uses the KernelBench evaluation infrastructure to measure speedup over PyTorch eager execution.

Evaluator Modes

Containerized (Docker): Runs evaluation inside a Docker container (default)
Native Python: Runs evaluation directly as Python code (for clusters without Docker/Podman)

Directory Structure

benchmarks/kernelbench/
├── config.yaml              # System prompt + search/evaluator settings
├── resolver.py              # Benchmark loader (fetches target problems from KernelBench)
├── requirements.txt         # Resolver dependencies (kernelbench library)
└── evaluator/               # Self-contained Docker benchmark
    ├── Dockerfile           # Container image definition
    ├── evaluate.sh          # Entrypoint (receives solution path)
    ├── evaluator.py         # Scoring logic using KernelBench
    ├── requirements.txt     # Evaluator dependencies (kernelbench[gpu])
    └── wrapper.py           # JSON protocol wrapper

Note: The run_and_check.py script is downloaded directly from the KernelBench repository during Docker build (pinned to commit 423217d for reproducibility). To update, modify the KERNELBENCH_COMMIT build arg in the Dockerfile.

Installation

Before using the KernelBench integration, install the required dependencies:

# Install KernelBench library (required for problem fetching)
uv pip install -r benchmarks/kernelbench/requirements.txt

Note: The resolver (problem fetching) only needs the base kernelbench package. The containerized evaluator installs kernelbench[gpu] for GPU support.

Quick Start

Using Docker (Default)

Edit benchmarks/kernelbench/config.yaml to select a target kernel from the KernelBench dataset:

benchmark:
  # KernelBench problem specification
  level: 2                    # Problem difficulty level (1, 2, 3 or 4)
  problem_id: 5               # Specific problem ID within the level

Then, run optimization on this problem:

# algo can be "adaevolve", "evox", "topk", "beam_search", "best_of_n", etc.
uv run skydiscover-run benchmarks/kernelbench/evaluator/ \
  -c benchmarks/kernelbench/config.yaml \
  --search <algo> \
  --iterations 50

Using Native Python (No Docker Required)

For clusters without Docker/Podman privileges, you can run the evaluator as native Python code.

1. Install Dependencies

# Install KernelBench with GPU support
pip install -r benchmarks/kernelbench/evaluator/requirements.txt

2. Configure Native Mode

Edit benchmarks/kernelbench/config.yaml:

benchmark:
  enabled: true
  name: kernelbench
  resolver: benchmarks.kernelbench.resolver
  
  # Set to false to use native Python evaluator (no Docker)
  use_docker: false
  
  level: 2
  problem_id: 11
  # ... rest of config

3. Run Optimization

# algo can be "adaevolve", "evox", "topk", "beam_search", "best_of_n", etc.
uv run skydiscover-run benchmarks/kernelbench/evaluator/ \
  -c benchmarks/kernelbench/config.yaml \
  --search <algo> \
  --iterations 50

Note: The run_and_check.py script from KernelBench will be automatically downloaded on first run.

Note: No initial_program argument is needed; the framework fetches it automatically based on the benchmark section in config.yaml.

Configuration Reference

Benchmark Section

The benchmark section in config.yaml controls problem loading:

benchmark:
  enabled: true                    # Enable benchmark loader
  name: kernelbench                # Benchmark name (for logging)
  resolver: benchmarks.kernelbench.resolver  # Python module path
  
  # Evaluator mode
  use_docker: true                 # true: containerized (Docker), false: native Python
  
  # Problem specification
  level: 1                         # Difficulty: 1 (easy), 2 (medium), 3 (hard), 4 (very hard)
  problem_id: 1                    # Problem ID within the level
  
  # Dataset source
  dataset_src: huggingface         # 'huggingface' or 'local'
  dataset_name: ScalingIntelligence/KernelBench  # HF dataset name
  
  # Evaluation settings
  eval_mode: local                 # 'local' or 'modal'
  gpu: H100                        # GPU type: H100, A100, etc.
  num_correct_trials: 5            # Correctness validation runs
  num_perf_trials: 100             # Performance measurement runs
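Checking these fields before launching a long run can save a wasted GPU job. The following is a minimal sketch of such a check over an already-parsed benchmark section. The function name is hypothetical, the allowed values are taken from the comments in the reference above, and the requirement that problem_id be a positive integer is an assumption.

```python
VALID_LEVELS = {1, 2, 3, 4}
VALID_DATASET_SRC = {"huggingface", "local"}
VALID_EVAL_MODES = {"local", "modal"}


def validate_benchmark_section(section: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if not section.get("enabled", False):
        errors.append("benchmark.enabled must be true")
    if section.get("level") not in VALID_LEVELS:
        errors.append("level must be one of 1, 2, 3, 4")
    problem_id = section.get("problem_id")
    # Assumption: problem IDs are positive integers.
    if not isinstance(problem_id, int) or problem_id < 1:
        errors.append("problem_id must be a positive integer")
    if section.get("dataset_src", "huggingface") not in VALID_DATASET_SRC:
        errors.append("dataset_src must be 'huggingface' or 'local'")
    if section.get("eval_mode", "local") not in VALID_EVAL_MODES:
        errors.append("eval_mode must be 'local' or 'modal'")
    for key in ("num_correct_trials", "num_perf_trials"):
        value = section.get(key)
        if value is not None and (not isinstance(value, int) or value < 1):
            errors.append(f"{key} must be a positive integer")
    return errors


if __name__ == "__main__":
    section = {"enabled": True, "level": 1, "problem_id": 1,
               "dataset_src": "huggingface", "eval_mode": "local",
               "num_correct_trials": 5, "num_perf_trials": 100}
    print(validate_benchmark_section(section))
```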

Environment Variables

The resolver provides these environment variables to the evaluator:

KERNELBENCH_LEVEL: Problem difficulty level (1, 2, 3, or 4)
KERNELBENCH_PROBLEM_ID: Specific problem within the level
KERNELBENCH_EVAL_MODE: Evaluation mode (local, modal)
KERNELBENCH_GPU: GPU type (H100, A100, etc.)
KERNELBENCH_NUM_CORRECT_TRIALS: Number of correctness validation runs
KERNELBENCH_NUM_PERF_TRIALS: Number of performance measurement runs
KERNELBENCH_TIMEOUT: Timeout per evaluation in seconds

These variables are passed directly to the evaluator (not set globally), ensuring isolation between concurrent runs.
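One common way to get this kind of isolation is to hand a per-run environment mapping to the child process instead of mutating os.environ. The sketch below illustrates that pattern only; the helper names and the 300-second timeout are hypothetical, and the resolver's real mechanism may differ.

```python
import os
import subprocess
import sys


def build_evaluator_env(benchmark: dict, timeout_s: int) -> dict:
    """Copy the parent environment and layer the per-run variables on top."""
    overrides = {
        "KERNELBENCH_LEVEL": str(benchmark["level"]),
        "KERNELBENCH_PROBLEM_ID": str(benchmark["problem_id"]),
        "KERNELBENCH_EVAL_MODE": str(benchmark["eval_mode"]),
        "KERNELBENCH_GPU": str(benchmark["gpu"]),
        "KERNELBENCH_NUM_CORRECT_TRIALS": str(benchmark["num_correct_trials"]),
        "KERNELBENCH_NUM_PERF_TRIALS": str(benchmark["num_perf_trials"]),
        "KERNELBENCH_TIMEOUT": str(timeout_s),
    }
    # os.environ itself is never modified, so concurrent runs cannot clash.
    return {**os.environ, **overrides}


def run_with_env(cmd: list, env: dict) -> subprocess.CompletedProcess:
    """Run a child process that sees only the environment passed in."""
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    benchmark = {"level": 2, "problem_id": 5, "eval_mode": "local", "gpu": "H100",
                 "num_correct_trials": 5, "num_perf_trials": 100}
    env = build_evaluator_env(benchmark, timeout_s=300)  # hypothetical timeout
    child = [sys.executable, "-c", "import os; print(os.environ['KERNELBENCH_LEVEL'])"]
    print(run_with_env(child, env).stdout.strip())
```

Because each run builds its own mapping, two runs targeting different problems can execute side by side without overwriting each other's settings.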

Evaluation Modes

local: Run evaluation on your local machine (requires GPU)
modal: Run evaluation on Modal's cloud GPUs (requires Modal setup)

GPU Types

The list of currently supported GPU types can be found here.

Metrics

The evaluator returns:

combined_score: Speedup over PyTorch eager execution (primary metric)
speedup_over_eager: Same as combined_score
speedup_over_compile: Speedup over torch.compile()
kernel_time_ms: Execution time of optimized kernel
ref_eager_time_ms: Reference eager execution time
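The relationship between these fields can be sketched as follows, assuming speedup is computed as reference time divided by kernel time. The function name and the ref_compile_time_ms input are hypothetical; the evaluator's own computation may include details not shown here.

```python
def compute_metrics(kernel_time_ms: float,
                    ref_eager_time_ms: float,
                    ref_compile_time_ms: float) -> dict:
    """Derive the reported metrics from raw timings (all in milliseconds)."""
    if kernel_time_ms <= 0:
        raise ValueError("kernel_time_ms must be positive")
    speedup_over_eager = ref_eager_time_ms / kernel_time_ms
    return {
        "combined_score": speedup_over_eager,  # primary metric
        "speedup_over_eager": speedup_over_eager,
        "speedup_over_compile": ref_compile_time_ms / kernel_time_ms,
        "kernel_time_ms": kernel_time_ms,
        "ref_eager_time_ms": ref_eager_time_ms,
    }


if __name__ == "__main__":
    # A kernel twice as fast as eager execution scores 2.0.
    print(compute_metrics(kernel_time_ms=2.0,
                          ref_eager_time_ms=4.0,
                          ref_compile_time_ms=3.0))
```

Under this definition a score above 1.0 means the optimized kernel is faster than PyTorch eager, and a score below 1.0 means it is slower.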

Traditional Usage (Manual Initial Program)

You can still provide an initial program manually if needed:

# Run with explicit initial program
uv run skydiscover-run my_kernel.py benchmarks/kernelbench/evaluator/ \
  -c benchmarks/kernelbench/config.yaml \
  --search <algo>

Troubleshooting

Error: "kernelbench package not found"

Install KernelBench:

pip install "kernelbench[gpu] @ git+https://github.com/ScalingIntelligence/KernelBench.git"

Error: "Failed to resolve benchmark problem"

Check that:

benchmark.enabled is true in config
level and problem_id are valid
KernelBench package is installed
You have internet access (for HuggingFace dataset)
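The checklist above could be automated with a small preflight script along these lines. This is a sketch under assumptions: the import name kernelbench, the host huggingface.co, and all function names are illustrative and not part of SkyDiscover.

```python
import importlib.util
import socket


def kernelbench_installed() -> bool:
    # Assumption: the package is importable under the name "kernelbench".
    return importlib.util.find_spec("kernelbench") is not None


def can_reach(host: str = "huggingface.co", port: int = 443,
              timeout_s: float = 3.0) -> bool:
    """Try to open a TCP connection to the dataset host."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def preflight(benchmark: dict, check_network: bool = True) -> list[str]:
    """Return the checklist items that currently fail."""
    problems = []
    if not benchmark.get("enabled", False):
        problems.append("benchmark.enabled is not true")
    if benchmark.get("level") not in (1, 2, 3, 4):
        problems.append("level is not one of 1, 2, 3, 4")
    if not isinstance(benchmark.get("problem_id"), int):
        problems.append("problem_id is missing or not an integer")
    if not kernelbench_installed():
        problems.append("kernelbench package is not installed")
    if check_network and not can_reach():
        problems.append("cannot reach huggingface.co")
    return problems


if __name__ == "__main__":
    for problem in preflight({"enabled": True, "level": 2, "problem_id": 5}):
        print("FAIL:", problem)
```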

Generated Files Location

The framework creates temporary files in /tmp/skydiscover_kernelbench_*/:

initial_program.py: Generated initial program
The evaluator itself runs from the existing benchmarks/kernelbench/evaluator/ directory.
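To inspect the program generated by the most recent run, a short helper like the one below can locate it. The helper name is hypothetical; only the /tmp/skydiscover_kernelbench_*/ pattern and the initial_program.py filename come from this README.

```python
import glob
import os
from typing import Optional


def latest_initial_program(tmp_root: str = "/tmp") -> Optional[str]:
    """Return the most recently modified generated initial_program.py, if any."""
    pattern = os.path.join(tmp_root, "skydiscover_kernelbench_*", "initial_program.py")
    candidates = glob.glob(pattern)
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


if __name__ == "__main__":
    path = latest_initial_program()
    print(path if path else "no generated initial_program.py found")
```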