| # GPU Mode: Float16 Vector Addition | |
| Evolve a Triton kernel for float16 vector addition using SkyDiscover. | |
| **Operation:** `C = A + B` (element-wise, float16) | |
| ## Quick Start | |
| From the repo root: | |
| ```bash | |
| uv run skydiscover-run \ | |
| benchmarks/gpu_mode/vecadd/initial_program.py \ | |
| benchmarks/gpu_mode/vecadd/evaluator.py \ | |
| -c benchmarks/gpu_mode/vecadd/config.yaml \ | |
| -s [your_algorithm] -i 50 | |
| ``` | |
| ## Scoring | |
| - **Correctness weight:** 0.3 (must return float16, rtol/atol=1e-3) | |
| - **Speedup weight:** 1.0 (geometric mean vs PyTorch reference, capped at 10x) | |
| - **Combined:** `0.3 * correctness + speedup` | |
| ## Modal Cloud GPU Support | |
| ```bash | |
| GPUMODE_USE_MODAL=true GPUMODE_MODAL_GPU=H100 \ | |
| uv run skydiscover-run \ | |
| benchmarks/gpu_mode/vecadd/initial_program.py \ | |
| benchmarks/gpu_mode/vecadd/evaluator.py \ | |
| -c benchmarks/gpu_mode/vecadd/config.yaml \ | |
| -s [your_algorithm] -i 50 | |
| ``` | |