---
license: bsd-3-clause
tags:
- openenv
- cuda
- triton
- gpu-kernels
- reinforcement-learning
- grpo
---

# kernrl Training Materials

Training resources for the kernrl GPU kernel optimization environment.

## Overview

This repository contains:

- GRPO training notebook for training LLMs to write optimized GPU kernels
- Example scripts and configurations

## Quick Start

The snippet below assumes a running kernrl environment server and that `dataset`, the reward functions, and `rollout_func` are defined as in the training notebook.

```python
from trl import GRPOConfig, GRPOTrainer
from kernrl import kernrl_env, KernelAction

# Connect to kernrl environment
env = kernrl_env(base_url="http://localhost:8000")

# Train with GRPO
trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-Coder-1.5B-Instruct",
    reward_funcs=[reward_compilation, reward_correctness, reward_speedup],
    train_dataset=dataset,
    rollout_func=rollout_func,
    args=GRPOConfig(use_vllm=True, vllm_mode="colocate"),
)
trainer.train()
```
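
The reward functions passed to the trainer above are not defined in this snippet; they live in the training notebook. As a rough illustration only, a minimal sketch of the kind of scoring they might perform is shown below. The function names match the snippet, but their signatures and scoring logic here are assumptions, not the kernrl or TRL API (in TRL, reward functions operate on batches of completions).

```python
import math

# Hypothetical per-sample scoring helpers for a GPU-kernel RL setup.
# Signatures and thresholds are illustrative assumptions.

def reward_compilation(compiled: bool) -> float:
    """Binary reward: did the generated kernel compile?"""
    return 1.0 if compiled else 0.0

def reward_correctness(max_abs_error: float, tol: float = 1e-3) -> float:
    """1.0 when the kernel's output matches the reference within `tol`."""
    return 1.0 if max_abs_error <= tol else 0.0

def reward_speedup(ref_time_ms: float, kernel_time_ms: float) -> float:
    """Log2-scaled speedup over the reference implementation, floored at 0
    so slower-than-reference kernels are not rewarded."""
    if kernel_time_ms <= 0:
        return 0.0
    return max(0.0, math.log2(ref_time_ms / kernel_time_ms))
```

Separating compilation, correctness, and speed into distinct rewards gives the policy a learning signal even before it produces fast kernels: compiling at all, then matching the reference, then getting faster.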

## Files

- `kernrl_grpo_training.ipynb` - Complete GRPO training notebook
- `train_kernrl.py` - Standalone training script

## Links

- [kernrl Environment](https://huggingface.co/spaces/Infatoshi/kernrl)
- [OpenEnv Repository](https://github.com/meta-pytorch/OpenEnv)
- [TRL Documentation](https://huggingface.co/docs/trl)