Optimize for Kaggle P100: float16, batch=1, grad_accum=8, num_gen=4, max_completion=256, lora_r=8 73f957d shank committed on Apr 25
Fix GRPOConfig: rename max_new_tokens to max_completion_length for trl==0.14.0 8b16369 shank committed on Apr 25
Stabilize Space runtime: pin ML deps and disable runtime package drift 663b8db shank committed on Apr 25
Pin torch to cu121 build + use model.device instead of hardcoded cuda string 8f291e0 shank committed on Apr 25
Replace unsloth with bitsandbytes+peft: fixes CUDA driver incompatibility on HF A100 c325ad7 shank committed on Apr 25
Reduce training to 500 steps with tightened curriculum for A10G budget ba8df98 shank committed on Apr 25
Optimize for A100 80GB: 8 generations, batch 4, lr 2e-5, dense logging 2b1fbf3 shank committed on Apr 25
Reduce training to 500 steps with tightened curriculum for A10G budget 3152fa9 shank committed on Apr 25
Fix: Final submission cleanup, unified identity and integrity markers 8807d25 shank committed on Apr 8