chore: normalize dataset inputs and fix mergekit dependency for TRL 0.14.0 e67270e shank committed on Apr 25
Optimize for Kaggle P100: float16, batch=1, grad_accum=8, num_gen=4, max_completion=256, lora_r=8 73f957d shank committed on Apr 25