Fix: add assistant_only_loss=False to prevent all labels being masked to -100 61b62c0 verified ssdataanalysis committed on 14 days ago
Optimal config: high-quality Hebrew data, constant LR, packing disabled, lora_dropout=0.1 960c757 verified ssdataanalysis committed on 14 days ago
Switch to packing=True, batch=4, grad_acc=4, step-based checkpoints for speed 522d2a2 verified ssdataanalysis committed on 14 days ago
Fix OOM: reduce max_length to 2048, disable packing, increase grad accum, remove liger 8792aab verified ssdataanalysis committed on 15 days ago
Fix typo in exclude_modules and add entrypoint script 8fd9e86 verified ssdataanalysis committed on 15 days ago
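
Because several of these commits flip the same settings back and forth (packing in particular), it can help to replay them oldest-first and see what the net configuration would be. The following is only a rough sketch: the key names (`max_length`, `use_liger_kernel`, `per_device_train_batch_size`, `gradient_accumulation_steps`, `save_strategy`, `lr_scheduler_type`) are assumptions chosen to resemble TRL/PEFT-style fields, since the commit messages do not name the library or the exact arguments. Only values stated explicitly in the messages are included; "increase grad accum" gives no number, so it is left out, and whether batch=4 and grad_acc=4 survived the later "optimal config" commit is not known from the log.

```python
# Overrides stated in each commit message, listed oldest first.
# Key names are ASSUMED (TRL/PEFT-style); the log itself only gives
# the short forms such as "batch=4", "grad_acc=4", "constant LR".
commits = [
    ("8792aab", {"max_length": 2048, "packing": False, "use_liger_kernel": False}),
    ("522d2a2", {"packing": True,
                 "per_device_train_batch_size": 4,
                 "gradient_accumulation_steps": 4,
                 "save_strategy": "steps"}),
    ("960c757", {"lr_scheduler_type": "constant",
                 "packing": False,
                 "lora_dropout": 0.1}),
    ("61b62c0", {"assistant_only_loss": False}),
]


def net_config(history):
    """Apply each commit's overrides in order; later commits win."""
    config = {}
    for _sha, overrides in history:
        config.update(overrides)
    return config


final = net_config(commits)
for key in sorted(final):
    print(f"{key} = {final[key]!r}")
```

Read this way, the log suggests packing ends up disabled again after being briefly enabled in 522d2a2, with `assistant_only_loss=False` applied last.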