Fix GradScaler resume bug: wrap scaler.load_state_dict() in try/except at line 512, which allows resuming from checkpoints saved without AMP. 9dd1da1 verified OpenTransformer committed on Jan 14
Add trainer: AR+SAT joint training, hot-config, HF auto-upload d648db7 verified OpenTransformer committed on Jan 9
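
Note on the GradScaler resume fix (9dd1da1): the trainer code itself is not shown in this log, so the following is only a minimal sketch of the pattern the commit message describes. The helper name `restore_scaler` and the checkpoint key `"scaler"` are assumptions made for illustration, not names taken from the repository. The scaler is assumed to be any object exposing `load_state_dict()`, as `torch.cuda.amp.GradScaler` does; an enabled GradScaler raises `RuntimeError` when handed an empty state dict, which is what a checkpoint saved without AMP would typically contain.

```python
from typing import Any, Mapping


def restore_scaler(scaler: Any, checkpoint: Mapping[str, Any], key: str = "scaler") -> bool:
    """Try to restore AMP scaler state from a checkpoint.

    Returns True if the state was loaded, False if the checkpoint has no
    usable scaler state (for example, it was saved without AMP).
    `key` is a hypothetical checkpoint field name.
    """
    state = checkpoint.get(key)
    if state is None:
        # Checkpoint has no scaler entry at all: keep the fresh scaler.
        return False
    try:
        scaler.load_state_dict(state)
    except (RuntimeError, KeyError, ValueError) as exc:
        # Empty or incompatible state: fall back to the scaler's defaults
        # instead of aborting the resume.
        print(f"Skipping GradScaler state: {exc}")
        return False
    return True
```

When the function returns False, training continues with the scaler's freshly initialised loss scale, which the scaler then readjusts over subsequent steps.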