fix: shift logits/labels for proper causal LM loss (was predicting current token instead of next token) 3edee97 verified maidacundo committed on Apr 22
fix: shift logits/labels for proper causal LM loss in production training script 8fed5c0 verified maidacundo committed on Apr 22
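
For illustration, the shift that both commit messages describe can be sketched in plain Python. This is a minimal sketch, not the code from the repository: the function names, the toy logits, and the `IGNORE_INDEX = -100` padding convention (borrowed from the default `ignore_index` of PyTorch's `CrossEntropyLoss`) are all assumptions. The idea is that the logits at position t are scored against the label at position t + 1, so the last logit row and the first label are dropped before computing cross-entropy.

```python
import math

# Assumed padding-label convention (hypothetical here; mirrors PyTorch's default ignore_index).
IGNORE_INDEX = -100


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def cross_entropy(logits, labels):
    """Mean negative log-likelihood, skipping labels equal to IGNORE_INDEX."""
    losses = [
        -log_softmax(row)[label]
        for row, label in zip(logits, labels)
        if label != IGNORE_INDEX
    ]
    return sum(losses) / len(losses) if losses else 0.0


def causal_lm_loss(logits, labels):
    """Shifted loss: logits at position t are scored against the label at t + 1."""
    shift_logits = logits[:-1]  # the last position has no next token to predict
    shift_labels = labels[1:]   # the first token is never a prediction target
    return cross_entropy(shift_logits, shift_labels)


def make_next_token_logits(tokens, vocab_size, peak=10.0):
    """Toy logits from an idealized next-token model: row t peaks at tokens[t + 1]."""
    rows = []
    for t in range(len(tokens)):
        row = [0.0] * vocab_size
        if t + 1 < len(tokens):
            row[tokens[t + 1]] = peak
        rows.append(row)
    return rows


tokens = [0, 1, 2, 3]
logits = make_next_token_logits(tokens, vocab_size=4)

shifted = causal_lm_loss(logits, tokens)   # correct objective: predict the next token
unshifted = cross_entropy(logits, tokens)  # the bug: scores the current token instead
```

With these toy inputs, `shifted` is close to zero while `unshifted` is large, because the unshifted loss penalizes a good next-token model for not echoing the current token.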