Commit History

Document pretraining tokenizer behavior
7a5473c
verified

sanjeevnv committed on

Append </s> in pretraining chat template file
f271e15
verified

sanjeevnv committed on

Use </s> as pretraining EOD token
cc75d08
verified

sanjeevnv committed on

Remove BOS from tokenizer
afdb489
verified

sanjeevnv committed on

Add generation markers around answer content for optional SFT loss masking
26f81d5
verified

sanjeevnv committed on

Update to simple question/answer pretraining template (all tokens trainable)
7a287ad
verified

sanjeevnv committed on

Add Nemotron-Nano tokenizer with generation markers for SFT loss masking
356b6aa
verified

sanjeevnv committed on

initial commit
64abc35
verified

sanjeevnv committed on