Add detailed base model info, pre-training datasets, and research context 925c5eb verified ronnengmail committed 3 days ago
Clarify: 20M is SFT tokens, base model pre-trained on 9.8B tokens 4dd04f7 verified ronnengmail committed 3 days ago