Nemotron-Pre-Training-Datasets Collection Large-scale pre-training datasets used in the Nemotron family of models. • 12 items • Updated 18 days ago • 146
latam-gpt/Wayra-Perplexity-Estimator-55M Text Classification • 55.4M • Updated Aug 15, 2025 • 121 • 19
facebook/wav2vec2-large-960h-lv60-self Automatic Speech Recognition • Updated May 23, 2022 • 94.3k • 161
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated Apr 6 • 272k • 2.82k