nvidia/Nemotron-Flash-3B

YongganFu committed on Nov 26, 2025

Commit 42b4137 · verified · 1 Parent(s): 3e52573

Update README.md

Files changed (1)

README.md +9 -2


@@ -1,9 +1,16 @@
 ---
 library_name: transformers
-tags: []
+license: other
+license_name: cc-by-nc-4.0
+pipeline_tag: text-generation
 ---
-# Nemotron-Flash-3B Base Model
+# Nemotron-Flash-3B-Base
+<p align="center">
+ :hugging_face: <a href="https://github.com/NVlabs/hymba">Github</a>&nbsp&nbsp | &nbsp&nbsp 📄 <a href="https://arxiv.org/abs/2411.13676">Paper</a> | &nbsp&nbsp 📜 <a href="https://developer.nvidia.com/blog/hymba-hybrid-head-architecture-boosts-small-language-model-performance/">Blog</a> &nbsp
+</p>
 Nemotron-Flash is a new hybrid SLM model family that outperforms Qwen models in accuracy (math, coding, and commonsense), batch-size-1 latency, and throughput. More details are in our NeurIPS 2025 [paper](https://drive.google.com/drive/folders/17vOGktwUfUpRAJPGJUV6oX8XwLSczZtv?usp=sharing).