Update README.md
README.md
CHANGED
@@ -1,9 +1,16 @@
 ---
 library_name: transformers
-
+license: other
+license_name: cc-by-nc-4.0
+pipeline_tag: text-generation
 ---
 
-# Nemotron-Flash-3B
+# Nemotron-Flash-3B-Base
+
+<p align="center">
+:hugging_face: <a href="https://github.com/NVlabs/hymba">Github</a>   |    📄 <a href="https://arxiv.org/abs/2411.13676">Paper</a> |    📜 <a href="https://developer.nvidia.com/blog/hymba-hybrid-head-architecture-boosts-small-language-model-performance/">Blog</a>  
+</p>
+
 
 Nemotron-Flash is a new hybrid SLM model family that outperforms Qwen models in accuracy (math, coding, and commonsense), batch-size-1 latency, and throughput. More details are in our NeurIPS 2025 [paper](https://drive.google.com/drive/folders/17vOGktwUfUpRAJPGJUV6oX8XwLSczZtv?usp=sharing).
 