Hannes von Essen committed on
Commit · 6bbb71f
1 Parent(s): 7262713
Model card: add top accuracy plot image for Edge2
README.md CHANGED
@@ -20,6 +20,8 @@ license_link: https://github.com/embedl/embedl-models/blob/main/LICENSE
 
 # Cosmos-Reason2-2B-W4A16-Edge2
 
+<img src="https://huggingface.co/datasets/embedl/documentation-images/resolve/main/Cosmos-Reason2-2B-W4A16-Edge2/bar_pot_accuracy.png" alt="Cosmos-Reason2-2B Benchmark Results" width="75%">
+
 **Optimized version of [nvidia/Cosmos-Reason2-2B](https://huggingface.co/nvidia/Cosmos-Reason2-2B) using quantization and targeted mixed-precision exclusions.**
 This release is based on the W4A16 line and adds a **mixed precision quantization** recipe resulting in **almost no accuracy drop** while preserving the **2x speedup**.
 