Johannes Truong Le committed on
Commit · e4fbe06
1 Parent(s): 7a92e20
Add Qwen3 Coder Next artifacts
- .gitattributes +2 -0
- README.md +3 -5
- conversion_config_mixed_precision.json +3 -0
- conversion_config_tFP8_calib_based.json +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+conversion_config_mixed_precision.json filter=lfs diff=lfs merge=lfs -text
+conversion_config_tFP8_calib_based.json filter=lfs diff=lfs merge=lfs -text
README.md
CHANGED
@@ -28,11 +28,9 @@ The quantized models are evaluated on 10% of the [WikiText-2 raw v1](https://hug
 
 | Model Configuration | Absolute Perplexity | Relative Perplexity Drop vs. BF16 | Details |
 |----------------------------------|---------------------|-----------------------------------|-------------------------------------------------------------|
-| BF16 |
-|
-| calibration_based_tFP16 |
-| layerwise_mixed_precision | ???? | ???? % | calibration-based mixed-precision: tFP8, outliers in tFP16 |
-| calibration_free_dynamic_tFP8 | ???? | ???? % | calibration-free tFP8 dynamic quantization |
+| BF16 | 6.351 | – | The baseline model trained in BF16 |
+| layerwise_mixed_precision | 6.365 | 0.23 % | calibration-based mixed-precision: tFP8, outliers in tFP16 |
+| calibration_based_tFP16 | 6.498 | 2.33 % | calibration-free tFP8 dynamic quantization |
 
 ## 🚀 Getting Started
 Refer to the Tensordyne Hugging Face Hub tutorial in our [hosted documentation](https://resources.tensordyne.ai/sdk/) for instructions on using the artifacts provided in this repository.
conversion_config_mixed_precision.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a6e5a7b022d44413a453780c46baf58e58ce8662991557b4ae10e51a446e5eb4
+size 60106414
conversion_config_tFP8_calib_based.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:890ca2ccf418229e80722b8ebcc586e6dc29a1884186ad76871c06300e73e6ca
+size 60103205
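The two ADDED files in this commit are Git LFS pointers, not the JSON configs themselves: each holds three `key value` lines (`version`, `oid`, `size`), and the real content is fetched from LFS storage. As a minimal sketch, assuming you only want to read those fields from the pointer text, the helper below (the name `parse_lfs_pointer` and the returned dictionary keys are illustrative, not part of any Git LFS tooling) splits each line on its first space:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file.

    Hypothetical helper for illustration; it does not download
    or verify the object the pointer refers to.
    """
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value

    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer content copied from conversion_config_mixed_precision.json above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:a6e5a7b022d44413a453780c46baf58e58ce8662991557b4ae10e51a446e5eb4
size 60106414
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size_bytes"])
```

The `size` field is the byte length of the actual file, so the pointer shows that the mixed-precision config is about 60 MB without the file being downloaded.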