Update README.md
README.md
CHANGED
@@ -38,7 +38,7 @@ This is an update over the instruction-tuned Phi-3 Mini ONNX model release. We b
 
 ## What’s New (2026-02)
 
-This update introduces an improved **INT4 GPU ONNX model** that incorporates **quantization-aware fine-tuning (QAT)** on top of the existing quantization pipeline.
+This update introduces an improved **INT4 GPU ONNX model** that incorporates **quantization-aware fine-tuning (QAT)** on top of the existing quantization pipeline.
 
 ### Benchmark Accuracy Improvements (INT4 GPU)
 
@@ -49,7 +49,7 @@ This update introduces an improved **INT4 GPU ONNX model** that incorporates **q
 | Commonsense | PIQA, Winogrande | **+0.5 to +1.0 pts** |
 | Broad Coverage | MMLU (overall) | −0.5 pts |
 
-
+The table above provides a high-level summary of observed accuracy deltas across benchmark categories compared to the old INT4 GPU model. The QAT-tuned INT4 GPU model improves performance on the majority of downstream reasoning and QA benchmarks, with a small regression on broad-coverage evaluation.
 
 ### Generation Stability (EOS Behavior)
 
@@ -194,7 +194,7 @@ Activation Aware Quantization (AWQ) works by identifying the top 1% most salient
 parinitarahi
 
 ## Contributors
-Sunghoon Choi, Yufeng Li, Kunal Vaishnavi, Akshay Sonawane, Rui Ren, Parinita Rahi
+Sunghoon Choi, Yufeng Li, Kunal Vaishnavi, Akshay Sonawane, Rui Ren, Parinita Rahi, Nenad Banfic
 
 ## License
 The model is licensed under the MIT license.