philschmid/quantized-distilbert-banking77

Text Classification


philschmid committed on Jun 8, 2022

Commit 0936cdd · 1 Parent(s): 8bc14a9

Update README.md

Files changed (1)

README.md +6 -6

README.md CHANGED

@@ -29,15 +29,15 @@ This model is a statically quantized version of [optimum/distilbert-base-uncased
 It achieves the following results on the evaluation set:
 - Vanilla model: 92.5%
-- Quantized model: 92.24%
 => The quantized model achieves 99.72% accuracy of the fp32 model
 Latency
-Payload sequence length: 128
-Instance type: AWS c6i.xlarge
-Vanilla model: P95 latency (ms) - 86.7772593483096; Average latency (ms) - 62.55 +\- 8.66;
-Quantized model: P95 latency (ms) - 27.027633551188046; Average latency (ms) - 26.17 +\- 0.66;
-Improvement through quantization: 2.39x
 ## How to use

 It achieves the following results on the evaluation set:
 - Vanilla model: 92.5%
+- Quantized model: 92.24%.
 => The quantized model achieves 99.72% accuracy of the fp32 model
 Latency
+Payload sequence length: 128
+Instance type: AWS c6i.xlarge
+Vanilla model: P95 latency (ms) - 86.7772593483096; Average latency (ms) - 62.55 +\- 8.66;
+Quantized model: P95 latency (ms) - 27.027633551188046; Average latency (ms) - 26.17 +\- 0.66;
+Improvement through quantization: 2.39x
 ## How to use
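
Review note: the derived figures in this hunk (99.72% of fp32 accuracy, 2.39x improvement) can be sanity-checked from the raw numbers the README reports. The sketch below is only arithmetic on those values. It assumes the 2.39x figure is the ratio of average latencies, since the README does not say which statistic it uses. The variable names are illustrative and not part of the model card.

```python
# Figures copied from the README diff; nothing here is measured anew.
vanilla_accuracy = 92.5       # percent, fp32 model
quantized_accuracy = 92.24    # percent, statically quantized model

vanilla_avg_ms = 62.55        # average latency, sequence length 128
quantized_avg_ms = 26.17
vanilla_p95_ms = 86.7772593483096
quantized_p95_ms = 27.027633551188046

# Share of fp32 accuracy kept after quantization.
accuracy_retained = quantized_accuracy / vanilla_accuracy

# Assumption: the README's "2.39x" is computed from average latency.
avg_speedup = vanilla_avg_ms / quantized_avg_ms

# The same ratio computed at P95, for comparison.
p95_speedup = vanilla_p95_ms / quantized_p95_ms

print(f"Accuracy retained: {accuracy_retained:.2%}")
print(f"Average-latency speedup: {avg_speedup:.2f}x")
print(f"P95-latency speedup: {p95_speedup:.2f}x")
```

Under that assumption the average-latency ratio matches the reported 2.39x, while the P95 ratio is larger (about 3.21x), so the README's headline number is the more conservative of the two.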