philschmid committed on
Commit
0936cdd
·
1 Parent(s): 8bc14a9

Update README.md

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -29,15 +29,15 @@ This model is a statically quantized version of [optimum/distilbert-base-uncased
 It achieves the following results on the evaluation set:
 
 - Vanilla model: 92.5%
-- Quantized model: 92.24%
+- Quantized model: 92.24%.
 => The quantized model achieves 99.72% accuracy of the fp32 model
 
 Latency
-Payload sequence length: 128
-Instance type: AWS c6i.xlarge
-Vanilla model: P95 latency (ms) - 86.7772593483096; Average latency (ms) - 62.55 +\- 8.66;
-Quantized model: P95 latency (ms) - 27.027633551188046; Average latency (ms) - 26.17 +\- 0.66;
-Improvement through quantization: 2.39x
+Payload sequence length: 128
+Instance type: AWS c6i.xlarge
+Vanilla model: P95 latency (ms) - 86.7772593483096; Average latency (ms) - 62.55 +\- 8.66;
+Quantized model: P95 latency (ms) - 27.027633551188046; Average latency (ms) - 26.17 +\- 0.66;
+Improvement through quantization: 2.39x
 
 ## How to use
 
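
The derived figures in the README hunk above (99.72% of fp32 accuracy, 2.39x improvement) can be sanity-checked from the raw numbers. The sketch below is only an illustration: the variable names are made up here, and it assumes the 2.39x figure is the ratio of the average latencies, which is what the arithmetic appears to match. The P95 ratio is computed as well for comparison; the README does not report it.

```python
# Figures copied from the README hunk; the variable names are illustrative only.
vanilla_accuracy = 92.5        # percent
quantized_accuracy = 92.24     # percent
vanilla_avg_ms = 62.55
quantized_avg_ms = 26.17
vanilla_p95_ms = 86.7772593483096
quantized_p95_ms = 27.027633551188046

# Share of the fp32 accuracy that the quantized model retains.
accuracy_retained = quantized_accuracy / vanilla_accuracy * 100

# Assumption: "Improvement through quantization" is the average-latency ratio.
avg_speedup = vanilla_avg_ms / quantized_avg_ms

# Not stated in the README; shown here only for comparison.
p95_speedup = vanilla_p95_ms / quantized_p95_ms

print(f"accuracy retained: {accuracy_retained:.2f}%")
print(f"average-latency speedup: {avg_speedup:.2f}x")
print(f"P95-latency speedup: {p95_speedup:.2f}x")
```

Under that assumption, the first two printed values should agree with the 99.72% and 2.39x stated in the README.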