philschmid committed on
Commit c3dbcf9 · 1 Parent(s): bb6b264

Update README.md

Files changed (1)
  1. README.md +5 -3
README.md CHANGED
@@ -33,9 +33,11 @@ It achieves the following results on the evaluation set:
  Latency
  Payload sequence length: 128
  Instance type: AWS c6i.xlarge
- Vanilla model: P95 latency (ms) - 86.7772593483096; Average latency (ms) - 62.55 +\- 8.66;
- Quantized model: P95 latency (ms) - 27.027633551188046; Average latency (ms) - 26.17 +\- 0.66;
- Improvement through quantization: 2.39x
+
+ | latency | vanilla transformers | quantized optimum model | improvement |
+ |---------|----------------------|-------------------------|-------------|
+ | p95 | 86.77ms | 27.03ms | 3.21x |
+ | avg | 62.55ms | 26.17ms | 2.39x |
  
  ## How to use
  
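
Review note: the improvement column in the new table appears to be the vanilla latency divided by the quantized latency. A quick sanity check of that arithmetic, using the rounded values from the table, might look like the sketch below. The `speedup` helper and the `latencies_ms` dictionary are hypothetical names for illustration and are not part of the README.

```python
# Latencies in milliseconds, copied from the table added in this commit.
# Each entry is (vanilla transformers, quantized optimum model).
latencies_ms = {
    "p95": (86.77, 27.03),
    "avg": (62.55, 26.17),
}


def speedup(vanilla_ms: float, quantized_ms: float) -> float:
    """Return how many times faster the quantized model is (assumed definition)."""
    return vanilla_ms / quantized_ms


for name, (vanilla, quantized) in latencies_ms.items():
    print(f"{name}: {speedup(vanilla, quantized):.2f}x")
```

Under that assumed definition, both ratios agree with the table to two decimal places. The removed line reported only the average-latency figure (2.39x), while the table adds the p95 ratio.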