inferencerlabs committed
Commit a59b138 · verified · 1 Parent(s): 48ac534

Upload model file

Files changed (1): README.md (+1 -1)
README.md CHANGED
@@ -12,7 +12,7 @@ pipeline_tag: text-generation
 
 #### Tested on a M3 Ultra 512GB RAM using [Inferencer app v1.10](https://inferencer.com)
 - Single inference ~36.5 tokens/s @ 1000 tokens
-- Batched inference ~ total tokens/s across six inferences
+- Batched inference ~44 total tokens/s across two inferences
 - Memory usage: ~239 GiB
 
 *q9bit quant typically achieves near lossless accuracy in our coding test*
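For context on the updated benchmark line, the batched figure can be broken down into per-stream throughput and compared against the single-inference rate. A minimal sketch, using only the numbers stated in the diff (the variable names are illustrative, not from the README):

```python
# Figures taken from the README diff above
single_tps = 36.5        # single inference, tokens/s @ 1000 tokens
batched_total_tps = 44.0 # batched inference, total tokens/s
num_streams = 2          # concurrent inferences in the batched run

# Each concurrent stream gets roughly half the aggregate throughput
per_stream_tps = batched_total_tps / num_streams

# Aggregate gain from batching relative to a single stream
aggregate_speedup = batched_total_tps / single_tps

print(f"per-stream: {per_stream_tps:.1f} tok/s")      # 22.0 tok/s
print(f"aggregate speedup: {aggregate_speedup:.2f}x") # ~1.21x
```

So while each of the two streams runs slower than a lone inference, total throughput improves by roughly 21%.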