almaghrabima committed on
Commit c94c9a8 · verified · 1 Parent(s): 4e2b74e

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -86,10 +86,10 @@ Comparison with state-of-the-art tokenizers on 60,000 samples (30k Arabic + 30k
 
  | Tokenizer | Vocab | AR Fert | EN Fert | Avg Fert | AR C/T | EN C/T | Parity |
  |-----------|-------|---------|---------|----------|--------|--------|--------|
- | **SARFTokenizer** | 64,641 | **1.72** | 1.57 | **1.64** | 3.45 | 2.99 | 1.156 |
+ | **SARFTokenizer** | 64,641 | **1.72** | 1.57 | **1.64** | 3.45 | 2.99 | **1.156** |
  | ALLaM-7B | 64,000 | 1.82 | 1.48 | 1.65 | 3.08 | 2.65 | 1.163 |
  | Gemma-3-4B | 262,145 | 2.78 | 1.33 | 2.05 | 2.42 | 3.00 | 0.805 |
- | Falcon-H1-7B | 130,049 | 2.65 | 1.55 | 2.10 | 2.55 | 2.75 | **0.926** |
+ | Falcon-H1-7B | 130,049 | 2.65 | 1.55 | 2.10 | 2.55 | 2.75 | 0.926 |
  | Fanar-1-9B | 128,256 | 2.85 | 1.36 | 2.11 | 2.27 | 2.93 | 0.775 |
  | Hala-9B | 128,256 | 2.85 | 1.36 | 2.11 | 2.27 | 2.93 | 0.775 |
  | GPT-4o | 200,019 | 2.81 | 1.44 | 2.12 | 2.45 | 3.37 | 0.726 |
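
The hunk shown here does not define the derived columns. As a minimal sketch, assuming `Avg Fert` is the arithmetic mean of the Arabic and English fertility and `Parity` is the ratio of `AR C/T` to `EN C/T`, the listed values can be reproduced within rounding. Both definitions are assumptions inferred from the numbers, and the function names are hypothetical; the README's own definitions may differ.

```python
# Rows copied from the table in the diff:
# (tokenizer, AR fert, EN fert, reported avg fert, AR C/T, EN C/T, reported parity)
ROWS = [
    ("SARFTokenizer", 1.72, 1.57, 1.64, 3.45, 2.99, 1.156),
    ("ALLaM-7B",      1.82, 1.48, 1.65, 3.08, 2.65, 1.163),
    ("Gemma-3-4B",    2.78, 1.33, 2.05, 2.42, 3.00, 0.805),
    ("Falcon-H1-7B",  2.65, 1.55, 2.10, 2.55, 2.75, 0.926),
    ("Fanar-1-9B",    2.85, 1.36, 2.11, 2.27, 2.93, 0.775),
    ("Hala-9B",       2.85, 1.36, 2.11, 2.27, 2.93, 0.775),
    ("GPT-4o",        2.81, 1.44, 2.12, 2.45, 3.37, 0.726),
]


def avg_fertility(ar_fert: float, en_fert: float) -> float:
    """Assumed definition: mean of the two per-language fertilities."""
    return (ar_fert + en_fert) / 2


def ct_ratio(ar_ct: float, en_ct: float) -> float:
    """Assumed definition of Parity: Arabic chars/token over English chars/token."""
    return ar_ct / en_ct


def check_rows(rows, tol=0.01):
    """Return (name, avg_ok, parity_ok) for each row.

    The table values are rounded, so recomputing from rounded inputs
    can only match up to a small tolerance.
    """
    results = []
    for name, ar_f, en_f, avg_rep, ar_ct, en_ct, parity_rep in rows:
        avg_ok = abs(avg_fertility(ar_f, en_f) - avg_rep) <= tol
        parity_ok = abs(ct_ratio(ar_ct, en_ct) - parity_rep) <= tol
        results.append((name, avg_ok, parity_ok))
    return results


if __name__ == "__main__":
    for name, avg_ok, parity_ok in check_rows(ROWS):
        print(f"{name:15s} avg_fert_matches={avg_ok} parity_matches={parity_ok}")
```

Under these assumed definitions, a ratio near 1.0 would mean Arabic and English text are compressed about equally per token.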