Update README.md
README.md CHANGED
@@ -164,15 +164,17 @@ Values represent **relative performance drop**, computed as `(Acc_clean − Acc_
 | **Avg** | 0.31 | 0.44 | 0.11 | 0.08 | 0.24 | **0.04** | 0.15 | 0.21 | 0.22 | 0.28 | **0.53** | 0.24 |
 
 
+---
+
 ## Intended Use
 
 This model is intended for:
-- research on tokenization
-
+- research on tokenization and robustness,
+- multilingual NLP analysis,
 - controlled ablation studies,
-- benchmarking
+- benchmarking tokenizer behavior under noise.
 
-It is **not instruction-tuned
+It is **not** instruction-tuned, aligned, or optimized for deployment.
 
 ---
 
@@ -180,8 +182,8 @@ It is **not instruction-tuned** and **not aligned** for deployment or interactiv
 
 - Trained on a limited set of five languages.
 - Not optimized for instruction following or dialogue.
-
-- Intended strictly for research
+- Fixed token budget constrains exposure to raw text depending on tokenization efficiency.
+- Intended strictly for research purposes.
 
 ---
 
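The token-budget limitation added in the second hunk can be made concrete with a small sketch. It assumes, purely for illustration, that raw-text exposure is approximated as the token budget multiplied by a tokenizer's average bytes per token. The function name, the tokenizer names, and the compression rates below are hypothetical and do not come from the README.

```python
def raw_bytes_seen(token_budget: int, bytes_per_token: float) -> float:
    """Approximate the raw text (in bytes) covered by a fixed token budget.

    Assumption: exposure is roughly token_budget * average bytes per token.
    """
    if token_budget < 0:
        raise ValueError("token_budget must be non-negative")
    if bytes_per_token <= 0:
        raise ValueError("bytes_per_token must be positive")
    return token_budget * bytes_per_token


# Hypothetical compression rates for two tokenizers (illustrative only).
budget = 1_000_000
rates = {"tokenizer_a": 4.0, "tokenizer_b": 2.5}

# With the same budget, the less efficient tokenizer covers less raw text.
exposure = {name: raw_bytes_seen(budget, rate) for name, rate in rates.items()}
```

Under these assumed rates, `tokenizer_b` would see 62.5% of the raw text that `tokenizer_a` sees for the same budget, which is the kind of confound the limitation describes.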