toksuite/gpt2


Malikeh1375 committed on Dec 22, 2025

Commit 6aa028a · verified · 1 parent: b1434a7

Update README.md

Files changed (1)

README.md +8 -6


@@ -164,15 +164,17 @@ Values represent **relative performance drop**, computed as `(Acc_clean − Acc_
 | **Avg** | 0.31 | 0.44 | 0.11 | 0.08 | 0.24 | **0.04** | 0.15 | 0.21 | 0.22 | 0.28 | **0.53** | 0.24 |
 ## Intended Use
 This model is intended for:
-- research on tokenization behavior,
-- robustness and perturbation analysis,
 - controlled ablation studies,
-- benchmarking subword tokenizers.
-It is **not instruction-tuned** and **not aligned** for deployment or interactive use.
 ---
@@ -180,8 +182,8 @@ It is **not instruction-tuned** and **not aligned** for deployment or interactiv
 - Trained on a limited set of five languages.
 - Not optimized for instruction following or dialogue.
-- Performance reflects tokenizer behavior rather than downstream task tuning.
-- Intended strictly for research use.
 ---

 | **Avg** | 0.31 | 0.44 | 0.11 | 0.08 | 0.24 | **0.04** | 0.15 | 0.21 | 0.22 | 0.28 | **0.53** | 0.24 |
+---
 ## Intended Use
 This model is intended for:
+- research on tokenization and robustness,
+- multilingual NLP analysis,
 - controlled ablation studies,
+- benchmarking tokenizer behavior under noise.
+It is **not** instruction-tuned, aligned, or optimized for deployment.
 ---
 - Trained on a limited set of five languages.
 - Not optimized for instruction following or dialogue.
+- Fixed token budget constrains exposure to raw text depending on tokenization efficiency.
+- Intended strictly for research purposes.
 ---
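
The new limitation about the fixed token budget can be illustrated with simple arithmetic. The sketch below is not taken from the model card: the function name `raw_text_exposure`, the budget of one million tokens, and the bytes-per-token figures are all hypothetical values chosen for illustration. It only shows why, under the same token budget, a tokenizer that packs more bytes into each token exposes the model to more raw text.

```python
def raw_text_exposure(token_budget: int, bytes_per_token: float) -> float:
    """Approximate bytes of raw text covered by a fixed token budget.

    Hypothetical helper for illustration; not part of the model card.
    """
    if token_budget < 0 or bytes_per_token <= 0:
        raise ValueError("token_budget must be >= 0 and bytes_per_token > 0")
    return token_budget * bytes_per_token


# Hypothetical numbers: the same budget, two tokenization efficiencies.
budget = 1_000_000
efficient = raw_text_exposure(budget, bytes_per_token=4.0)
inefficient = raw_text_exposure(budget, bytes_per_token=2.0)

# Ratio of raw text seen by the two hypothetical tokenizers.
print(efficient / inefficient)
```

Under these assumed figures, the less efficient tokenizer covers half as much raw text for the same number of training tokens, which is the kind of confound the limitation describes.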