Update README.md

```python
outputs = model.generate(**inputs, max_new_tokens=128)
print(AutoTokenizer.from_pretrained(model_name).decode(outputs[0], skip_special_tokens=True))
```

## Model Description

A Kazakh text generation model fine-tuned from `google/mt5-base` on the `Darmm/darmm-text-generation-kazakh` dataset.

## Training (summary)

- Base model: `google/mt5-base`
- Epochs: 3
- Batch size: 2
- Learning rate: 1e-4
- Max input length: 256 tokens
- Max target length: 256 tokens

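Under the `transformers` Trainer API, the hyperparameters above would correspond to a configuration roughly like the following. This is a sketch, not the repo's actual training script; the output directory name is a placeholder:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the training configuration summarized above.
# "mt5-kazakh-output" is a hypothetical directory, not the repo's real path.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-kazakh-output",
    num_train_epochs=3,
    per_device_train_batch_size=2,
    learning_rate=1e-4,
    predict_with_generate=True,  # generate during eval so text metrics can be computed
)
```

The 256-token input and target limits would be applied at tokenization time (e.g. `max_length=256, truncation=True`), not through `Seq2SeqTrainingArguments`.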
## Metrics (summary)

```json
{
  "eval_loss": 0.08725570142269135,
  "eval_exact_match": 0.5547445255474452,
  "eval_rouge1": 0.10948905109489052,
  "eval_rouge2": 0.10583941605839416,
  "eval_rougeL": 0.10948905109489052,
  "epoch": 3.0
}
```

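For reference, `eval_exact_match` is the fraction of generated outputs that match the reference text exactly (about 55% here). A minimal sketch of such a metric, using a hypothetical helper rather than the repo's actual evaluation code:

```python
def exact_match(predictions, references):
    """Fraction of predictions equal to their reference after whitespace stripping."""
    assert len(predictions) == len(references)
    matches = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return matches / len(references)

# Example: one of the two predictions matches its reference exactly.
print(exact_match(["Сәлем!", "қате"], ["Сәлем!", "дұрыс"]))  # 0.5
```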
## Intended use

- Instruction-style Kazakh text generation for short responses.
- Educational and informational content-generation prototypes.

## Limitations

- Limited dataset size may reduce generalization to unseen domains.
- Outputs may be generic for short prompts.

## Paper & Documentation

<details>