R3iwan committed
Commit 6a3068f · verified · 1 Parent(s): d890ea4

Update README.md

Files changed (1)
  1. README.md +36 -0
README.md CHANGED
@@ -43,6 +43,42 @@ outputs = model.generate(**inputs, max_new_tokens=128)
  print(AutoTokenizer.from_pretrained(model_name).decode(outputs[0], skip_special_tokens=True))
  ```
 
+ ## Model Description
+
+ Kazakh text generation model fine-tuned from `google/mt5-base` on the `Darmm/darmm-text-generation-kazakh` dataset.
+
+ ## Training (summary)
+
+ - Base model: `google/mt5-base`
+ - Epochs: 3
+ - Batch size: 2
+ - Learning rate: 1e-4
+ - Max input length: 256
+ - Max target length: 256
+
+ ## Metrics (summary)
60
+
61
+ ```
62
+ {
63
+ "eval_loss": 0.08725570142269135,
64
+ "eval_exact_match": 0.5547445255474452,
65
+ "eval_rouge1": 0.10948905109489052,
66
+ "eval_rouge2": 0.10583941605839416,
67
+ "eval_rougeL": 0.10948905109489052,
68
+ "epoch": 3.0
69
+ }
70
+ ```
71
+
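For reference, exact match and a ROUGE-1-style unigram F1 can be computed per example as in the sketch below. It uses plain whitespace tokenization and hypothetical prediction/reference pairs; the README does not say how the reported metrics were tokenized or computed. Under any tokenizer that keeps the words of the text, an exact match also scores a unigram F1 of 1.0, so a mean ROUGE-1 below the exact-match rate, as reported here, may indicate that the ROUGE tokenizer discarded most of the Kazakh text. The source does not confirm this.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the stripped strings are identical, else 0.0."""
    return float(prediction.strip() == reference.strip())


def unigram_f1(prediction: str, reference: str) -> float:
    """ROUGE-1-style F1 over whitespace tokens (an assumed tokenization)."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical pairs, for illustration only (not taken from the dataset).
pairs = [
    ("Алматы", "Алматы"),
    ("Қазақстан астанасы Астана", "Қазақстанның астанасы Астана"),
]
em = sum(exact_match(p, r) for p, r in pairs) / len(pairs)
f1 = sum(unigram_f1(p, r) for p, r in pairs) / len(pairs)
print(f"exact_match={em:.3f} unigram_f1={f1:.3f}")
```
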
+ ## Intended use
+
+ - Instruction-style Kazakh text generation for short responses.
+ - Prototypes that generate educational and informational content.
+
+ ## Limitations
+
+ - Limited dataset size may reduce generalization to unseen domains.
+ - Outputs may be generic for short prompts.
+
  ## Paper & Documentation
 
  <details>