Koushim committed on
Commit
01901ca
·
verified ·
1 Parent(s): 7ecaf24

Update README.md

  1. README.md +121 -29
README.md CHANGED
@@ -1,50 +1,123 @@
1
  ---
2
- library_name: transformers
3
  base_model: facebook/mbart-large-50-many-to-many-mmt
4
  tags:
5
- - generated_from_trainer
6
  metrics:
7
- - bleu
8
  model-index:
9
- - name: mbart50-en-te-hackhedron
10
- results: []
11
  ---
 
12
 
13
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
14
- should probably proofread and complete it, then remove this comment. -->
15
 
16
- # mbart50-en-te-hackhedron
17
 
18
- This model is a fine-tuned version of [facebook/mbart-large-50-many-to-many-mmt](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt) on an unknown dataset.
19
- It achieves the following results on the evaluation set:
20
- - Loss: 0.0511
21
- - Bleu: 66.9240
 
 
22
 
23
- ## Model description
24
 
25
- More information needed
26
 
27
- ## Intended uses & limitations
28
 
29
- More information needed
 
 
30
 
31
- ## Training and evaluation data
32
 
33
- More information needed
34
 
35
- ## Training procedure
36
 
37
- ### Training hyperparameters
38
 
39
- The following hyperparameters were used during training:
40
- - learning_rate: 2e-05
41
- - train_batch_size: 8
42
- - eval_batch_size: 8
43
- - seed: 42
44
- - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
45
- - lr_scheduler_type: linear
46
- - num_epochs: 1
47
- - mixed_precision_training: Native AMP
48
 
49
  ### Training results
50
 
@@ -52,6 +125,7 @@ The following hyperparameters were used during training:
52
  |:-------------:|:-----:|:-----:|:---------------:|:-------:|
53
  | 0.0455 | 1.0 | 48808 | 0.0511 | 66.9240 |
54
 
 
55
 
56
  ### Framework versions
57
 
@@ -59,3 +133,21 @@ The following hyperparameters were used during training:
59
  - Pytorch 2.6.0+cu124
60
  - Datasets 3.6.0
61
  - Tokenizers 0.21.1
1
  ---
 
2
  base_model: facebook/mbart-large-50-many-to-many-mmt
3
  tags:
4
+ - translation
5
+ - mbart50
6
+ - english
7
+ - telugu
8
+ - hackhedron
9
+ - neural-machine-translation
10
+ - huggingface
11
+ license: apache-2.0
12
+ datasets:
13
+ - hackhedron
14
  metrics:
15
+ - sacrebleu
16
  model-index:
17
+ - name: mbart50-en-te-hackhedron
18
+ language:
19
+ - en
20
+ - te
21
+ results:
22
+ - task:
23
+ name: Translation
24
+ type: translation
25
+ dataset:
26
+ name: HackHedron English-Telugu Parallel Corpus
27
+ type: hackhedron
28
+ args: en-te
29
+ metrics:
30
+ - name: SacreBLEU
31
+ type: sacrebleu
32
+ value: 66.9240
33
  ---
34
+ # 🌐 mBART50 English ↔ Telugu | HackHedron Dataset
35
 
36
+ This model is fine-tuned from [facebook/mbart-large-50-many-to-many-mmt](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt) on the [HackHedron English-Telugu Parallel Corpus](https://huggingface.co/datasets). It supports translation between **English and Telugu** in both directions.
 
37
 
38
+ ## 🧠 Model Architecture
39
 
40
+ - **Base model**: mBART50 (Multilingual BART with 50 languages)
41
+ - **Type**: Seq2Seq Transformer
42
+ - **Tokenizer**: MBart50TokenizerFast
43
+ - **Languages Used**:
44
+ - `en_XX` for English
45
+ - `te_IN` for Telugu
46
 
47
+ ---
48
+
49
+ ## 📚 Dataset
50
+
51
+ **HackHedron English-Telugu Parallel Corpus**
52
+ - ~390,000 training sentence pairs
53
+ - ~43,000 validation pairs
54
+ - Format:
55
+ ```json
+ {
+   "english": "Tom started his car and drove away.",
+   "telugu": "టామ్ తన కారును స్టార్ట్ చేసి దూరంగా నడిపాడు."
+ }
+ ```
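
A minimal sketch of reading records in the format shown above. It assumes, which the card does not state, that the corpus is stored as JSON Lines with one object per line; the helper name `read_pairs` is hypothetical.

```python
import json
from typing import Iterable, Iterator, Tuple


def read_pairs(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (english, telugu) pairs from JSON Lines text.

    Assumption: one JSON object per line with the keys "english" and
    "telugu", as in the sample record in this card.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        english = record["english"].strip()
        telugu = record["telugu"].strip()
        if english and telugu:  # drop pairs with an empty side
            yield english, telugu


sample = (
    '{"english": "Tom started his car and drove away.", '
    '"telugu": "టామ్ తన కారును స్టార్ట్ చేసి దూరంగా నడిపాడు."}'
)
pairs = list(read_pairs([sample, ""]))
print(pairs[0][0])
```

In practice the argument would be an open file handle rather than a list of strings.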
61
+
62
+ ---
63
+
64
+ ## 📈 Evaluation
65
+
66
+ | Metric | Score | Loss |
67
+ | --------- | ------ | ------- |
68
+ | SacreBLEU | 66.924 | 0.0511 |
69
+
70
+ > 🧪 Evaluation was run on the validation set using the Hugging Face `evaluate` library.
71
+ >
72
+ ---
73
+
74
+ ## 💻 How to Use
75
+
76
+ ```python
+ from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
+
+ # Load the fine-tuned checkpoint and its tokenizer from the Hub
+ model = MBartForConditionalGeneration.from_pretrained("koushik-reddy/mbart50-en-te-hackhedron")
+ tokenizer = MBart50TokenizerFast.from_pretrained("koushik-reddy/mbart50-en-te-hackhedron")
+
+ # Set source and target language codes
+ tokenizer.src_lang = "en_XX"
+ tokenizer.tgt_lang = "te_IN"
+
+ text = "How are you?"
+ inputs = tokenizer(text, return_tensors="pt")
+
+ # Force the first generated token to be the Telugu language code
+ generated_tokens = model.generate(**inputs, forced_bos_token_id=tokenizer.lang_code_to_id["te_IN"])
+ translated = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
+ print(translated[0])
+ ```
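
Since the card describes the model as working in both directions, the snippet above can be wrapped in a small helper that takes the language codes as parameters. This is a sketch, not part of the original commit: the function name `translate` and the `max_new_tokens` default are assumptions, and it looks up the language token with `convert_tokens_to_ids` instead of `lang_code_to_id`.

```python
def translate(text, model, tokenizer, src_lang="en_XX", tgt_lang="te_IN",
              max_new_tokens=128):
    """Translate one string with an mBART-50 style model and tokenizer.

    Swap src_lang and tgt_lang to translate from Telugu into English.
    """
    # The tokenizer uses src_lang to tag the encoded input.
    tokenizer.src_lang = src_lang
    inputs = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **inputs,
        # Force the first generated token to be the target language code.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

With `model` and `tokenizer` loaded as shown earlier, `translate("How are you?", model, tokenizer)` targets Telugu, and passing `src_lang="te_IN", tgt_lang="en_XX"` targets English. The card reports a single score without stating the direction, so Telugu-to-English quality should be checked separately.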
92
+
93
+ ---
94
 
95
+ ## 📦 How to Fine-Tune Further
96
 
97
+ Use the `Seq2SeqTrainer` from Hugging Face:
98
 
99
+ ```python
+ from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments
+ ```
102
 
103
+ During generation, set `forced_bos_token_id=tokenizer.lang_code_to_id["te_IN"]` so that decoding starts with the Telugu language token; use `"en_XX"` instead when translating into English.
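
As a starting point for further fine-tuning, the hyperparameters listed in this card's Training Details section can be collected into arguments for `Seq2SeqTrainingArguments`. This is a sketch under assumptions: the output directory is a hypothetical path, the batch size of 8 is treated as per-device (single device), and `predict_with_generate` is an added choice not mentioned in the card.

```python
# Values taken from the card's hyperparameter list, except where noted.
TRAINING_KWARGS = {
    "output_dir": "mbart50-en-te-finetuned",  # hypothetical path
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,  # assumes a single device
    "per_device_eval_batch_size": 8,
    "num_train_epochs": 1,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "fp16": True,  # mixed precision; needs a CUDA GPU
    "predict_with_generate": True,  # assumption: generate text at eval time
}


def make_training_args(**overrides):
    """Build Seq2SeqTrainingArguments, importing transformers lazily."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**{**TRAINING_KWARGS, **overrides})
```

The result of `make_training_args()` would be passed as `args=` to `Seq2SeqTrainer` together with the model, tokenized datasets, and a data collator. On a machine without a GPU, `make_training_args(fp16=False)` avoids the mixed-precision requirement.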
104
 
105
+ ---
106
 
107
+ ## 🛠️ Training Details
108
 
109
+ * Optimizer: AdamW
110
+ * Learning Rate: 2e-05
111
+ * Epochs: 1
112
+ * Train Batch Size: 8
113
+ * Eval Batch Size: 8
114
+ * Seed: 42
115
+ * Truncation Length: 128 tokens
116
+ * Framework: 🤗 Transformers + Datasets
117
+ * Scheduler: Linear
118
+ * Mixed Precision: Enabled (fp16)
119
 
120
+ ---
121
 
122
  ### Training results
123
 
 
125
  |:-------------:|:-----:|:-----:|:---------------:|:-------:|
126
  | 0.0455 | 1.0 | 48808 | 0.0511 | 66.9240 |
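
The row above can be cross-checked against the dataset size stated in the card. The sketch below reads 48808 as the number of optimizer steps in the single epoch and assumes one device with no gradient accumulation, neither of which the card confirms.

```python
import math

steps_per_epoch = 48808  # step value in the results row at epoch 1.0
train_batch_size = 8     # from the hyperparameter list

# Under the assumptions above, steps = ceil(num_examples / batch_size),
# so the number of training examples must fall in this range.
max_examples = steps_per_epoch * train_batch_size
min_examples = (steps_per_epoch - 1) * train_batch_size + 1

print(min_examples, max_examples)  # 390457 390464
assert math.ceil(min_examples / train_batch_size) == steps_per_epoch
assert math.ceil(max_examples / train_batch_size) == steps_per_epoch
```

That range agrees with the "~390,000 training sentence pairs" figure given in the Dataset section.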
127
 
128
+ ---
129
 
130
  ### Framework versions
131
 
 
133
  - Pytorch 2.6.0+cu124
134
  - Datasets 3.6.0
135
  - Tokenizers 0.21.1
136
+
137
+ ---
138
+
139
+ ## 🏷️ License
140
+
141
+ This model is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
142
+
143
+ ---
144
+
145
+ ## 🤝 Acknowledgements
146
+
147
+ * 🤗 Hugging Face Transformers
148
+ * Facebook AI for mBART50
149
+ * HackHedron Parallel Corpus Contributors
150
+
151
+ ---
152
+
153
+ > Created by **Koushik Reddy** – [Hugging Face Profile](https://huggingface.co/Koushim)