Koushim/mbart50-en-te-hackhedron

@@ -1,50 +1,123 @@
 ---
-library_name: transformers
 base_model: facebook/mbart-large-50-many-to-many-mmt
 tags:
-- generated_from_trainer
 metrics:
-- bleu
 model-index:
-- name: mbart50-en-te-hackhedron
-  results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# mbart50-en-te-hackhedron
-This model is a fine-tuned version of [facebook/mbart-large-50-many-to-many-mmt](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.0511
-- Bleu: 66.9240
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 2e-05
-- train_batch_size: 8
-- eval_batch_size: 8
-- seed: 42
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- num_epochs: 1
-- mixed_precision_training: Native AMP
 ### Training results
@@ -52,6 +125,7 @@ The following hyperparameters were used during training:
 |:-------------:|:-----:|:-----:|:---------------:|:-------:|
 | 0.0455        | 1.0   | 48808 | 0.0511          | 66.9240 |
 ### Framework versions
@@ -59,3 +133,21 @@ The following hyperparameters were used during training:
 - Pytorch 2.6.0+cu124
 - Datasets 3.6.0
 - Tokenizers 0.21.1

 ---
 base_model: facebook/mbart-large-50-many-to-many-mmt
 tags:
+  - translation
+  - mbart50
+  - english
+  - telugu
+  - hackhedron
+  - neural-machine-translation
+  - huggingface
+license: apache-2.0
+datasets:
+  - hackhedron
 metrics:
+  - sacrebleu
 model-index:
+  - name: mbart50-en-te-hackhedron
+    language:
+      - en
+      - te
+    results:
+      - task:
+          name: Translation
+          type: translation
+        dataset:
+          name: HackHedron English-Telugu Parallel Corpus
+          type: hackhedron
+          args: en-te
+        metrics:
+          - name: SacreBLEU
+            type: sacrebleu
+            value: 66.9240
 ---
+# 🌐 mBART50 English ↔ Telugu | HackHedron Dataset
+This model is fine-tuned from [facebook/mbart-large-50-many-to-many-mmt](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt) on the [HackHedron English-Telugu Parallel Corpus](https://huggingface.co/datasets). It supports translation in both directions between **English and Telugu**.
+## 🧠 Model Architecture
+- **Base model**: mBART50 (Multilingual BART with 50 languages)
+- **Type**: Seq2Seq Transformer
+- **Tokenizer**: MBart50TokenizerFast
+- **Languages Used**:
+  - `en_XX` for English
+  - `te_IN` for Telugu
+---
+## 📚 Dataset
+**HackHedron English-Telugu Parallel Corpus**
+- ~390,000 training sentence pairs
+- ~43,000 validation pairs
+- Format:
+```json
+{
+  "english": "Tom started his car and drove away.",
+  "telugu": "టామ్ తన కారును స్టార్ట్ చేసి దూరంగా నడిపాడు."
+}
+```
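A record in this format could be turned into model inputs roughly as follows. This is a minimal sketch, not the author's preprocessing script: the function name `preprocess` and its arguments are hypothetical, the `english`/`telugu` keys come from the sample record above, and the 128-token limit is the truncation length listed under Training Details. It assumes `tokenizer` is an `MBart50TokenizerFast` with `src_lang="en_XX"` and `tgt_lang="te_IN"` already set.

```python
def preprocess(batch, tokenizer, max_length=128):
    """Tokenize a batch of {"english": [...], "telugu": [...]} records.

    Hypothetical helper: `tokenizer` is assumed to be a Hugging Face
    tokenizer whose __call__ accepts `text_target` for the labels.
    """
    return tokenizer(
        batch["english"],              # source sentences
        text_target=batch["telugu"],   # target sentences, returned as "labels"
        max_length=max_length,
        truncation=True,
    )
```

With 🤗 Datasets, a function like this would typically be applied with `dataset.map(preprocess, batched=True, fn_kwargs={"tokenizer": tokenizer})`.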
+---
+## 📈 Evaluation
+| Metric    | Score  |  Loss   |
+| --------- | ------ | ------- |
+| SacreBLEU | 66.924 |  0.0511 |
+> 🧪 Evaluated on the validation set with the Hugging Face `evaluate` library.
+---
+## 💻 How to Use
+```python
+from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
+# Load the fine-tuned model and its tokenizer from the Hub
+model = MBartForConditionalGeneration.from_pretrained("Koushim/mbart50-en-te-hackhedron")
+tokenizer = MBart50TokenizerFast.from_pretrained("Koushim/mbart50-en-te-hackhedron")
+# Set source and target language codes
+tokenizer.src_lang = "en_XX"
+tokenizer.tgt_lang = "te_IN"
+text = "How are you?"
+inputs = tokenizer(text, return_tensors="pt")
+# Force the first generated token to be the Telugu language code
+generated_tokens = model.generate(**inputs, forced_bos_token_id=tokenizer.lang_code_to_id["te_IN"])
+translated = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
+print(translated[0])
+```
+---
+## 📦 How to Fine-Tune Further
+Use the `Seq2SeqTrainer` from Hugging Face:
+```python
+from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments
+```
+When generating, set `forced_bos_token_id=tokenizer.lang_code_to_id["te_IN"]` so that decoding starts with the Telugu language token.
+---
+## 🛠️ Training Details
+* Optimizer: AdamW
+* Learning Rate: 2e-05
+* Epochs: 1
+* Train Batch Size: 8
+* Eval Batch Size: 8
+* Seed: 42
+* Truncation Length: 128 tokens
+* Framework: 🤗 Transformers + Datasets
+* Scheduler: Linear
+* Mixed Precision: Enabled (fp16)
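The settings listed above map onto `Seq2SeqTrainingArguments` roughly as shown below. This is a sketch under stated assumptions rather than the author's training script: `output_dir` and the helper name `build_training_args` are hypothetical, `predict_with_generate` is an assumption (it is a common choice when BLEU is computed during evaluation), and AdamW is left as the library default. The 128-token truncation length is applied at tokenization time, so it does not appear here.

```python
# Hyperparameters copied from the list above; everything else is an assumption.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "num_train_epochs": 1,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "fp16": True,                   # mixed precision; needs a CUDA GPU
    "predict_with_generate": True,  # assumption: generate text at eval time for BLEU
}


def build_training_args(output_dir="mbart50-en-te-finetune"):
    """Return Seq2SeqTrainingArguments for the settings above.

    `output_dir` is a hypothetical path. transformers is imported lazily
    so the dictionary can be inspected without the library installed.
    """
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

The resulting object would be passed as `args=` to `Seq2SeqTrainer`, together with the model, tokenized datasets, and a data collator.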
+---
 ### Training results
 | Training Loss | Epoch | Step  | Validation Loss | Bleu    |
 |:-------------:|:-----:|:-----:|:---------------:|:-------:|
 | 0.0455        | 1.0   | 48808 | 0.0511          | 66.9240 |
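As a rough consistency check, the step count in the row above can be related to the dataset size. The sketch below assumes that the third column is the optimizer step count, that training ran on a single device, and that no gradient accumulation was used (the card states none of these). Under those assumptions, 48808 steps at batch size 8 cover about 390,000 examples, in line with the ~390,000 training pairs reported in the Dataset section.

```python
import math

STEPS_PER_EPOCH = 48808   # third column of the results row (assumed to be steps)
TRAIN_BATCH_SIZE = 8      # train batch size from Training Details


def max_examples(steps, batch_size):
    """Upper bound on examples seen in one epoch of `steps` batches."""
    return steps * batch_size


def steps_for(num_examples, batch_size):
    """Steps needed for one epoch when the last batch may be partial."""
    return math.ceil(num_examples / batch_size)


upper = max_examples(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)
print(upper)  # 390464
```

Any training set of 390,457 to 390,464 pairs would give the same step count under these assumptions.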
+---
 ### Framework versions
 - Pytorch 2.6.0+cu124
 - Datasets 3.6.0
 - Tokenizers 0.21.1
+---
+## 🏷️ License
+This model is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
+---
+## 🤝 Acknowledgements
+* 🤗 Hugging Face Transformers
+* Facebook AI for mBART50
+* HackHedron Parallel Corpus Contributors
+---
+> Created by **Koushik Reddy** – [Hugging Face Profile](https://huggingface.co/Koushim)