Aparna852 committed on
Commit b5fba2a · verified · 1 Parent(s): 260fd06

Update README.md

Files changed (1): README.md (+78 -12)

README.md CHANGED
@@ -9,23 +9,89 @@ This is a German to English translation model, fine-tuned over multiple stages s
  3. **Stage 2 Dataset**: Filtered `wmt16` with better train/val split
  4. **Stage 3 Dataset**: `iwslt2017` (clean conversational corpus)

- ### 📊 Final Evaluation
-
- Evaluated on the **iwslt2017** test set:
-
- - 🔵 **BLEU Score**: 39.23
- - 🟢 **ROUGE-L**: 0.67
- - 🟣 **BERTScore (F1)**: 0.9535
-
- ### 📦 Usage

  ```python
  from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

- tokenizer = AutoTokenizer.from_pretrained("Aparna852/final-de-en-iwslt-model")
- model = AutoModelForSeq2SeqLM.from_pretrained("Aparna852/final-de-en-iwslt-model")

- text = "Guten Morgen!"
- inputs = tokenizer(text, return_tensors="pt")
- outputs = model.generate(**inputs)
- print(tokenizer.decode(outputs[0], skip_special_tokens=True))

  3. **Stage 2 Dataset**: Filtered `wmt16` with better train/val split
  4. **Stage 3 Dataset**: `iwslt2017` (clean conversational corpus)

+ ---
+ license: apache-2.0
+ tags:
+ - translation
+ - german
+ - english
+ - seq2seq
+ - transformers
+ - evaluation
+ datasets:
+ - iwslt2017
+ language:
+ - de
+ - en
+ metrics:
+ - sacrebleu
+ - rouge
+ - bertscore
+ ---
+
+ # 🇩🇪➡️🇬🇧 de-en-translator-3
+
+ A transformer-based German → English translation model fine-tuned on the **IWSLT2017** dataset using Hugging Face's `Seq2SeqTrainer`.
+
+ ---
+
+ ## 🚀 Model Overview
+
+ - ✅ Architecture: Seq2Seq (e.g., mBART / BART-style)
+ - 🔀 Direction: German → English
+ - 🧠 Trained using Hugging Face Transformers
+ - 🎯 Optimized with early stopping and BLEU-based evaluation
+ - 📦 Available on Hugging Face Hub for direct loading
+
+ ---
+
+ ## 📊 Evaluation Results
+
+ Tested on the **IWSLT2017 `test` split**:
+
+ | Metric            | Score      |
+ |-------------------|------------|
+ | 🔵 BLEU           | **39.23**  |
+ | 🟢 ROUGE-L        | **0.67**   |
+ | 🟣 BERTScore (F1) | **0.9535** |
+
+ ---
+
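The three scores above correspond to the `sacrebleu`, `rouge`, and `bertscore` entries declared in the YAML metadata block. A minimal sketch of how such numbers can be reproduced with the 🤗 `evaluate` library follows; the model ID is taken from the usage example further down, and the batch size and greedy decoding are assumptions, since the card does not state the exact generation settings:

```python
# Sketch only: batching and greedy decoding are assumptions, not the
# exact configuration behind the reported scores.
from datasets import load_dataset
from evaluate import load
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "Aparna852/de-en-translator-3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# IWSLT2017 de-en pairs live under the "translation" column.
# (Newer `datasets` releases may additionally require trust_remote_code=True.)
test = load_dataset("iwslt2017", "iwslt2017-de-en", split="test")
sources = [ex["de"] for ex in test["translation"]]
references = [ex["en"] for ex in test["translation"]]

predictions = []
for i in range(0, len(sources), 32):  # batches of 32 (assumed)
    batch = tokenizer(sources[i : i + 32], return_tensors="pt",
                      padding=True, truncation=True, max_length=128)
    out = model.generate(**batch, max_length=128)
    predictions.extend(tokenizer.batch_decode(out, skip_special_tokens=True))

bleu = load("sacrebleu").compute(predictions=predictions,
                                 references=[[r] for r in references])
rouge = load("rouge").compute(predictions=predictions, references=references)
bert = load("bertscore").compute(predictions=predictions,
                                 references=references, lang="en")

print(f"BLEU: {bleu['score']:.2f}")
print(f"ROUGE-L: {rouge['rougeL']:.2f}")
print(f"BERTScore F1: {sum(bert['f1']) / len(bert['f1']):.4f}")
```

Note that `bertscore` downloads a scoring model on first use, so that line is by far the slowest of the three.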
+ ## ⚙️ Training Hyperparameters
+
+ | Parameter                  | Value                                 |
+ |----------------------------|---------------------------------------|
+ | **Model Checkpoint**       | `Aparna852/de-en-translator`          |
+ | **Dataset**                | `iwslt2017` (German-English)          |
+ | **Epochs**                 | `3`                                   |
+ | **Train Batch Size**       | `4`                                   |
+ | **Eval Batch Size**        | `4`                                   |
+ | **Gradient Accumulation**  | `8`                                   |
+ | **Learning Rate**          | `2e-5`                                |
+ | **Weight Decay**           | `0.01`                                |
+ | **Warmup Steps**           | `500`                                 |
+ | **Max Sequence Length**    | `128`                                 |
+ | **FP16 (Mixed Precision)** | `True` *(if CUDA available)*          |
+ | **Evaluation Strategy**    | `epoch`                               |
+ | **Save Strategy**          | `epoch`                               |
+ | **Logging Strategy**       | `steps` (every 10 steps)              |
+ | **Scheduler**              | `linear`                              |
+ | **Metric for Best Model**  | `eval_loss`                           |
+ | **Early Stopping**         | `patience=2`                          |
+ | **Load Best Model at End** | `True`                                |
+ | **Trainer API**            | `Seq2SeqTrainer` from 🤗 Transformers |
+
+ ---
+
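Read as a training configuration, this table maps almost one-to-one onto Hugging Face `Seq2SeqTrainingArguments`. The sketch below is an illustrative reconstruction rather than the repository's actual training script; `output_dir` is invented, and recent `transformers` releases spell the evaluation-strategy keyword `eval_strategy`:

```python
# Illustrative reconstruction of the table above, not the actual script.
import torch
from transformers import EarlyStoppingCallback, Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="de-en-translator-3",     # invented name
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,       # effective train batch size 4 * 8 = 32
    learning_rate=2e-5,
    weight_decay=0.01,
    warmup_steps=500,
    fp16=torch.cuda.is_available(),      # mixed precision only when CUDA is present
    evaluation_strategy="epoch",         # `eval_strategy` in newer releases
    save_strategy="epoch",
    logging_strategy="steps",
    logging_steps=10,
    lr_scheduler_type="linear",
    metric_for_best_model="eval_loss",
    greater_is_better=False,             # lower eval_loss is better
    load_best_model_at_end=True,
    predict_with_generate=True,          # generate during eval for BLEU-style metrics
)

# Early stopping with patience=2, passed to Seq2SeqTrainer via callbacks=[...]:
early_stopping = EarlyStoppingCallback(early_stopping_patience=2)
```

The `Max Sequence Length` row has no counterpart in the arguments object; it is applied at tokenization time (`max_length=128`, `truncation=True`).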
+ ## 📥 Usage Example (Python)

  ```python
  from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

+ model = AutoModelForSeq2SeqLM.from_pretrained("Aparna852/de-en-translator-3")
+ tokenizer = AutoTokenizer.from_pretrained("Aparna852/de-en-translator-3")
+
+ input_text = "Guten Morgen, wie geht es dir?"
+ inputs = tokenizer(input_text, return_tensors="pt")
+ output = model.generate(**inputs, max_length=128)
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
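The checkpoint should also work through the high-level `pipeline` API. The one-liner below is a sketch that assumes a plain de→en seq2seq checkpoint; an mBART-style model would additionally need explicit source and target language codes:

```python
# Sketch: same model via the pipeline API (assumes a plain de->en checkpoint).
from transformers import pipeline

translator = pipeline("translation_de_to_en", model="Aparna852/de-en-translator-3")
result = translator("Guten Morgen, wie geht es dir?", max_length=128)
print(result[0]["translation_text"])  # e.g. "Good morning, how are you?"
```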