---
license: apache-2.0
tags:
- translation
- german
- english
- seq2seq
- transformers
- evaluation
datasets:
- iwslt2017
language:
- de
- en
metrics:
- sacrebleu
- rouge
- bertscore
---
# final-de-en-iwslt-model
This is a German-to-English translation model, fine-tuned in multiple stages starting from `Helsinki-NLP/opus-mt-de-en`.
## Training Stages

- Base model: `Helsinki-NLP/opus-mt-de-en`
- Stage 1 dataset: `wmt16` (German-English)
- Stage 2 dataset: filtered `wmt16` with a better train/validation split
- Stage 3 dataset: `iwslt2017` (clean conversational corpus)

Each stage continues from the previous stage's checkpoint, as sketched below.
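The sketch below shows only the checkpoint chaining this implies; the dataset configs (`wmt16`/`de-en`, `iwslt2017`/`iwslt2017-de-en`) are assumptions based on the standard Hub datasets, and the filtering and training script are not published with this card.

```python
# Hypothetical checkpoint chaining for the three stages above; dataset
# configs are assumptions, and the training step itself is elided.
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "Helsinki-NLP/opus-mt-de-en"  # stage 0: the base model
stages = [
    ("wmt16", "de-en"),                # stage 1
    ("wmt16", "de-en"),                # stage 2: filtered and re-split in the real run
    ("iwslt2017", "iwslt2017-de-en"),  # stage 3
]
for i, (name, config) in enumerate(stages, start=1):
    data = load_dataset(name, config)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    # ... tokenize `data` and fine-tune `model` with Seq2SeqTrainer here ...
    out_dir = f"stage{i}"
    model.save_pretrained(out_dir)
    tokenizer.save_pretrained(out_dir)
    checkpoint = out_dir  # the next stage resumes from this stage's weights
```

The key point is the last line: each stage's output directory becomes the next stage's starting checkpoint.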
# 🇩🇪➡️🇬🇧 de-en-translator-3
A transformer-based German → English translation model fine-tuned on the IWSLT2017 dataset using Hugging Face's `Seq2SeqTrainer`.
## Model Overview

- Architecture: Seq2Seq transformer (MarianMT-style, inherited from the `Helsinki-NLP/opus-mt-de-en` base)
- Direction: German → English
- Trained using Hugging Face Transformers
- Optimized with early stopping and BLEU-based evaluation
- Available on the Hugging Face Hub for direct loading
## Evaluation Results
Tested on the IWSLT2017 test split:
| Metric | Score |
|---|---|
| BLEU | 39.23 |
| ROUGE-L | 0.67 |
| BERTScore (F1) | 0.9535 |
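The exact evaluation script is not published with the card; a plausible reproduction using the `evaluate` library (assuming default metric settings and greedy decoding) might look like this:

```python
# Hypothetical re-evaluation on the IWSLT2017 test split; metric settings
# and decoding parameters are assumptions, not the card's original script.
import evaluate
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "Aparna852/de-en-translator-3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

test = load_dataset("iwslt2017", "iwslt2017-de-en", split="test")
sources = [ex["de"] for ex in test["translation"]]
references = [ex["en"] for ex in test["translation"]]

predictions = []
for i in range(0, len(sources), 32):  # simple batched decoding
    batch = tokenizer(sources[i:i + 32], return_tensors="pt",
                      padding=True, truncation=True, max_length=128)
    out = model.generate(**batch, max_length=128)
    predictions += tokenizer.batch_decode(out, skip_special_tokens=True)

bleu = evaluate.load("sacrebleu").compute(
    predictions=predictions, references=[[r] for r in references])
rouge = evaluate.load("rouge").compute(
    predictions=predictions, references=references)
bert = evaluate.load("bertscore").compute(
    predictions=predictions, references=references, lang="en")
print(bleu["score"], rouge["rougeL"], sum(bert["f1"]) / len(bert["f1"]))
```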
## Training Hyperparameters
| Parameter | Value |
|---|---|
| Model Checkpoint | Aparna852/de-en-translator |
| Dataset | iwslt2017 (German-English) |
| Epochs | 3 |
| Train Batch Size | 4 |
| Eval Batch Size | 4 |
| Gradient Accumulation | 8 |
| Learning Rate | 2e-5 |
| Weight Decay | 0.01 |
| Warmup Steps | 500 |
| Max Sequence Length | 128 |
| FP16 (Mixed Precision) | True (if CUDA available) |
| Evaluation Strategy | epoch |
| Save Strategy | epoch |
| Logging Strategy | steps (every 10 steps) |
| Scheduler | linear |
| Metric for Best Model | eval_loss |
| Early Stopping | patience=2 |
| Load Best Model at End | True |
| Trainer API | `Seq2SeqTrainer` from 🤗 Transformers |
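Read back into code, the table corresponds roughly to the following setup. This is a reconstruction, not the published training script: the argument values come from the table, while the preprocessing and dataset config are assumptions.

```python
# Reconstruction of the final-stage training setup from the table above;
# values come from the table, everything else is an assumption.
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM, AutoTokenizer, DataCollatorForSeq2Seq,
    EarlyStoppingCallback, Seq2SeqTrainer, Seq2SeqTrainingArguments,
)

checkpoint = "Aparna852/de-en-translator"  # per the "Model Checkpoint" row
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

raw = load_dataset("iwslt2017", "iwslt2017-de-en")

def preprocess(batch):
    src = [ex["de"] for ex in batch["translation"]]
    tgt = [ex["en"] for ex in batch["translation"]]
    return tokenizer(src, text_target=tgt, max_length=128, truncation=True)

tokenized = raw.map(preprocess, batched=True,
                    remove_columns=raw["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="de-en-translator-3",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=2e-5,
    weight_decay=0.01,
    warmup_steps=500,
    fp16=torch.cuda.is_available(),  # mixed precision only when a GPU is present
    eval_strategy="epoch",           # `evaluation_strategy` on older transformers
    save_strategy="epoch",
    logging_strategy="steps",
    logging_steps=10,
    lr_scheduler_type="linear",
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    load_best_model_at_end=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```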
## Usage Example (Python)
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned checkpoint and its tokenizer from the Hub
model = AutoModelForSeq2SeqLM.from_pretrained("Aparna852/de-en-translator-3")
tokenizer = AutoTokenizer.from_pretrained("Aparna852/de-en-translator-3")

# Translate a German sentence into English
input_text = "Guten Morgen, wie geht es dir?"
inputs = tokenizer(input_text, return_tensors="pt")
output = model.generate(**inputs, max_length=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
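For quick checks, the same checkpoint should also work through the high-level `pipeline` API; this assumes it exposes a standard seq2seq translation head, as MarianMT-style checkpoints do:

```python
# Same translation via the pipeline API; assumes a standard seq2seq head.
from transformers import pipeline

translator = pipeline("translation", model="Aparna852/de-en-translator-3")
print(translator("Guten Morgen, wie geht es dir?", max_length=128)[0]["translation_text"])
```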