---
license: apache-2.0
tags:
- translation
- german
- english
- seq2seq
- transformers
- evaluation
datasets:
- iwslt2017
language:
- de
- en
metrics:
- sacrebleu
- rouge
- bertscore
---

# final-de-en-iwslt-model 🚀

A German-to-English translation model, fine-tuned in multiple stages starting from Helsinki-NLP/opus-mt-de-en.

βœ… Training Stages

  1. Base model: Helsinki-NLP/opus-mt-de-en
  2. Stage 1 Dataset: wmt16 (German-English)
  3. Stage 2 Dataset: Filtered wmt16 with better train/val split
  4. Stage 3 Dataset: iwslt2017 (clean conversational corpus)
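
All three training corpora are available on the Hub via 🤗 Datasets. A minimal loading sketch (the Stage 2 filtering step is not reproduced here, so this only shows the raw sources):

```python
from datasets import load_dataset

# Stages 1 and 2: WMT16 German-English parallel corpus
wmt = load_dataset("wmt16", "de-en")

# Stage 3: IWSLT2017 German-English (TED-talk style conversational text);
# some datasets versions require trust_remote_code=True for this dataset
iwslt = load_dataset("iwslt2017", "iwslt2017-de-en")

# Each example is a {'de': ..., 'en': ...} translation pair
print(iwslt["train"][0]["translation"])
```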

πŸ‡©πŸ‡ͺβž‘οΈπŸ‡¬πŸ‡§ de-en-translator-3

A transformer-based German β†’ English translation model fine-tuned on the IWSLT2017 dataset using Hugging Face's Seq2SeqTrainer.


πŸš€ Model Overview

  • βœ… Architecture: Seq2Seq (e.g., mBART / BART-style)
  • πŸ”€ Direction: German β†’ English
  • 🧠 Trained using Hugging Face Transformers
  • 🎯 Optimized with early stopping and BLEU-based evaluation
  • πŸ“¦ Available on Hugging Face Hub for direct loading

πŸ“Š Evaluation Results

Tested on the IWSLT2017 test split:

Metric Score
πŸ”΅ BLEU 39.23
🟒 ROUGE-L 0.67
🟣 BERTScore (F1) 0.9535
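
For reproducibility, all three metrics are available through the 🤗 Evaluate library. A minimal sketch on placeholder sentences (substitute the model's decoded test outputs and the IWSLT2017 references):

```python
import evaluate

sacrebleu = evaluate.load("sacrebleu")
rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

# Placeholder predictions/references for illustration only
preds = ["Good morning, how are you?"]
refs = ["Good morning, how are you doing?"]

bleu = sacrebleu.compute(predictions=preds, references=[[r] for r in refs])
rl = rouge.compute(predictions=preds, references=refs)
bs = bertscore.compute(predictions=preds, references=refs, lang="en")

print(bleu["score"])                      # corpus BLEU
print(rl["rougeL"])                       # ROUGE-L F-measure
print(sum(bs["f1"]) / len(bs["f1"]))      # mean BERTScore F1
```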

βš™οΈ Training Hyperparameters

Parameter Value
Model Checkpoint Aparna852/de-en-translator
Dataset iwslt2017 (German-English)
Epochs 3
Train Batch Size 4
Eval Batch Size 4
Gradient Accumulation 8
Learning Rate 2e-5
Weight Decay 0.01
Warmup Steps 500
Max Sequence Length 128
FP16 (Mixed Precision) True (if CUDA available)
Evaluation Strategy epoch
Save Strategy epoch
Logging Strategy steps (every 10 steps)
Scheduler linear
Metric for Best Model eval_loss
Early Stopping patience=2
Load Best Model at End True
Trainer API Seq2SeqTrainer from πŸ€— Transformers
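
The table maps almost one-to-one onto `Seq2SeqTrainingArguments`. A sketch of the likely configuration (the output directory and the `tokenized_ds` dataset object are assumptions; this is not the exact training script):

```python
import torch
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    EarlyStoppingCallback,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

# Starting checkpoint from the table above (the Stage 2 model)
checkpoint = "Aparna852/de-en-translator"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

args = Seq2SeqTrainingArguments(
    output_dir="de-en-translator-3",      # assumed output directory
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=2e-5,
    weight_decay=0.01,
    warmup_steps=500,
    fp16=torch.cuda.is_available(),
    eval_strategy="epoch",                # `evaluation_strategy` on older transformers
    save_strategy="epoch",
    logging_strategy="steps",
    logging_steps=10,
    lr_scheduler_type="linear",
    metric_for_best_model="eval_loss",
    load_best_model_at_end=True,
    predict_with_generate=True,
)

# `tokenized_ds` stands in for the tokenized IWSLT2017 splits (max length 128)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized_ds["train"],
    eval_dataset=tokenized_ds["validation"],
    tokenizer=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```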

πŸ“₯ Usage Example (Python)

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned model and its tokenizer from the Hub
model = AutoModelForSeq2SeqLM.from_pretrained("Aparna852/de-en-translator-3")
tokenizer = AutoTokenizer.from_pretrained("Aparna852/de-en-translator-3")

# Translate a German sentence ("Good morning, how are you?") into English
input_text = "Guten Morgen, wie geht es dir?"
inputs = tokenizer(input_text, return_tensors="pt")
output = model.generate(**inputs, max_length=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
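
Equivalently, the same model can be served through the `translation` pipeline (a minimal sketch):

```python
from transformers import pipeline

# The pipeline handles tokenization, generation, and decoding in one call
translator = pipeline("translation", model="Aparna852/de-en-translator-3")
print(translator("Guten Morgen, wie geht es dir?", max_length=128)[0]["translation_text"])
```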