πŸ‡©πŸ‡ͺβž‘οΈπŸ‡¬πŸ‡§ de-en-translator

A transformer-based German β†’ English translation model fine-tuned on a custom split of the WMT16 (de-en) dataset using πŸ€— Transformers and Seq2SeqTrainer.


🧠 Model Details

  • βœ… Model: Aparna852/german-english-translator (fine-tuned)
  • πŸ”€ Task: German ➑️ English Translation
  • πŸ“š Dataset: WMT16 (wmt/wmt16 - de-en)
  • βš™οΈ Strategy: Custom train/val/test split, truncated input
  • πŸ§ͺ Evaluation Metrics: BLEU (via sacrebleu)

βš™οΈ Training Hyperparameters

| Parameter | Value |
|---|---|
| Dataset | wmt/wmt16 (German–English) |
| Train size | ~2.5% of the original training set |
| Validation size | ~2.8% of the original validation set |
| Max length | 64 tokens |
| Epochs | 3 |
| Train batch size | 4 |
| Eval batch size | 4 |
| Gradient accumulation steps | 8 |
| Learning rate | 1e-5 |
| Weight decay | 0.03 |
| Warmup steps | 500 |
| FP16 (mixed precision) | True (if CUDA is available) |
| LR scheduler | linear |
| Evaluation strategy | epoch |
| Save strategy | epoch |
| Logging steps | 10 |
| Early stopping | patience = 2 |
| Metric for best model | eval_loss |
| Trainer API | Seq2SeqTrainer from πŸ€— Transformers |
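
These settings map onto `Seq2SeqTrainingArguments` roughly as shown below. This is a reconstruction from the table, not the exact training script; `model`, `tokenizer`, and the tokenized datasets are assumed to already exist, and `output_dir` is a placeholder:

```python
import torch
from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer, EarlyStoppingCallback

training_args = Seq2SeqTrainingArguments(
    output_dir="de-en-translator",        # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,        # effective batch size: 4 x 8 = 32
    learning_rate=1e-5,
    weight_decay=0.03,
    warmup_steps=500,
    lr_scheduler_type="linear",
    fp16=torch.cuda.is_available(),       # mixed precision only on GPU
    evaluation_strategy="epoch",          # `eval_strategy` in newer transformers releases
    save_strategy="epoch",
    logging_steps=10,
    load_best_model_at_end=True,          # required for early stopping
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,                          # assumed: the seq2seq model being fine-tuned
    args=training_args,
    train_dataset=train_dataset,          # assumed: tokenized custom splits from the table
    eval_dataset=val_dataset,
    tokenizer=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```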

πŸ“Š Evaluation Setup

After training, you can compute BLEU with the `evaluate` library:

```python
from evaluate import load

bleu = load("sacrebleu")

preds = [...]  # generated translations (model outputs, plain strings)
refs = [...]   # reference translations (gold English strings)

# sacrebleu expects one list of references per prediction
score = bleu.compute(predictions=preds, references=[[r] for r in refs])
print(score["score"])
```
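
To fill in `preds` and `refs` end to end, one option is to translate the held-out test split with the fine-tuned model. A sketch, assuming the test data comes from the same wmt/wmt16 de-en configuration:

```python
from datasets import load_dataset
from transformers import pipeline
from evaluate import load

# Small slice of the WMT16 de-en test split, for illustration only.
test = load_dataset("wmt/wmt16", "de-en", split="test[:100]")

translator = pipeline("translation", model="Aparna852/german-english-translator")

sources = [ex["translation"]["de"] for ex in test]
refs = [ex["translation"]["en"] for ex in test]
preds = [out["translation_text"] for out in translator(sources, max_length=64)]

bleu = load("sacrebleu")
print(bleu.compute(predictions=preds, references=[[r] for r in refs])["score"])
```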

πŸ“¦ Model Size

  β€’ 73.9M parameters Β· F32 tensors Β· Safetensors format
