# 🇩🇪➡️🇬🇧 de-en-translator
A transformer-based German → English translation model fine-tuned on a custom split of the WMT16 (de-en) dataset using 🤗 Transformers and `Seq2SeqTrainer`.
## 🧠 Model Details
- ✅ Model: `Aparna852/german-english-translator` (fine-tuned)
- 🤗 Task: German ➡️ English translation
- 📚 Dataset: WMT16 (`wmt/wmt16`, `de-en` config)
- ⚙️ Strategy: Custom train/val/test split, truncated input
- 🧪 Evaluation Metrics: BLEU (via `sacrebleu`)
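For quick inference, a minimal sketch using the 🤗 `pipeline` API, assuming the checkpoint above is public on the Hub and a PyTorch backend is installed (the exact pipeline task string may depend on the base architecture):

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hub (assumes the repo is public)
translator = pipeline(
    "translation",
    model="Aparna852/german-english-translator",
)

# Translate a German sentence; max_length matches the training truncation
result = translator("Das Wetter ist heute schön.", max_length=64)
print(result[0]["translation_text"])
```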
## ⚙️ Training Hyperparameters
| Parameter | Value |
|---|---|
| Dataset | wmt/wmt16 (German-English) |
| Train Size | ~2.5% of original training set |
| Validation Size | ~2.8% of original validation |
| Max Length | 64 |
| Epochs | 3 |
| Train Batch Size | 4 |
| Eval Batch Size | 4 |
| Gradient Accumulation | 8 |
| Learning Rate | 1e-5 |
| Weight Decay | 0.03 |
| Warmup Steps | 500 |
| FP16 (Mixed Precision) | True (if CUDA available) |
| Scheduler | linear |
| Evaluation Strategy | epoch |
| Save Strategy | epoch |
| Logging Steps | 10 |
| Early Stopping | patience=2 |
| Metric for Best Model | eval_loss |
| Trainer API | Seq2SeqTrainer from 🤗 Transformers |
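The table above maps directly onto `Seq2SeqTrainingArguments`. A sketch of the corresponding configuration (argument names follow the 🤗 Transformers API; the output directory is illustrative, and `eval_strategy` is spelled `evaluation_strategy` in older releases):

```python
import torch
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./de-en-translator",  # illustrative path
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=1e-5,
    weight_decay=0.03,
    warmup_steps=500,
    lr_scheduler_type="linear",
    fp16=torch.cuda.is_available(),   # mixed precision only if CUDA is available
    eval_strategy="epoch",
    save_strategy="epoch",
    logging_steps=10,
    load_best_model_at_end=True,      # required for early stopping
    metric_for_best_model="eval_loss",
    greater_is_better=False,          # lower eval_loss is better
)
```

Early stopping with patience 2 is then attached via `EarlyStoppingCallback(early_stopping_patience=2)` when constructing the `Seq2SeqTrainer`.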
## 📊 Evaluation Setup
You can run the evaluation after training using:

```python
from evaluate import load

bleu = load("sacrebleu")

# Compute BLEU on the tokenized test dataset
preds = [...]  # Generated translations (list of strings)
refs = [...]   # Reference translations (list of strings)

# sacrebleu expects each reference wrapped in a list (one list per example)
result = bleu.compute(predictions=preds, references=[[r] for r in refs])
print(result["score"])
```