---
language:
- be
- en
tags:
- translation
- pytorch
- transformers
- marian
pipeline_tag: translation
datasets:
- Helsinki-NLP/opus-100
base_model: Helsinki-NLP/opus-mt-mul-en
metrics:
- bleu
---
# Belarusian to English Translator (be-en)
This model is a fine-tuned version of [Helsinki-NLP/opus-mt-mul-en](https://huggingface.co/Helsinki-NLP/opus-mt-mul-en) for translating text from **Belarusian (be)** to **English (en)**.
## Model Description
The model was fine-tuned with the `transformers` library on the Belarusian-English split of the [OPUS-100 dataset](https://huggingface.co/datasets/Helsinki-NLP/opus-100). It uses the MarianMT architecture and is intended for translating short to medium-length sentences.
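For readers who want to inspect the training data, here is a minimal sketch of loading the Belarusian-English pairs and tokenizing them. It is not the author's training script: the `be-en` configuration name, the `translation` record layout, the 128-token limit, and the helper names `split_pairs` and `load_tokenized_split` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def split_pairs(batch: Dict[str, List[Dict[str, str]]]) -> Tuple[List[str], List[str]]:
    """Split a batch of translation records into parallel source/target lists.

    Assumes each record looks like {"be": "...", "en": "..."}.
    """
    sources = [pair["be"] for pair in batch["translation"]]
    targets = [pair["en"] for pair in batch["translation"]]
    return sources, targets


def load_tokenized_split(split: str = "train", max_length: int = 128):
    """Hypothetical helper: download one split and tokenize both sides."""
    # Imported lazily so split_pairs can be used without these packages.
    from datasets import load_dataset
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-mul-en")
    # "be-en" is the assumed configuration name for the language pair.
    dataset = load_dataset("Helsinki-NLP/opus-100", "be-en", split=split)

    def preprocess(batch):
        sources, targets = split_pairs(batch)
        # text_target tokenizes the English side and stores it as labels.
        return tokenizer(
            sources, text_target=targets, max_length=max_length, truncation=True
        )

    return dataset.map(preprocess, batched=True)
```

Calling `load_tokenized_split("validation")` would download the data and return a dataset with `input_ids`, `attention_mask`, and `labels` columns added.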
## Usage example
You can use this model directly with the `transformers` library:
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
# Load model and tokenizer
model_name = "Aleton/be-en-translator"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
# Set device
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
# Text to translate
text = "Прывітанне, як справы?"
# Tokenize (truncating over-long inputs) and generate the translation
inputs = tokenizer(text, return_tensors="pt", truncation=True).to(device)
outputs = model.generate(**inputs, max_length=128)
translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translated_text)
# Expected output: Hello, how are you?
```
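To translate several sentences at once, the single-sentence example above can be extended with padding and batched decoding. The following is a rough sketch rather than a tuned implementation: the helper names `chunks`, `translate_batch`, and `demo`, the batch size of 8, and the 128-token limit are assumptions chosen for illustration.

```python
from typing import Iterator, List, Sequence


def chunks(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_batch(
    texts: Sequence[str],
    tokenizer,
    model,
    batch_size: int = 8,
    max_length: int = 128,
) -> List[str]:
    """Translate `texts` in batches, preserving the input order."""
    translations: List[str] = []
    for batch in chunks(texts, batch_size):
        # Pad to the longest sentence in the batch and truncate long inputs.
        inputs = tokenizer(
            list(batch),
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=max_length,
        ).to(model.device)
        outputs = model.generate(**inputs, max_length=max_length)
        translations.extend(
            tokenizer.batch_decode(outputs, skip_special_tokens=True)
        )
    return translations


def demo() -> None:
    """Load the model and translate two sentences (downloads the weights)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    model_name = "Aleton/be-en-translator"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    for line in translate_batch(["Прывітанне, як справы?", "Дзякуй!"], tokenizer, model):
        print(line)
```

Call `demo()` to try it. Batching mainly helps throughput on a GPU; on CPU a smaller `batch_size` may be more practical.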