metadata
language:
- en
- be
tags:
- translation
- pytorch
- transformers
- marian
pipeline_tag: translation
datasets:
- Helsinki-NLP/opus-100
base_model: Helsinki-NLP/opus-mt-en-mul
metrics:
- bleu
English to Belarusian Translator (en-be)
This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-mul for translating text from English (en) to Belarusian (be).
Model Description
The model was fine-tuned using the transformers library on the English–Belarusian split of the OPUS-100 dataset. It is based on the MarianMT architecture and is optimized for translating short and medium-length sentences from English into Belarusian.
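Because the model targets short and medium-length sentences, longer passages may translate better when split into sentences first. The following is a minimal sketch of that idea, not part of the model itself: `split_sentences` is a hypothetical helper, and its punctuation-based rule is a naive assumption that will mis-split abbreviations such as "Dr. Smith".

```python
import re
from typing import List

# Assumption: a sentence ends with '.', '!' or '?' followed by whitespace.
# This is a deliberately naive rule, not a full sentence segmenter.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> List[str]:
    """Split text into sentence-like pieces, dropping empty pieces."""
    parts = _SENTENCE_END.split(text.strip())
    return [part for part in parts if part]


if __name__ == "__main__":
    for sentence in split_sentences("Hello there. How are you? Fine!"):
        print(sentence)
```

Each piece can then be translated on its own and the results joined with spaces. For production text, a dedicated sentence segmenter is likely a better choice than this regular expression.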
Usage example
You can use this model directly with the transformers library:
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
# Load model and tokenizer
model_name = "Aleton/en-be-translator"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
# Set device
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
model.eval()
# Text to translate
text = "Hello, how are you?"
inputs = tokenizer(
    text,
    return_tensors="pt",
    truncation=True,
    max_length=128,
).to(device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        num_beams=4,
    )
translation = tokenizer.decode(
    outputs[0],
    skip_special_tokens=True,
)
print(translation)
# Example output: Прывітанне, як справы?
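To translate several sentences at once, the same steps can be wrapped in a batching helper. This is a sketch under assumptions: `chunked` and `translate_batch` are hypothetical names, the default `batch_size=8` is arbitrary, and the function expects the `tokenizer` and `model` objects loaded in the example above. It adds `padding=True` so that sentences of different lengths fit in one tensor, and it uses `batch_decode` to decode all outputs together.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_batch(
    texts: Sequence[str],
    tokenizer,
    model,
    batch_size: int = 8,    # assumption: tune to the available memory
    max_length: int = 128,  # same limit as the single-sentence example
    num_beams: int = 4,
) -> List[str]:
    """Translate `texts` in batches, returning results in input order."""
    translations: List[str] = []
    for batch in chunked(texts, batch_size):
        # Pad to the longest sentence in the batch; truncate overly long ones.
        inputs = tokenizer(
            list(batch),
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=max_length,
        ).to(model.device)
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_length,
            num_beams=num_beams,
        )
        translations.extend(
            tokenizer.batch_decode(outputs, skip_special_tokens=True)
        )
    return translations
```

With the model loaded as above, `translate_batch(["Good morning.", "Thank you."], tokenizer, model)` should return a list with one Belarusian string per input sentence.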