
	---
	language:
	- en
	- be
	tags:
	- translation
	- pytorch
	- transformers
	- marian
	pipeline_tag: translation
	datasets:
	- Helsinki-NLP/opus-100
	base_model: Helsinki-NLP/opus-mt-en-mul
	metrics:
	- bleu
	---

	# English to Belarusian Translator (en-be)

	This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-mul](https://huggingface.co/Helsinki-NLP/opus-mt-en-mul) for translating text from English (en) to Belarusian (be).

	## Model Description

	The model was fine-tuned using the `transformers` library on the English–Belarusian split of the [OPUS-100 dataset](https://huggingface.co/datasets/Helsinki-NLP/opus-100). It is based on the MarianMT architecture and is optimized for translating short and medium-length sentences from English into Belarusian.

	## Usage example

	You can use this model directly with the `transformers` library:

	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

	# Load model and tokenizer
	model_name = "Aleton/en-be-translator"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

	# Set device
	device = "cuda" if torch.cuda.is_available() else "cpu"
	model = model.to(device)
	model.eval()

	# Text to translate
	text = "Hello, how are you?"

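	# Tokenize the input and move the tensors to the selected device
	inputs = tokenizer(
	    text,
	    return_tensors="pt",
	    truncation=True,
	    max_length=128,
	).to(device)

	# Generate the translation with beam search, without tracking gradients
	with torch.no_grad():
	    outputs = model.generate(
	        **inputs,
	        max_new_tokens=128,
	        num_beams=4,
	    )

	# Decode the generated token IDs back into text
	translation = tokenizer.decode(
	    outputs[0],
	    skip_special_tokens=True,
	)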

	print(translation)
	# Example output: Прывітанне, як справы?
	```
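
Since the model is described as optimized for short and medium-length sentences, one possible way to handle longer passages is to split them into sentences and translate those as a batch. The following is a minimal sketch, not part of the model's own tooling: the helper names `split_sentences`, `translate_long_text`, and `make_batch_translator` are hypothetical, the regex-based splitting is deliberately naive (it will mis-split abbreviations such as "e.g."), and the generation settings simply mirror the example above.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def translate_long_text(
    text: str,
    translate_fn: Callable[[List[str]], List[str]],
) -> str:
    """Translate a passage sentence by sentence and rejoin the results."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    return " ".join(translate_fn(sentences))


def make_batch_translator(model, tokenizer, device="cpu", max_length=128):
    """Wrap an already loaded model and tokenizer as a batch translate function.

    Hypothetical helper: it assumes `model` and `tokenizer` were loaded as in
    the usage example above and that `model` already lives on `device`.
    """
    import torch  # imported lazily so the splitting helpers work without PyTorch

    def translate_batch(sentences: List[str]) -> List[str]:
        # Pad to the longest sentence so the batch forms one rectangular tensor
        inputs = tokenizer(
            sentences,
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=max_length,
        ).to(device)
        with torch.no_grad():
            outputs = model.generate(
                **inputs,
                max_new_tokens=max_length,
                num_beams=4,
            )
        return tokenizer.batch_decode(outputs, skip_special_tokens=True)

    return translate_batch
```

With the `model`, `tokenizer`, and `device` from the example above, this could be called as `translate_long_text(long_text, make_batch_translator(model, tokenizer, device))`. Translating sentences independently discards cross-sentence context, so the result may read less smoothly than a human translation of the full passage.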