Instructions for using renanserrano/yanomami-finetuning with libraries and local apps.

Libraries

How to use renanserrano/yanomami-finetuning with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="renanserrano/yanomami-finetuning")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("renanserrano/yanomami-finetuning")
model = AutoModelForCausalLM.from_pretrained("renanserrano/yanomami-finetuning")

Local Apps

vLLM

How to use renanserrano/yanomami-finetuning with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "renanserrano/yanomami-finetuning"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "renanserrano/yanomami-finetuning",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/renanserrano/yanomami-finetuning

SGLang

How to use renanserrano/yanomami-finetuning with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "renanserrano/yanomami-finetuning" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "renanserrano/yanomami-finetuning",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "renanserrano/yanomami-finetuning" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "renanserrano/yanomami-finetuning",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use renanserrano/yanomami-finetuning with Docker Model Runner:
```
docker model run hf.co/renanserrano/yanomami-finetuning
```

Yanomami-English Translation Model

This model is a fine-tuned GPT-2 Small (124M parameters) for bidirectional translation between Yanomami and English. It was developed to provide offline translation for Yanomami, an indigenous language spoken in northern Brazil and southern Venezuela. GPT-2 Small turned out to be poorly suited to this task. I then tried NLLB, but it lacked the conversational style I wanted, so I am now training a model based on Llama 3.1 8B (int8): https://github.com/renantrendt/yanomami_llama

Until the Llama fine-tuning is finished, we have deployed a ChatGPT-like assistant that uses RAG over the Yanomami dictionary: https://yanomami.bernardoserrano.com/

Model Description

Model Type: GPT-2 Small (124M parameters)
Language(s): Yanomami ↔ English
License: MIT
Developed by: Renan Serrano

Training Data

The model was trained on a diverse dataset consisting of:

translations.jsonl (17,009 examples)
yanomami-to-english.jsonl (1,822 examples)
phrases.jsonl (2,322 examples)
grammar.jsonl (200 examples)
comparison.jsonl (2,072 examples)
how-to.jsonl (5,586 examples)
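
Summing the counts above gives the overall corpus size and shows how heavily each file weighs in it. The sketch below only restates the figures from the list; the variable names are illustrative, and it assumes no example is shared between files.

```python
# Example counts per file, copied from the list above.
dataset_sizes = {
    "translations.jsonl": 17009,
    "yanomami-to-english.jsonl": 1822,
    "phrases.jsonl": 2322,
    "grammar.jsonl": 200,
    "comparison.jsonl": 2072,
    "how-to.jsonl": 5586,
}

total = sum(dataset_sizes.values())
print(f"total examples: {total}")  # total examples: 29011

# Share of each file in the combined dataset.
for name, count in dataset_sizes.items():
    print(f"{name}: {count / total:.1%}")
```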

Training Metrics

Final training loss: 1.0554 (Epoch 3)
Final validation loss: 1.0557
Overall average training loss: 1.2102
Perplexity: 2.87
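
The reported perplexity is consistent with the exponential of the final loss. The check below is a minimal sketch that assumes the loss is the mean per-token cross-entropy measured in nats, which is the usual convention for GPT-2 training but is not stated explicitly here.

```python
import math

# Loss values copied from the metrics above.
final_training_loss = 1.0554
final_validation_loss = 1.0557

def perplexity(mean_cross_entropy):
    """Perplexity is exp(loss) when the loss is mean per-token cross-entropy in nats."""
    return math.exp(mean_cross_entropy)

print(f"{perplexity(final_training_loss):.2f}")    # 2.87
print(f"{perplexity(final_validation_loss):.2f}")  # 2.87
```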

Usage

Direct Translation

from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch

# Load model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("renanserrano/yanomami-finetuning")
model = GPT2LMHeadModel.from_pretrained("renanserrano/yanomami-finetuning")

# Configure device
device = torch.device("cuda" if torch.cuda.is_available() else 
                     "mps" if torch.backends.mps.is_available() else 
                     "cpu")
model.to(device)

# Function for translation
def translate(text, direction="english_to_yanomami"):
    # Add appropriate prefix based on translation direction
    if direction == "english_to_yanomami":
        prompt = f"English: {text} => Yanomami:"
    else:
        prompt = f"Yanomami: {text} => English:"
    
    # Tokenize input
    inputs = tokenizer(prompt, return_tensors="pt")
    inputs = {k: v.to(device) for k, v in inputs.items()}
    
    # Generate translation
    outputs = model.generate(
        **inputs,
        max_length=100,
        num_return_sequences=1,
        temperature=0.7,
        top_p=0.9,
        top_k=50,
        num_beams=4,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )
    
    # Decode only the newly generated tokens, so the prompt and its
    # "Yanomami:" / "English:" label are not returned with the translation
    prompt_length = inputs["input_ids"].shape[1]
    translation = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
    
    return translation.strip()

# Examples
# English to Yanomami
print(translate("What does 'aheprariyo' mean in Yanomami?", "english_to_yanomami"))

# Yanomami to English
print(translate("ahetoimi", "yanomami_to_english"))

Using with RAG (Retrieval-Augmented Generation)

For more advanced use cases, this model can be integrated with a RAG system to provide context-enhanced translations and comprehensive linguistic information.
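
One way such an integration might look is sketched below: retrieve dictionary entries whose headword resembles the query, then prepend them as context to the prompt format used by translate(). This is an illustration, not the deployed system. The dictionary entries are placeholders rather than real Yanomami data, the helper names retrieve and build_prompt are hypothetical, and whether this model benefits from prepended context is an assumption that would need testing.

```python
import difflib

# Placeholder entries only; a real system would load the Yanomami dictionary here.
DICTIONARY = {
    "word_one": "placeholder definition one",
    "word_two": "placeholder definition two",
    "other_term": "placeholder definition three",
}

def retrieve(query, entries, k=2):
    """Return up to k (headword, definition) pairs whose headword resembles the query."""
    matches = difflib.get_close_matches(query, list(entries), n=k, cutoff=0.5)
    return [(word, entries[word]) for word in matches]

def build_prompt(query, entries):
    """Prepend retrieved entries as context to the Yanomami-to-English prompt."""
    hits = retrieve(query, entries)
    context = "\n".join(f"{word}: {definition}" for word, definition in hits)
    prompt = f"Yanomami: {query} => English:"
    return f"{context}\n{prompt}" if context else prompt

print(build_prompt("word_on", DICTIONARY))
```

String similarity on headwords is the simplest possible retriever; an embedding index over full entries would be the natural replacement once the dictionary is large.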

Limitations

The model shows promising results for translating Yanomami words to English definitions but has limitations with more complex translations and conversational phrases.
Performance varies based on the complexity of the input and its similarity to the training data.
The model may not capture all cultural nuances and context-specific meanings.

Ethical Considerations

This model is intended to support language preservation and cross-cultural communication. When using this model, please be respectful of the Yanomami culture and language.

Offline Usage

This model was designed to function completely offline, ensuring accessibility in remote areas without internet connectivity. All components can be downloaded and used locally.
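
A possible two-step workflow is sketched below: download the repository once while online, then load it from disk with network access disabled. It assumes the huggingface_hub and transformers packages are installed; LOCAL_DIR is a hypothetical path, and the two function names are illustrative.

```python
MODEL_ID = "renanserrano/yanomami-finetuning"
LOCAL_DIR = "./yanomami-finetuning"  # hypothetical local path; choose any directory

def download_model(model_id=MODEL_ID, local_dir=LOCAL_DIR):
    """Run once while online: copy the model repository files into local_dir."""
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=model_id, local_dir=local_dir)

def load_offline(local_dir=LOCAL_DIR):
    """Run later without connectivity: load strictly from files on disk."""
    from transformers import GPT2LMHeadModel, GPT2Tokenizer
    # local_files_only=True makes from_pretrained fail instead of contacting the Hub.
    tokenizer = GPT2Tokenizer.from_pretrained(local_dir, local_files_only=True)
    model = GPT2LMHeadModel.from_pretrained(local_dir, local_files_only=True)
    return tokenizer, model

# Typical use:
#   download_model()                   # on a machine with internet access
#   tokenizer, model = load_offline()  # afterwards, on the offline machine
```

The resulting directory can be copied to the offline machine by any means, since load_offline only reads local files.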

Related Resources

Repositories & Datasets

GitHub Repository: renantrendt/yanomami-finetuning
Dataset (Hugging Face): renanserrano/yanomami
Dataset Generator (NPM): ai-dataset-generator

Citation

If you use this model in your research or applications, please cite:

@misc{yanomami-english-translator,
  author = {Renan Serrano},
  title = {Yanomami-English Translation Model},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/renanserrano/yanomami-finetuning}}
}

Safetensors

Model size: 0.1B params
Tensor type: F32

Evaluation results (self-reported)

Perplexity: 2.870
Loss: 1.055