billingsmoore/tagged-tibetan-to-english-translation-dataset
How to use MISHANM/Tibetan_eng_text_generation_Llama3_8B_instruct with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model = PeftModel.from_pretrained(base_model, "MISHANM/Tibetan_eng_text_generation_Llama3_8B_instruct")

This model has been fine-tuned to work with the Tibetan language. It can answer questions and translate text between English and Tibetan. It is intended to give accurate, context-aware responses that reflect the details and subtleties of Tibetan, so that its answers stay relevant across different situations.
The model was trained on approximately 107,525 instruction samples.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the fine-tuned model and tokenizer
model_path = "MISHANM/Tibetan_eng_text_generation_Llama3_8B_instruct"
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)
# Function to generate text
def generate_text(prompt, max_length=500, temperature=0.9):
    # Build the system and user messages
    messages = [
        {
            "role": "system",
            "content": "You are a Tibetan language expert and linguist, with same knowledge give response in Tibetan language.",
        },
        {"role": "user", "content": prompt},
    ]
    # Format the prompt with the <|system|>, <|user|>, and <|assistant|> markers
    formatted_prompt = f"<|system|>{messages[0]['content']}<|user|>{messages[1]['content']}<|assistant|>"
    # Tokenize and move the inputs to the same device as the model
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
    # Sample up to max_length new tokens
    output = model.generate(
        **inputs, max_new_tokens=max_length, temperature=temperature, do_sample=True
    )
    # Decode the full sequence (prompt plus generated text)
    return tokenizer.decode(output[0], skip_special_tokens=True)
# Example usage
prompt = """ཤེས་རབ་བསླབ་པ་རྒྱུད་ལ་དབང་བ་སྟེ།།"""
translated_text = generate_text(prompt)
print(translated_text)
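The usage example above decodes the whole output sequence, so the printed string may include the prompt as well as the model's reply. The following is a minimal sketch of one way to keep only the reply. The helper names `build_prompt` and `extract_reply` are hypothetical and not part of this repository. The sketch assumes the tokenizer treats the `<|system|>`, `<|user|>`, and `<|assistant|>` markers as ordinary text, so they survive decoding with `skip_special_tokens=True`; if they are stripped, `extract_reply` simply returns the full string.

```python
# Hypothetical helpers; the marker strings are copied from the usage example.
ASSISTANT_MARKER = "<|assistant|>"


def build_prompt(system: str, user: str) -> str:
    """Format a prompt with the same markers the usage example uses."""
    return f"<|system|>{system}<|user|>{user}{ASSISTANT_MARKER}"


def extract_reply(decoded: str) -> str:
    """Return the text after the last assistant marker.

    If the marker is absent (for example, because decoding removed it),
    the whole string is returned with surrounding whitespace trimmed.
    """
    # rpartition yields ('', '', decoded) when the marker is not found,
    # so the last element is always the part we want.
    return decoded.rpartition(ASSISTANT_MARKER)[2].strip()


if __name__ == "__main__":
    # Simulate a decoded sequence: the prompt followed by some generated text.
    prompt = build_prompt("Answer in Tibetan.", "Hello")
    decoded = prompt + " example reply"
    print(extract_reply(decoded))
```

In practice, `extract_reply` would be applied to the string returned by `generate_text`. An alternative that does not depend on the markers is to slice the generated token ids by the prompt length before decoding.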
@misc{MISHANM/Tibetan_eng_text_generation_Llama3_8B_instruct,
author = {Mishan Maurya},
title = {Introducing Fine Tuned LLM for Tibetan Language},
year = {2025},
publisher = {Hugging Face},
journal = {Hugging Face repository},
}
Base model
meta-llama/Meta-Llama-3-8B-Instruct