# mongolian-mistral-7b-chatbot

## Description

Mistral 7B fine-tuned on Mongolian news data for chatbot use.
## Model Details
- Base Model: mistralai/Mistral-7B-Instruct-v0.2
- Language: Mongolian (mn)
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Training Data: Eduge Mongolian News Dataset (75,000+ articles)
## Training Configuration
- LoRA Rank: 32
- LoRA Alpha: 64
- Epochs: 3
- Learning Rate: 2e-4
- Batch Size: 4
- Max Sequence Length: 1024
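These hyperparameters map onto a PEFT `LoraConfig` roughly as in the sketch below. The target modules and dropout value are assumptions, since the card does not state them; during the actual fine-tuning the tokenizer was also extended and the embeddings resized first (see the next section).

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model in half precision (rough reconstruction of the training setup).
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    torch_dtype=torch.float16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=32,                   # LoRA rank (from this card)
    lora_alpha=64,          # LoRA alpha (from this card)
    lora_dropout=0.05,      # assumption: not stated on this card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```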
## Mongolian Tokens Added

- Total new tokens: ~9,500
- Sources: Mongolian-NLP repository
  - Most frequent words
  - Abbreviations
  - District/place names
  - Country names
  - Named entities (NER)
## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the tokenizer (includes the added Mongolian tokens)
tokenizer = AutoTokenizer.from_pretrained("ErkaMarka/mongolian-mistral-7b-chatbot")

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Resize embeddings to match the extended vocabulary
base_model.resize_token_embeddings(len(tokenizer))

# Load the LoRA adapter
model = PeftModel.from_pretrained(base_model, "ErkaMarka/mongolian-mistral-7b-chatbot")

# Generate ("What is the capital city of Mongolia?")
messages = [{"role": "user", "content": "Монгол улсын нийслэл хот юу вэ?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
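If you want faster inference without the adapter indirection, the LoRA weights can be merged into the base model. This is a standard `peft` option, not something this card prescribes.

```python
# Merge the LoRA weights into the base model and drop the adapter wrappers.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("mongolian-mistral-7b-chatbot-merged")
tokenizer.save_pretrained("mongolian-mistral-7b-chatbot-merged")
```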
## Evaluation Results
Evaluated on 100 Mongolian Q&A pairs using BLEU score.
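A sketch of how a BLEU evaluation of this kind could be reproduced with the Hugging Face `evaluate` library is shown below. It reuses the `model` and `tokenizer` from the Usage section; the `eval_pairs` data here is a placeholder, not the actual 100-pair test set.

```python
import evaluate

# Hypothetical evaluation data: (question, reference answer) pairs.
eval_pairs = [
    ("Монгол улсын нийслэл хот юу вэ?", "Монгол улсын нийслэл хот нь Улаанбаатар."),
    # ... 100 pairs in total
]

bleu = evaluate.load("bleu")
predictions, references = [], []
for question, reference in eval_pairs:
    messages = [{"role": "user", "content": question}]
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=150)
    # Decode only the newly generated tokens (skip the prompt).
    answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    predictions.append(answer)
    references.append([reference])

print(bleu.compute(predictions=predictions, references=references))
```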
## License
Apache 2.0
## Citation

```bibtex
@misc{mongolian_mistral_7b_chatbot,
  author    = {Your Name},
  title     = {mongolian-mistral-7b-chatbot},
  year      = {2024},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/ErkaMarka/mongolian-mistral-7b-chatbot}
}
```