# mongolian-mistral-7b-chatbot

## Description

Mistral 7B fine-tuned on Mongolian news data for chatbot use.
## Model Details
- Base Model: mistralai/Mistral-7B-Instruct-v0.2
- Language: Mongolian (mn)
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Training Data: Eduge Mongolian News Dataset (75,000+ articles)
## Training Configuration
- LoRA Rank: 32
- LoRA Alpha: 64
- Epochs: 3
- Learning Rate: 2e-4
- Batch Size: 4
- Max Sequence Length: 1024
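These hyperparameters map onto a PEFT `LoraConfig` roughly as in the sketch below. The target modules and dropout value are assumptions, since the card does not state them; during the actual fine-tuning the tokenizer was also extended and the embeddings resized first (see the next section).

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model in half precision (rough reconstruction of the training setup).
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    torch_dtype=torch.float16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=32,                   # LoRA rank (from this card)
    lora_alpha=64,          # LoRA alpha (from this card)
    lora_dropout=0.05,      # assumption: not stated on this card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```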
## Mongolian Tokens Added

- Total new tokens: ~9,500
- Sources: Mongolian-NLP repository
  - Most frequent words
  - Abbreviations
  - District/place names
  - Country names
  - Named entities (NER)
## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the tokenizer (includes the added Mongolian tokens)
tokenizer = AutoTokenizer.from_pretrained("ErkaMarka/mongolian-mistral-7b-chatbot")

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Resize embeddings to match the extended vocabulary
base_model.resize_token_embeddings(len(tokenizer))

# Load the LoRA adapter
model = PeftModel.from_pretrained(base_model, "ErkaMarka/mongolian-mistral-7b-chatbot")

# Generate ("What is the capital city of Mongolia?")
messages = [{"role": "user", "content": "Монгол улсын нийслэл хот юу вэ?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
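If you want faster inference without the adapter indirection, the LoRA weights can be merged into the base model. This is a standard `peft` option, not something this card prescribes.

```python
# Merge the LoRA weights into the base model and drop the adapter wrappers.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("mongolian-mistral-7b-chatbot-merged")
tokenizer.save_pretrained("mongolian-mistral-7b-chatbot-merged")
```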
## Evaluation Results
Evaluated on 100 Mongolian Q&A pairs using BLEU score.
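A sketch of how a BLEU evaluation of this kind could be reproduced with the Hugging Face `evaluate` library is shown below. It reuses the `model` and `tokenizer` from the Usage section; the `eval_pairs` data here is a placeholder, not the actual 100-pair test set.

```python
import evaluate

# Hypothetical evaluation data: (question, reference answer) pairs.
eval_pairs = [
    ("Монгол улсын нийслэл хот юу вэ?", "Монгол улсын нийслэл хот нь Улаанбаатар."),
    # ... 100 pairs in total
]

bleu = evaluate.load("bleu")
predictions, references = [], []
for question, reference in eval_pairs:
    messages = [{"role": "user", "content": question}]
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=150)
    # Decode only the newly generated tokens (skip the prompt).
    answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    predictions.append(answer)
    references.append([reference])

print(bleu.compute(predictions=predictions, references=references))
```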
## License
Apache 2.0
## Citation

```bibtex
@misc{mongolian_mistral_7b_chatbot,
  author    = {Your Name},
  title     = {mongolian-mistral-7b-chatbot},
  year      = {2024},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/ErkaMarka/mongolian-mistral-7b-chatbot}
}
```