mongolian-mistral-7b-chatbot

Description

Mistral 7B fine-tuned on Mongolian news data for use as a Mongolian-language chatbot.

Model Details

  • Base Model: mistralai/Mistral-7B-Instruct-v0.2
  • Language: Mongolian (mn)
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Training Data: Eduge Mongolian News Dataset (75,000+ articles)

Training Configuration

  • LoRA Rank: 32
  • LoRA Alpha: 64
  • Epochs: 3
  • Learning Rate: 2e-4
  • Batch Size: 4
  • Max Sequence Length: 1024
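
For reference, these hyperparameters map onto peft and transformers roughly as below. This is a minimal sketch, not the actual training script: target_modules and lora_dropout are assumptions the card does not state, and the max sequence length of 1024 would be enforced when tokenizing the training data.

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# Values taken from the list above; target_modules and lora_dropout
# are assumptions, as the card does not specify them.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="mongolian-mistral-7b-chatbot",
    num_train_epochs=3,
    learning_rate=2e-4,
    per_device_train_batch_size=4,
)

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
model = get_peft_model(base, lora_config)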

Mongolian Tokens Added

  • Total new tokens: ~9,500
  • Sources: Mongolian-NLP repository
    • Most frequent words
    • Abbreviations
    • District/place names
    • Country names
    • Named entities (NER)
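
Extending the vocabulary this way typically uses tokenizer.add_tokens followed by resizing the model's embedding matrix (the Usage section below shows the resize step). A minimal sketch; the token-list file name here is a hypothetical placeholder:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

# "mongolian_tokens.txt" is hypothetical: one token per line, collected
# from the Mongolian-NLP sources listed above.
with open("mongolian_tokens.txt", encoding="utf-8") as f:
    new_tokens = [line.strip() for line in f if line.strip()]

num_added = tokenizer.add_tokens(new_tokens)  # tokens already in the vocabulary are skipped
print(f"Added {num_added} new tokens")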

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("ErkaMarka/mongolian-mistral-7b-chatbot")

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Resize embeddings for new tokens
base_model.resize_token_embeddings(len(tokenizer))

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "ErkaMarka/mongolian-mistral-7b-chatbot")

# Generate a reply; the example prompt asks "What is the capital city of Mongolia?"
messages = [{"role": "user", "content": "Монгол улсын нийслэл хот юу вэ?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
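
For lower inference latency, the LoRA weights can optionally be folded into the base model with peft's merge_and_unload; this is a standard peft call, not something the card itself prescribes:

# Optional: merge the adapter into the base weights and drop the PEFT wrapper
model = model.merge_and_unload()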

Evaluation Results

The model was evaluated on 100 Mongolian Q&A pairs using the BLEU score.
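
An evaluation of this kind could be reproduced roughly as follows. This is a sketch under assumptions: the Q&A file name and format are hypothetical, and sacrebleu is one common BLEU implementation, not necessarily the one used here. It reuses model and tokenizer from the Usage section above.

import json
import sacrebleu

# "qa_pairs.json" is hypothetical: a list of {"question": ..., "answer": ...} objects
with open("qa_pairs.json", encoding="utf-8") as f:
    pairs = json.load(f)

predictions, references = [], []
for pair in pairs:
    messages = [{"role": "user", "content": pair["question"]}]
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=150)
    # Decode only the newly generated tokens, not the echoed prompt
    generated = outputs[0][inputs["input_ids"].shape[1]:]
    predictions.append(tokenizer.decode(generated, skip_special_tokens=True))
    references.append(pair["answer"])

bleu = sacrebleu.corpus_bleu(predictions, [references])
print(f"BLEU: {bleu.score:.2f}")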

License

Apache 2.0

Citation

@misc{mongolian_mistral_7b_chatbot,
  author = {Your Name},
  title = {mongolian-mistral-7b-chatbot},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/ErkaMarka/mongolian-mistral-7b-chatbot}
}