Model Description
This model is a Continued Pre-Training (CPT) adaptation of Mistral-7B v0.3, extended to the Malagasy language.
This is version 2 (v2), trained on a larger dataset for 2 epochs. It uses bnb-4bit quantization for more memory-efficient inference while largely retaining generation quality.
The resulting model improves fluency and coherence in Malagasy and provides a foundation for downstream Malagasy NLP tasks.
Intended Uses & Limitations
Use cases:
- Ready for instruction fine-tuning on Malagasy-oriented instruction datasets (see the sketch after this list)
- Generating text in Malagasy
- Research on low-resource language adaptation
- Data augmentation for Malagasy NLP tasks
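For the instruction fine-tuning use case, a minimal sketch with Unsloth and TRL's SFTTrainer follows. The dataset file, its "text" column, and all hyperparameters are illustrative assumptions, not settings from this model card; depending on your TRL version, dataset_text_field and max_seq_length may need to go in an SFTConfig instead of being passed directly.
code:
# Instruction fine-tuning sketch (illustrative assumptions throughout;
# not the configuration used to produce this model).
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the CPT model in 4-bit, same as for inference
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Lo-Renz-O/Mistral-7B-CPT-Malagasy-v2-bnb-4bit",
    max_seq_length=1024,
    load_in_4bit=True,
)

# Attach fresh LoRA adapters for supervised fine-tuning
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # hypothetical rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)

# Hypothetical Malagasy instruction dataset with a pre-formatted "text" column
dataset = load_dataset("json", data_files="malagasy_instructions.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=1024,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()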
Training Details
- Base Model: Mistral-7B v0.3
- Method: Continued Pretraining with LoRA adapters (a configuration sketch follows below)
- Hardware: 1 × Tesla T4 (14.7 GB VRAM)
- Number of Epochs: 2
- Trainable parameters: ~604M (7.7% of 7.85B total)
- Approximate Training Time: ~109 hours
Training Loss Curve: (figure omitted)
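As a rough illustration of the training method, the sketch below shows how a continued-pretraining setup is typically configured with Unsloth: LoRA adapters on the attention and MLP projections, plus the embedding and output layers, which Unsloth recommends training for continued pretraining and which would account for a trainable-parameter count in the hundreds of millions. The rank and other hyperparameters are assumptions, since the card does not document them.
code:
# Continued-pretraining configuration sketch (hyperparameters are
# illustrative assumptions, not the documented training settings).
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-v0.3-bnb-4bit",
    max_seq_length=1024,
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=128,  # hypothetical rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj",
                    # training the embeddings and LM head adds most of the
                    # trainable parameters beyond the LoRA adapters themselves
                    "embed_tokens", "lm_head"],
    lora_alpha=32,
    use_gradient_checkpointing="unsloth",  # helps fit a single 16 GB T4
)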
Inference Example Usage
code:
# Import required libraries for model loading and text generation
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Load the pretrained Malagasy LoRA model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Lo-Renz-O/Mistral-7B-CPT-Malagasy-v2-bnb-4bit",
    max_seq_length=1024,
    dtype=None,  # auto-detect: bfloat16 on Ampere+, float16 otherwise
    load_in_4bit=True,
)

# Enable Unsloth's optimized inference mode
FastLanguageModel.for_inference(model)

# Prompt template matching the continued-pretraining format
# ("Lahatsoratra" = text/article, "Lohateny" = title)
prompt = """Lahatsoratra
### Lohateny: {}
### Lahatsoratra:{}
"""

# Tokenize the prompt and move tensors to the GPU
inputs = tokenizer(
    [prompt.format("Madagasikara", "")],
    return_tensors="pt",
).to("cuda")

# Initialize a streamer to display generated tokens in real time
text_streamer = TextStreamer(tokenizer, skip_special_tokens=True)

# Generate text with sampling
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.8,
    top_p=0.95,
    repetition_penalty=1.0,
    do_sample=True,
    streamer=text_streamer,
)
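The streamer prints tokens as they are generated; if you also want the result as a plain string, decode the token IDs returned by generate:
code:
# Decode the full sequence (prompt + continuation) into plain text
generated_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(generated_text)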
Limitations:
- Not instruction-tuned: responses may not always follow task instructions.
- May hallucinate or generate factually inaccurate information.
This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.
Model Tree for Lo-Renz-O/Mistral-7B-CPT-Malagasy-v2-bnb-4bit
- Base model: mistralai/Mistral-7B-v0.3
- Quantized from: unsloth/mistral-7b-v0.3-bnb-4bit
