---
license: apache-2.0
language: ar
tags:
- machine-learning
- arabic
- mistral
- lora
- qlora
---
# Arabic Machine Learning Assistant (Mistral-7B + QLoRA)

## Overview
This model is a domain-specific fine-tuned version of Mistral-7B, optimized for generating clear and structured explanations of Machine Learning concepts in Arabic.
The model leverages parameter-efficient fine-tuning (LoRA) combined with 4-bit quantization (QLoRA) to achieve strong performance while maintaining computational efficiency.
## Key Capabilities
- Generates structured explanations in Arabic
- Provides simplified breakdowns of complex ML concepts
- Produces consistent outputs using a defined format:
  - Definition
  - Example
  - Analogy
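The three-part format above can be sketched as a simple response template. The helper below is a hypothetical illustration (the function name and the English section labels are assumptions; the model itself emits the sections in Arabic):

```python
# Hypothetical helper illustrating the Definition / Example / Analogy
# structure the model is tuned to follow. Labels are shown in English
# here for readability; the model's actual output is in Arabic.
def format_explanation(definition: str, example: str, analogy: str) -> str:
    """Assemble a three-part structured explanation."""
    return (
        f"Definition: {definition}\n"
        f"Example: {example}\n"
        f"Analogy: {analogy}"
    )

print(format_explanation(
    "Overfitting is when a model memorizes its training data.",
    "100% training accuracy but poor test accuracy.",
    "A student who memorizes past exams but fails new questions.",
))
```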
## Training Methodology

- **Base Model:** Mistral-7B
- **Fine-Tuning Approach:** LoRA (Low-Rank Adaptation)
- **Quantization:** 4-bit (QLoRA: NF4 with double quantization)
- **Training Type:** Instruction tuning
The model was trained on a custom-curated Arabic dataset focused on Machine Learning explanations, emphasizing clarity, structure, and real-world understanding.
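A minimal sketch of the quantization and adapter configuration described above, using `transformers` and `peft`. The NF4 and double-quantization settings follow the card; the compute dtype, LoRA rank, alpha, dropout, and target modules are illustrative assumptions, not the card's stated values:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with double quantization, matching the QLoRA
# setup described above. Compute dtype is an assumption.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter configuration. Rank, alpha, dropout, and target modules
# are illustrative assumptions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

These two configs would typically be passed to `from_pretrained` and `get_peft_model` respectively when reproducing a QLoRA fine-tune.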
## Example

**Input**

اشرح Overfitting ("Explain overfitting")

**Output**

Definition: ...
Example: ...
Analogy: ...
## Performance Improvement

**Before fine-tuning:**

- Generic and unstructured responses
- Occasional prompt repetition
- Limited clarity in explanations

**After fine-tuning:**

- Structured and consistent responses
- Improved conceptual understanding
- Clear Arabic explanations tailored for learning
## Intended Use Cases
- Educational tools for Arabic-speaking learners
- AI-powered assistants for ML explanations
- Content generation for technical topics in Arabic
## Limitations
- Primarily optimized for Machine Learning topics
- Arabic responses are more refined than English ones
- May occasionally produce repetitive phrasing
## Technical Notes
- Fine-tuned using PEFT for memory efficiency
- Designed to run with quantization-aware setups
- Can be deployed in limited-resource environments
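A rough back-of-envelope calculation shows why 4-bit quantization makes limited-resource deployment feasible for a 7B-parameter model (weights only; activations, KV cache, and framework overhead are excluded):

```python
# Approximate memory footprint of the 7B base model's weights alone.
params = 7e9

fp16_gb = params * 2 / 1e9    # fp16: 2 bytes per parameter
int4_gb = params * 0.5 / 1e9  # 4-bit: 0.5 bytes per parameter

print(f"fp16 weights: ~{fp16_gb:.1f} GB")
print(f"4-bit weights: ~{int4_gb:.1f} GB")
```

Roughly 14 GB in fp16 versus about 3.5 GB at 4 bits, which is why the quantized model fits on a single consumer GPU.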
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("saher3/ml-assistant")
tokenizer = AutoTokenizer.from_pretrained("saher3/ml-assistant")
```