---
license: apache-2.0
language: ar
tags:
- machine-learning
- arabic
- mistral
- lora
- qlora
---

# Arabic Machine Learning Assistant (Mistral-7B + QLoRA)

## Overview
This model is a domain-specific fine-tuned version of Mistral-7B, optimized for generating clear and structured explanations of Machine Learning concepts in Arabic.

The model uses parameter-efficient fine-tuning (LoRA) combined with 4-bit quantization (QLoRA) to achieve strong performance while keeping memory and compute requirements low.
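
For readers who want to see what this combination looks like in code, the sketch below pairs a 4-bit NF4 base model with LoRA adapters using `transformers`, `bitsandbytes`, and `peft`. The base checkpoint name, LoRA rank, and target modules are illustrative assumptions, not the actual training configuration of this model, which is not published here.

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the base model in 4-bit: NF4 quantization with double quantization,
# the standard QLoRA recipe
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # assumed base checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# Attach trainable low-rank adapters; rank and target modules are illustrative
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train
```

With a setup like this, only the adapter weights receive gradients, typically well under 1% of the 7B parameters, while the frozen base stays in 4-bit precision.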

---

## Key Capabilities
- Generates structured explanations in Arabic
- Provides simplified breakdowns of complex ML concepts
- Produces consistent outputs using a defined format:
  - Definition
  - Example
  - Analogy

---

## Training Methodology

- **Base Model:** Mistral-7B
- **Fine-Tuning Approach:** LoRA (Low-Rank Adaptation)
- **Quantization:** 4-bit (QLoRA: NF4 with double quantization)
- **Training Type:** Instruction tuning

The model was trained on a custom-curated Arabic dataset focused on Machine Learning explanations, emphasizing clarity, structure, and real-world understanding.
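
To make the instruction-tuning format concrete, a single training record might look like the hypothetical example below. The field names and layout are assumptions; the dataset itself is not published.

```python
# Hypothetical training record; field names and layout are illustrative
record = {
    "instruction": "اشرح Overfitting",  # "Explain overfitting"
    "response": "Definition: ...\n\nExample: ...\n\nAnalogy: ...",
}
```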

---

## Example

### Input
اشرح Overfitting ("Explain overfitting")

### Output
Definition:
...

Example:
...

Analogy:
...

---

## Performance Improvement

**Before Fine-Tuning:**
- Generic and unstructured responses
- Occasional prompt repetition
- Limited clarity in explanations

**After Fine-Tuning:**
- Structured and consistent responses
- Improved conceptual understanding
- Clear Arabic explanations tailored for learning

---

## Intended Use Cases
- Educational tools for Arabic-speaking learners
- AI-powered assistants for ML explanations
- Content generation for technical topics in Arabic

---

## Limitations
- Primarily optimized for Machine Learning topics
- Arabic responses are more refined than English ones
- May occasionally produce repetitive phrasing

---

## Technical Notes
- Fine-tuned using PEFT for memory efficiency
- Designed to run in quantization-aware setups
- Can be deployed in resource-constrained environments

---

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("saher3/ml-assistant")
tokenizer = AutoTokenizer.from_pretrained("saher3/ml-assistant")
```
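
Since the model was trained with QLoRA, loading it in 4-bit is a natural fit for smaller GPUs. Below is a minimal sketch, assuming `bitsandbytes` is installed and that the published weights load directly with `AutoModelForCausalLM` as above; the prompt string and generation settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 with double quantization, mirroring the training-time QLoRA recipe
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "saher3/ml-assistant",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("saher3/ml-assistant")

# Ask for an Arabic explanation of overfitting
inputs = tokenizer("اشرح Overfitting", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```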