---
license: apache-2.0
language: ar
tags:
- machine-learning
- arabic
- mistral
- lora
- qlora
---

# Arabic Machine Learning Assistant (Mistral-7B + QLoRA)

## Overview
This model is a domain-specific fine-tuned version of Mistral-7B, optimized for generating clear and structured explanations of Machine Learning concepts in Arabic.

The model was fine-tuned with QLoRA: low-rank adapters (LoRA) are trained on top of a base model whose weights are quantized to 4 bits, which keeps the memory cost of fine-tuning low.

---

## Key Capabilities
- Generates structured explanations in Arabic
- Provides simplified breakdowns of complex ML concepts
- Produces consistent outputs using a defined format:
  - Definition
  - Example
  - Analogy
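
Because the output format is fixed, a response can be checked mechanically. The following is a minimal sketch, not part of the model or its tooling: `split_sections` and `follows_format` are hypothetical helpers, and they assume each label sits on its own line followed by a colon, as in the example further down.

```python
import re
from typing import Dict

# Section labels taken from the format listed above.
SECTION_LABELS = ("Definition", "Example", "Analogy")

# Assumption: a label occupies a whole line and ends with a colon.
_LABEL_PATTERN = re.compile(
    r"^(%s):[ \t]*$" % "|".join(SECTION_LABELS), flags=re.MULTILINE
)


def split_sections(response: str) -> Dict[str, str]:
    """Split a response into {label: body} using the label lines as boundaries."""
    matches = list(_LABEL_PATTERN.finditer(response))
    sections = {}
    for i, match in enumerate(matches):
        # A section body runs until the next label line or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(response)
        sections[match.group(1)] = response[match.end():end].strip()
    return sections


def follows_format(response: str) -> bool:
    """True when all three sections are present, non-empty, and in order."""
    sections = split_sections(response)
    return list(sections) == list(SECTION_LABELS) and all(sections.values())
```

A check like this could be used to count how often generated answers keep the expected structure.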

---

## Training Methodology

**Base Model:** Mistral-7B  
**Fine-Tuning Approach:** LoRA (Low-Rank Adaptation)  
**Quantization:** 4-bit NF4 with double quantization (QLoRA)  
**Training Type:** Instruction Tuning  

The model was trained on a custom-curated Arabic dataset focused on Machine Learning explanations, emphasizing clarity, structure, and real-world understanding.
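
To illustrate why LoRA is parameter-efficient, the sketch below counts the weights a single adapter adds to one square projection matrix. The numbers are assumptions for illustration only: this card does not state the LoRA rank or the target modules, so `RANK` is hypothetical, and `HIDDEN_SIZE` is the hidden size of the Mistral-7B architecture.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Weights added by one LoRA adapter: B is (d_out x rank), A is (rank x d_in)."""
    return rank * (d_out + d_in)


# Assumed, illustrative values (not stated in this card).
HIDDEN_SIZE = 4096  # hidden size of Mistral-7B
RANK = 16           # hypothetical LoRA rank

full = HIDDEN_SIZE * HIDDEN_SIZE
adapter = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, RANK)

print(f"full matrix:  {full:,} weights")
print(f"LoRA adapter: {adapter:,} weights ({adapter / full:.2%} of the full matrix)")
```

Under these assumptions the adapter holds well under 1% of the weights of the matrix it modifies, and only the adapter is updated during training.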

---

## Example

### Input
اشرح Overfitting

(English: "Explain overfitting")

### Output
Definition:
...

Example:
...

Analogy:
...

---

## Performance Improvement

**Before Fine-Tuning:**
- Generic and unstructured responses
- Occasional prompt repetition
- Limited clarity in explanations

**After Fine-Tuning:**
- Structured and consistent responses
- Improved conceptual understanding
- Clear Arabic explanations tailored for learning

---

## Intended Use Cases
- Educational tools for Arabic-speaking learners
- AI-powered assistants for ML explanations
- Content generation for technical topics in Arabic

---

## Limitations
- Primarily optimized for Machine Learning topics
- Responses in Arabic are more polished than responses in English
- May occasionally produce repetitive phrasing
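
One rough way to flag the repetitive phrasing mentioned above is to measure how many word n-grams in a response repeat an earlier one. This is a sketch under simple assumptions (whitespace tokenization, a hypothetical helper name, and no threshold calibrated for this model):

```python
from collections import Counter


def repeated_ngram_ratio(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that repeat an earlier n-gram in the same text."""
    words = text.split()  # whitespace tokenization; works for Arabic and English
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repeat.
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(ngrams)
```

A value near 0 means little repetition; responses with a high ratio could be regenerated or sampled with a repetition penalty.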

---

## Technical Notes
- Fine-tuned with PEFT for memory efficiency
- Designed to be loaded with 4-bit quantization, matching the training setup
- Can be deployed in resource-constrained environments
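
As a back-of-the-envelope illustration of why 4-bit loading helps on limited hardware, the sketch below estimates the memory taken by the weights alone. The parameter count is an assumed round figure for a 7B model, and the estimate ignores quantization constants, activations, and the KV cache.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


NUM_PARAMS = 7e9  # assumed round figure for a 7B-parameter model

for label, bits in [("16-bit", 16), ("4-bit", 4)]:
    print(f"{label}: about {weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```

The 4-bit figure is a quarter of the 16-bit one, which is what makes a single consumer GPU a plausible target.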

---

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("saher3/ml-assistant")
tokenizer = AutoTokenizer.from_pretrained("saher3/ml-assistant")
```