Update README.md

README.md (CHANGED)
@@ -3,6 +3,8 @@ base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
 library_name: peft
 datasets:
 - sanaa-11/math-dataset
+language:
+- fr
 ---
 # Model Card for LLaMA 3.1 Fine-Tuned Model
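The metadata above declares `library_name: peft` on top of the `meta-llama/Meta-Llama-3.1-8B-Instruct` base. A minimal sketch of loading such an adapter follows; the adapter repo id is a placeholder, not taken from this commit.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # from base_model in the YAML header
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Attach the LoRA adapter on top of the frozen base weights.
# "your-username/llama-3.1-math-adapter" is a hypothetical repo id;
# substitute the actual id of this model.
model = PeftModel.from_pretrained(base_model, "your-username/llama-3.1-math-adapter")
model.eval()
```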
@@ -91,7 +93,7 @@ for _ in range(5):
 ## Training Details

 ### Training Data
-- **Dataset**: The model was fine-tuned on a custom dataset consisting of
+- **Dataset**: The model was fine-tuned on a custom dataset consisting of 3.6K rows of math exercises, lesson content, and solutions, written in French and designed for Moroccan students.

 ### Training Procedure
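Since the card now names the training set, here is a short sketch of pulling it from the Hub with the `datasets` library. The split name and record fields are not specified in this card, so the prints below are only for inspection.

```python
from datasets import load_dataset

# Dataset id taken from the YAML metadata; "train" split is an assumption.
ds = load_dataset("sanaa-11/math-dataset", split="train")
print(len(ds))   # ~3.6K rows per the description above
print(ds[0])     # one exercise/lesson/solution record
```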
@@ -102,9 +104,9 @@ for _ in range(5):
 ### Training Hyperparameters
 - **Training Regime**: The model was fine-tuned using 4-bit quantization with QLoRA to reduce GPU and RAM usage. Training was performed in a Kaggle environment with limited resources.
 - **Batch Size**: 1 (with gradient accumulation over 8 steps)
-- **Number of Epochs**:
+- **Number of Epochs**: 8
 - **Learning Rate**: 5e-5

 ## Evaluation
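A hedged sketch of what a QLoRA setup matching these hyperparameters could look like. The LoRA rank, alpha, and target modules are assumptions; the card does not state them.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

# 4-bit quantization, as described in the training regime above.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    quantization_config=bnb,
    device_map="auto",
)

# LoRA settings below are assumed values, not taken from this card.
lora = LoraConfig(r=16, lora_alpha=32,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# These four values are the ones the card actually lists.
args = TrainingArguments(
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,
    num_train_epochs=8,
)
```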
@@ -126,7 +128,7 @@ for _ in range(5):
 ### Summary
 **Model Examination**
 - The model showed a consistent reduction in both training and validation loss across epochs, suggesting effective learning and generalization from the provided dataset.

 ## Environmental Impact
 **Carbon Emissions**