---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: peft
datasets:
- sanaa-11/math-dataset
language:
- fr
---
# Model Card for LLaMA 3.1 Fine-Tuned Model
## Model Details
### Model Description
- **Developed by**: Sanaa Abril
- **Model Type**: Fine-tuned Causal Language Model
- **Language(s) (NLP)**: French
- **License**:
- **Finetuned from model**: Meta LLaMA 3.1 8B Instruct
### Model Sources
- **Repository**: https://huggingface.co/sanaa-11/mathematic-exercice-generator/tree/main
## Uses
### Direct Use
- **Primary Application**: The model generates math exercises in French for Moroccan students, conditioned on a specific lesson and a target difficulty level.
- **Example Use Case**: Educators can input lesson topics to generate corresponding exercises for classroom use or online learning platforms (an illustrative prompt sketch follows below).
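For illustration only, a prompt can be assembled in code as follows; the variable names are hypothetical and the wording mirrors the example used in the usage section below:

```python
# Hypothetical way to build a generation prompt (variable names are illustrative)
lesson = "les fractions"      # lesson topic
difficulty = "moyenne"        # target difficulty level

prompt = (
    f"Fournis un exercice basé sur la vie reelle de difficulté {difficulty} "
    f"de niveau 2 annee college sur {lesson}."
)
```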
### Downstream Use
- **Potential Applications**: The model can be extended or adapted to create exercises in other languages or for different educational levels.
### Out-of-Scope Use
- **Not Suitable For**: The model is not designed for high-stakes assessments, as it may generate exercises that require further validation by subject matter experts.
## Bias, Risks, and Limitations
- **Bias**: The model may inherit biases from the data it was trained on, potentially generating exercises that reflect unintended cultural or linguistic biases.
- **Risks**: There is a risk of generating mathematically incorrect exercises or exercises that do not align with the intended curriculum.
- **Limitations**: The model's accuracy and relevance may decrease when generating exercises outside of its training domain or when applied to advanced mathematical topics not covered during fine-tuning.
### Recommendations
- **For Educators**: It is recommended to review the generated exercises for correctness and relevance before using them in a classroom setting.
- **For Developers**: Fine-tune the model further or adjust the training data to mitigate any biases and improve the quality of the generated content.
## How to Get Started with the Model
Use the following code snippets to load the model and its adapter, then generate exercises:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Base model name
model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",               # adjust based on your environment
    offload_folder="./offload_dir",  # folder for CPU/disk offloading if VRAM is limited
    torch_dtype=torch.float16,       # float16 for better performance on compatible hardware
    revision="main",                 # specify a different revision if needed
)

# Load the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, "sanaa-11/mathematic-exercice-generator")

# Load the base model's tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
```
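The snippet below then generates an exercise from a French prompt. It calls `model.generate` in a loop, decoding only the newly generated tokens at each step and appending them to the running text; sampling parameters such as `temperature` and `top_p` can be adjusted to control variety.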
```python
generated_text = ""
prompt = "Fournis un exercice basé sur la vie reelle de difficulté moyenne de niveau 2 annee college sur les fractions."

for _ in range(5):
    inputs = tokenizer(prompt + generated_text, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,            # cap on newly generated tokens per iteration
        do_sample=True,                # enable sampling so temperature/top_p take effect
        temperature=0.7,
        top_p=0.9,
        num_beams=5,
        repetition_penalty=1.2,
        no_repeat_ngram_size=2,
        pad_token_id=tokenizer.eos_token_id,
        early_stopping=False,
    )
    # Decode only the newly generated tokens, not the prompt that was passed in
    new_text = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    generated_text += new_text
    print(new_text)
```
## Training Details
### Training Data
- **Dataset**: The model was fine-tuned on a custom dataset of about 3.6K rows of math exercises, lesson content, and solutions in French, designed specifically for Moroccan students.
### Training Procedure
#### Preprocessing
- **Data Cleaning**: Text normalization, tokenization, and padding were applied to prepare the data.
- **Tokenization**: The base model's tokenizer (loaded with Hugging Face `AutoTokenizer`) was used to process the French text data (see the sketch below).
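As a rough sketch of what this preprocessing could look like with the Hugging Face `datasets` library (the `text` column name and the maximum sequence length are assumptions, not details reported in this card):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers define no pad token by default

dataset = load_dataset("sanaa-11/math-dataset")

def tokenize(batch):
    # "text" is an assumed column holding the formatted lesson/exercise/solution
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=1024)

tokenized = dataset.map(tokenize, batched=True)
```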
### Training Hyperparameters
- **Training Regime**: The model was fine-tuned with 4-bit quantization and QLoRA to keep GPU and RAM usage low; training was run in a Kaggle environment with limited resources (a configuration sketch follows this list).
- **Batch Size**: 1 (with gradient accumulation steps of 8)
- **Number of Epochs**: 8
- **Learning Rate**: 5e-5
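A minimal sketch of such a setup with `transformers`, `peft`, and `bitsandbytes` is shown below. The batch size, gradient accumulation, epochs, and learning rate match the values above; the LoRA rank, alpha, target modules, and output directory are illustrative assumptions rather than the exact configuration used.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

# 4-bit quantization (QLoRA-style) so the 8B model fits on a single 16 GB GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapter; r, alpha, and target_modules are assumed values
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Hyperparameters reported in this card
training_args = TrainingArguments(
    output_dir="./outputs",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=8,
    learning_rate=5e-5,
    fp16=True,
    logging_steps=10,
)
```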
## Evaluation
### Testing Data, Factors & Metrics
**Testing Data**
- A separate subset of 10% of the dataset was reserved for evaluation.
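For illustration, a 90/10 split like the one described can be produced with the `datasets` library (assuming the dataset loads as a single `train` split; the seed is arbitrary):

```python
from datasets import load_dataset

dataset = load_dataset("sanaa-11/math-dataset", split="train")
# Hold out 10% of the rows for evaluation
splits = dataset.train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
```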
**Factors**
- **Complexity of Generated Exercises**: Exercises were evaluated based on their complexity relative to the intended difficulty level.
**Metrics**
- **Training Loss**: The loss measured during training.
- **Validation Loss**: The loss measured on the validation dataset during training.
**Results**
- **Training and Validation Loss**: The model was evaluated based on training and validation loss over 8 epochs. The results indicate that the model's performance improved significantly after the first few epochs, with a steady decrease in both training and validation loss. The final validation loss achieved was 0.154888, indicating a good fit to the validation data without significant overfitting.
### Summary
**Model Examination**
- The model demonstrated a consistent reduction in both training and validation loss across the training epochs, suggesting effective learning and generalization from the provided dataset.
## Environmental Impact
**Carbon Emissions**
- **Hardware Type**: Tesla T4 GPU
- **Hours Used**: 12 hours
- **Cloud Provider**: Kaggle
- **Carbon Emitted**: Not measured; it can be estimated with the Machine Learning Impact calculator (Lacoste et al., 2019).
### Technical Specifications
**Model Architecture and Objective**
- The model is based on the LLaMA 3.1 architecture, fine-tuned to generate text in French for educational purposes, specifically math exercises.
**Compute Infrastructure**
- The model was trained on Kaggle’s free-tier environment, leveraging a single Tesla T4 GPU.
**Hardware**
- **GPU**: Tesla T4 with 16 GB of VRAM
**Software**
- **Transformers Version**: 4.44.0
- **PEFT Version**: 0.12.0
### Citation
**BibTeX**:
```bibtex
@misc{abril2024mathexercises,
  author    = {Sanaa Abril},
  title     = {Fine-Tuned LLaMA 3.1 for Generating Math Exercises},
  year      = {2024},
  publisher = {Hugging Face},
  note      = {\url{https://huggingface.co/sanaa-11/mathematic-exercice-generator}}
}
```
**APA**:
Abril, S. (2024). Fine-Tuned LLaMA 3.1 for Generating Math Exercises. Hugging Face. https://huggingface.co/sanaa-11/mathematic-exercice-generator
### More Information
- For further details or questions, feel free to reach out to the model card authors.
### Model Card Authors
- **Sanaa Abril** - sanaa.abril@gmail.com
### Framework versions
- **Transformers**: 4.44.0
- **PEFT**: 0.12.0