---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: peft
datasets:
- sanaa-11/math-dataset
language:
- fr
---
|
|
# Model Card for LLaMA 3.1 Fine-Tuned Model |
|
|
|
|
|
## Model Details |
|
|
|
|
|
### Model Description |
|
|
- **Developed by**: Sanaa Abril |
|
|
- **Model Type**: Fine-tuned Causal Language Model |
|
|
- **Language(s) (NLP)**: French |
|
|
- **License**: |
|
|
- **Finetuned from model**: Meta LLaMA 3.1 8B Instruct |
|
|
|
|
|
### Model Sources
|
|
- **Repository**: https://huggingface.co/sanaa-11/mathematic-exercice-generator/tree/main |
|
|
|
|
## Uses |
|
|
|
|
|
### Direct Use |
|
|
- **Primary Application**: The model generates math exercises in French, tailored to Moroccan students, based on a specified lesson and difficulty level.
|
|
- **Example Use Case**: Educators can input lesson topics to generate corresponding exercises for classroom use or online learning platforms. |
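For illustration, a prompt states the desired framing, difficulty, and school level directly; this mirrors the quick-start example further below:

```python
# Illustrative prompt (the model expects French input); the framing,
# difficulty, and school level are stated directly in the request.
prompt = (
    "Fournis un exercice basé sur la vie réelle de difficulté moyenne "
    "de niveau 2ème année collège sur les fractions."
)
```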
|
|
|
|
|
### Downstream Use
|
|
- **Potential Applications**: The model can be extended or adapted to create exercises in other languages or for different educational levels. |
|
|
|
|
|
### Out-of-Scope Use |
|
|
- **Not Suitable For**: The model is not designed for high-stakes assessments, as it may generate exercises that require further validation by subject matter experts. |
|
|
|
|
|
## Bias, Risks, and Limitations |
|
|
- **Bias**: The model may inherit biases from the data it was trained on, potentially generating exercises that reflect unintended cultural or linguistic biases. |
|
|
- **Risks**: There is a risk of generating mathematically incorrect exercises or exercises that do not align with the intended curriculum. |
|
|
- **Limitations**: The model's accuracy and relevance may decrease when generating exercises outside of its training domain or when applied to advanced mathematical topics not covered during fine-tuning. |
|
|
|
|
|
### Recommendations |
|
|
- **For Educators**: It is recommended to review the generated exercises for correctness and relevance before using them in a classroom setting. |
|
|
- **For Developers**: Fine-tune the model further or adjust the training data to mitigate any biases and improve the quality of the generated content. |
|
|
|
|
|
## How to Get Started with the Model |
|
|
Use the following code to load the base model, attach the adapter, and load the tokenizer:
|
|
|
|
|
```python |
|
|
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Base model repository on the Hugging Face Hub
model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",               # adjust based on your environment
    offload_folder="./offload_dir",  # folder for CPU offloading if GPU memory is tight
    torch_dtype=torch.float16,       # float16 for better performance on compatible hardware
)

# Load the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, "sanaa-11/mathematic-exercice-generator")

# Load the tokenizer of the base model
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
|
|
``` |
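With the model and tokenizer loaded, you can generate an exercise iteratively, extending the context on each pass: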
|
|
```python
generated_text = ""
prompt = "Fournis un exercice basé sur la vie réelle de difficulté moyenne de niveau 2ème année collège sur les fractions."

for _ in range(5):
    inputs = tokenizer(prompt + generated_text, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_length=1065,           # total length in tokens, prompt included
        do_sample=True,            # required for temperature/top_p to take effect
        temperature=0.7,
        top_p=0.9,
        num_beams=5,
        repetition_penalty=1.2,
        no_repeat_ngram_size=2,
        pad_token_id=tokenizer.eos_token_id,
        early_stopping=False
    )
    # Decode only the newly generated tokens, not the prompt that was fed in
    new_text = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    generated_text += new_text
    print(new_text)
|
|
``` |
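Each iteration appends the newly generated text to the running context, so the model keeps extending the same exercise. Note that `max_length` counts the prompt as well as the generated tokens, and that `temperature` and `top_p` only take effect when `do_sample=True`.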
|
|
## Training Details |
|
|
|
|
|
### Training Data |
|
|
- **Dataset**: The model was fine-tuned on a custom dataset of 3.6K rows of math exercises, lesson content, and solutions, written in French and designed specifically for Moroccan students.
|
|
|
|
|
### Training Procedure |
|
|
|
|
|
#### Preprocessing
|
|
- **Data Cleaning**: Text normalization, tokenization, and padding were applied to prepare the data. |
|
|
- **Tokenization**: The text was processed with the tokenizer of the base Meta-Llama-3.1-8B-Instruct model (a sketch of this step follows below).
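A rough sketch of this step, for reference; the dataset field names (`exercice`, `solution`) and the maximum sequence length are assumptions, not the exact training code:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
tokenizer.pad_token = tokenizer.eos_token  # Llama defines no pad token by default

def preprocess(example):
    # Hypothetical field names: concatenate exercise prompt and solution
    text = example["exercice"] + "\n" + example["solution"]
    # Normalize whitespace, then tokenize with truncation and padding
    text = " ".join(text.split())
    return tokenizer(text, truncation=True, max_length=1024, padding="max_length")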
|
|
|
|
|
### Training Hyperparameters |
|
|
- **Training Regime**: The model was fine-tuned with QLoRA (4-bit quantization plus LoRA adapters) to fit within limited GPU and RAM; training ran in a Kaggle environment with restricted resources (a configuration sketch follows this list).
|
|
- **Batch Size**: 1 (with gradient accumulation steps of 8) |
|
|
- **Number of Epochs**: 8 |
|
|
- **Learning Rate**: 5e-5 |
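A minimal sketch of this setup is shown below. The quantization settings and trainer hyperparameters mirror the values listed above; the LoRA rank, alpha, dropout, and target modules are assumptions, as the exact adapter configuration is not documented here:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization, as used by QLoRA
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter; rank/alpha/dropout/target modules are assumptions
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Hyperparameters from the list above
training_args = TrainingArguments(
    output_dir="./checkpoints",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=8,
    learning_rate=5e-5,
    fp16=True,
)
```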
|
|
|
|
|
|
|
|
## Evaluation |
|
|
|
|
|
### Testing Data, Factors & Metrics |
|
|
|
|
|
**Testing Data** |
|
|
- A separate subset of 10% of the dataset was reserved for evaluation. |
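For reference, a 90/10 hold-out of this kind can be reproduced along these lines (the split name and seed are assumptions):

```python
from datasets import load_dataset

# Load the fine-tuning dataset and hold out 10% for evaluation
dataset = load_dataset("sanaa-11/math-dataset", split="train")
splits = dataset.train_test_split(test_size=0.1, seed=42)  # seed is an assumption
train_ds, eval_ds = splits["train"], splits["test"]
```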
|
|
|
|
|
**Factors** |
|
|
- **Complexity of Generated Exercises**: Exercises were evaluated based on their complexity relative to the intended difficulty level. |
|
|
|
|
|
**Metrics** |
|
|
- **Training Loss**: The loss measured during training. |
|
|
- **Validation Loss**: The loss measured on the validation dataset during training. |
|
|
|
|
|
**Results** |
|
|
- **Training and Validation Loss**: Over the 8 training epochs, both training and validation loss decreased steadily, with the largest improvements in the first few epochs. The final validation loss was 0.154888, indicating a good fit to the validation data without significant overfitting.
|
|
|
|
|
### Summary |
|
|
**Model Examination** |
|
|
- The model demonstrated a consistent reduction in both training and validation loss across the training epochs, suggesting effective learning and generalization from the provided dataset. |
|
|
|
|
|
|
|
|
## Environmental Impact |
|
|
**Carbon Emissions** |
|
|
- **Hardware Type**: Tesla T4 GPU |
|
|
- **Hours Used**: 12
|
|
- **Cloud Provider**: Kaggle |
|
|
- **Carbon Emitted**: Not measured; it can be estimated with the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) (Lacoste et al., 2019).
|
|
|
|
|
### Technical Specifications
|
|
|
|
|
**Model Architecture and Objective** |
|
|
- The model is based on the LLaMA 3.1 architecture, fine-tuned to generate text in French for educational purposes, specifically math exercises. |
|
|
|
|
|
**Compute Infrastructure** |
|
|
- The model was trained on Kaggle’s free-tier environment, leveraging a single Tesla T4 GPU. |
|
|
|
|
|
**Hardware** |
|
|
- **GPU**: Tesla T4 with 16 GB VRAM
|
|
|
|
|
|
|
|
**Software** |
|
|
- **Transformers Version**: 4.44.0 |
|
|
- **PEFT Version**: 0.12.0 |
|
|
|
|
|
### Citation
|
|
|
|
|
**BibTeX**: |
|
|
|
|
|
```bibtex |
|
|
@misc{abril_2024_math_exercises,
|
|
author = {Sanaa Abril}, |
|
|
title = {Fine-Tuned LLaMA 3.1 for Generating Math Exercises}, |
|
|
year = {2024}, |
|
|
publisher = {Hugging Face}, |
|
|
note = {\url{https://huggingface.co/sanaa-11/mathematic-exercice-generator}} |
|
|
}
```
|
|
**APA**: |
|
|
Abril, S. (2024). Fine-Tuned LLaMA 3.1 for Generating Math Exercises. Hugging Face. https://huggingface.co/sanaa-11/mathematic-exercice-generator |
|
|
|
|
|
### More Information
|
|
- For further details or questions, feel free to reach out to the model card authors. |
|
|
|
|
|
### Model Card Authors
|
|
- **Sanaa Abril** - sanaa.abril@gmail.com |
|
|
|
|
|
### Framework versions |
|
|
- **Transformers**: 4.44.0 |
|
|
- **PEFT**: 0.12.0 |