|
|
--- |
|
|
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0 |
|
|
library_name: peft |
|
|
pipeline_tag: text-generation |
|
|
--- |
|
|
|
|
|
# woym |
|
|
|
|
|
This model is a fine-tuned version of TinyLlama-1.1B-Chat-v1.0 specialized for educational interactions with young children. It aims to provide helpful, age-appropriate responses to questions and prompts from primary school students. |
|
|
|
|
|
## Model Details |
|
|
|
|
|
### Model Description |
|
|
|
|
|
This model was created by fine-tuning the TinyLlama-1.1B-Chat-v1.0 base model with the PEFT (Parameter-Efficient Fine-Tuning) library using the QLoRA technique. Fine-tuning focused on educational content tailored for young children, improving the model's ability to give clear, simple, and instructional responses suitable for primary education.
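
As a rough illustration, a QLoRA setup along these lines typically loads the base model in 4-bit precision and prepares it for adapter training. The snippet below is a minimal sketch with illustrative settings, not necessarily the exact configuration used for this model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

# Typical QLoRA quantization settings (illustrative, not the exact values used here)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load the base model in 4-bit and prepare it for parameter-efficient training
base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
    device_map="auto",
)
base_model = prepare_model_for_kbit_training(base_model)
tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
```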
|
|
|
|
|
- **Developed by:** Mohammad Ali |
|
|
- **Funded by:** Self-funded research project |
|
|
- **Model type:** Instruction-tuned causal language model with QLoRA fine-tuning |
|
|
- **Language(s):** English |
|
|
- **License:** Apache 2.0 (inherited from the base model, TinyLlama/TinyLlama-1.1B-Chat-v1.0)
|
|
- **Finetuned from model:** TinyLlama/TinyLlama-1.1B-Chat-v1.0 |
|
|
|
|
|
### Model Sources |
|
|
|
|
|
- **Repository:** https://github.com/mohammad17ali/woym.ai |
|
|
|
|
|
|
|
|
|
|
|
## Uses

### Direct Use
|
|
|
|
|
This model is designed for direct interaction with primary school children or for educational applications targeting young learners. It can be used to: |
|
|
|
|
|
- Answer basic educational questions |
|
|
- Explain simple concepts |
|
|
- Assist with homework in age-appropriate ways |
|
|
- Generate educational content for young children |
|
|
- Support teachers in creating learning materials |
|
|
|
|
|
### Downstream Use |
|
|
|
|
|
The model can be integrated into: |
|
|
- Educational applications and platforms |
|
|
- Classroom assistant tools |
|
|
- Interactive learning environments |
|
|
- Child-friendly chatbots |
|
|
- Educational content creation systems |
|
|
|
|
|
### Out-of-Scope Use |
|
|
|
|
|
This model is not designed for: |
|
|
- Providing medical, legal, or professional advice |
|
|
- Generating content for adult audiences |
|
|
- Addressing complex academic topics beyond primary education level |
|
|
- Handling sensitive topics that require nuanced understanding


- Making decisions in high-stakes scenarios
|
|
|
|
|
## Bias, Risks, and Limitations |
|
|
|
|
|
- **Limited knowledge base**: As a fine-tuned version of a 1.1B parameter model, it has significantly less knowledge than larger models. |
|
|
- **Simplified responses**: May oversimplify complex topics in ways that could create misconceptions. |
|
|
- **Language limitations**: Primarily trained on English data and educational contexts. |
|
|
- **Potential biases**: May reflect biases present in the educational dataset used for fine-tuning. |
|
|
- **Hallucination risk**: Like all language models, it may generate plausible-sounding but incorrect information. |
|
|
- **Limited context window**: Fine-tuning used a maximum sequence length of 512 tokens, which limits the model's ability to handle long prompts or extended conversations.
|
|
|
|
|
### Recommendations |
|
|
|
|
|
- Always review the model's outputs before sharing them with children |
|
|
- Provide clear instructions when prompting the model |
|
|
- Use the model as a supplementary tool rather than a primary educational resource |
|
|
- Be aware of the model's tendency to occasionally generate incorrect information |
|
|
- Consider deploying with human-in-the-loop oversight when used in educational settings |
|
|
|
|
|
## How to Get Started with the Model |
|
|
|
|
|
Use the code below to get started with the model: |
|
|
|
|
|
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer (the base model's tokenizer is reused)
tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Load the base model and attach the fine-tuned LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
model = PeftModel.from_pretrained(base_model, "path/to/your/model")  # path or Hub ID of the LoRA adapter
model.to(device)
model.eval()

# Generate text
def generate_text(prompt):
    formatted_prompt = f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n"
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(device)

    output = model.generate(
        **inputs,
        max_length=512,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
        repetition_penalty=1.2,
    )

    # Decode and keep only the assistant's turn
    generated_text = tokenizer.decode(output[0], skip_special_tokens=False)
    assistant_response = generated_text.split("<|im_start|>assistant\n")[-1].split("<|im_end|>")[0]
    return assistant_response

# Example usage
prompt = "Can you explain what photosynthesis is in simple terms?"
response = generate_text(prompt)
print(response)
```
|
|
|
|
|
|
|
|
## Training Details

### Training Data
|
|
|
|
|
This model was fine-tuned on the "ajibawa-2023/Education-Young-Children" dataset, which contains educational interactions between teachers and primary school students. The dataset includes a variety of educational topics appropriate for young learners. |
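
For reference, the dataset can be loaded from the Hugging Face Hub with the `datasets` library. This is a minimal sketch; the split name and any filtering applied for this model are assumptions:

```python
from datasets import load_dataset

# Load the educational dialogue dataset used for fine-tuning
dataset = load_dataset("ajibawa-2023/Education-Young-Children")

print(dataset)              # inspect the available splits and their sizes
print(dataset["train"][0])  # look at one example record (assuming a "train" split)
```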
|
|
|
|
|
### Training Procedure |
|
|
|
|
|
The model was fine-tuned using Parameter-Efficient Fine-Tuning (PEFT) with the QLoRA technique to reduce memory usage while maintaining response quality.
|
|
|
|
|
#### Preprocessing |
|
|
|
|
|
- Input data was formatted with special tokens to denote user and assistant turns |
|
|
- Prompts and responses were concatenated with appropriate markers |
|
|
- Tokenization was performed with a maximum sequence length of 512 tokens |
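
A minimal sketch of this preprocessing, reusing the `dataset` and `tokenizer` objects from the sketches above and assuming ChatML-style turn markers with hypothetical `question`/`response` field names (the actual column names in the dataset may differ):

```python
def format_and_tokenize(example):
    # Concatenate prompt and response with turn markers
    # ("question"/"response" are assumed field names)
    text = (
        f"<|im_start|>user\n{example['question']}<|im_end|>\n"
        f"<|im_start|>assistant\n{example['response']}<|im_end|>\n"
    )
    return tokenizer(text, truncation=True, max_length=512)

tokenized_dataset = dataset["train"].map(format_and_tokenize)
```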
|
|
|
|
|
#### Training Hyperparameters |
|
|
|
|
|
- **Training regime:** FP16 mixed precision |
|
|
- **Number of epochs:** 2 |
|
|
- **Learning rate:** 2e-5 |
|
|
- **Batch size:** 1 (with gradient accumulation) |
|
|
- **LoRA rank (r):** 8 |
|
|
- **LoRA alpha:** 32 |
|
|
- **LoRA dropout:** 0.05 |
|
|
- **Target modules:** q_proj, v_proj |
|
|
- **Warmup steps:** 100 |
|
|
- **Optimizer:** AdamW |
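
A minimal sketch of how these hyperparameters might map onto a PEFT `LoraConfig` and Hugging Face `TrainingArguments`, assuming a standard `Trainer`-based setup and reusing the prepared `base_model` from the QLoRA sketch above (the output directory and gradient accumulation steps are assumptions):

```python
from peft import LoraConfig, get_peft_model
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)

training_args = TrainingArguments(
    output_dir="woym-tinyllama-lora",  # hypothetical output directory
    num_train_epochs=2,
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,     # assumption; the exact value is not documented
    warmup_steps=100,
    fp16=True,
    optim="adamw_torch",
)
```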
|
|
|
|
|
#### Speeds, Sizes, Times |
|
|
|
|
|
- **Training time:** Approximately [X] hours on a P100 GPU |
|
|
- **Model size:** Base model (1.1B parameters) + 2-3MB for LoRA adapters |
|
|
- **Hardware used:** NVIDIA P100 GPU on Kaggle |
|
|
|
|
|
## Evaluation |
|
|
|
|
|
### Testing Data, Factors & Metrics |
|
|
|
|
|
#### Testing Data |
|
|
|
|
|
The model was evaluated on a held-out subset of the "ajibawa-2023/Education-Young-Children" dataset. |
|
|
|
|
|
#### Factors |
|
|
|
|
|
Evaluation considered: |
|
|
- Response relevance to educational queries |
|
|
- Age-appropriateness of language and content |
|
|
- Accuracy of educational information |
|
|
- Safety and appropriateness of content |
|
|
|
|
|
#### Metrics |
|
|
|
|
|
- Perplexity |
|
|
- Manual evaluation of response quality |
|
|
- Response coherence and helpfulness |
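
For the perplexity metric, a simple loss-based computation along these lines can be used. This is a minimal sketch; `eval_texts` is a hypothetical list of held-out, already-formatted examples:

```python
import math
import torch

def compute_perplexity(model, tokenizer, eval_texts, device="cpu"):
    # Average the per-example language-modeling loss over the held-out set, then exponentiate
    model.eval()
    losses = []
    with torch.no_grad():
        for text in eval_texts:
            inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512).to(device)
            outputs = model(**inputs, labels=inputs["input_ids"])
            losses.append(outputs.loss.item())
    return math.exp(sum(losses) / len(losses))
```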
|
|
|
|
|
### Results |
|
|
|
|
|
[You can add specific evaluation results here when available] |
|
|
|
|
|
## Environmental Impact |
|
|
|
|
|
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). |
|
|
|
|
|
- **Hardware Type:** NVIDIA P100 GPU |
|
|
- **Hours used:** Approximately [X] hours |
|
|
- **Cloud Provider:** Kaggle |
|
|
- **Compute Region:** [Your region] |
|
|
- **Carbon Emitted:** [Add estimation if available] |
|
|
|
|
|
## Technical Specifications |
|
|
|
|
|
### Model Architecture and Objective |
|
|
|
|
|
The model uses the TinyLlama architecture (1.1B parameters) with additional LoRA adapters applied to the attention layers. The objective was next-token prediction using a causal language modeling approach, specialized for educational content. |
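
With the LoRA adapters attached, the fraction of trainable parameters can be inspected directly (a minimal sketch, reusing the `model` object from the training sketch above):

```python
# Report how many parameters the LoRA adapters train on top of the frozen 1.1B base model
model.print_trainable_parameters()
# Prints something like: trainable params: ... || all params: ... || trainable%: ...
```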
|
|
|
|
|
### Compute Infrastructure |
|
|
|
|
|
#### Hardware |
|
|
|
|
|
- NVIDIA P100 GPU on Kaggle |
|
|
- 16GB GPU memory |
|
|
- 4 vCPUs |
|
|
|
|
|
#### Software |
|
|
|
|
|
- Python 3.10 |
|
|
- PyTorch 2.0+ |
|
|
- Transformers 4.30+ |
|
|
- PEFT 0.14.0 |
|
|
- Accelerate 0.20+ |
|
|
|
|
|
## Model Card Authors |
|
|
|
|
|
Mohammad Ali |
|
|
|
|
|
## Model Card Contact |
|
|
GitHub: https://github.com/mohammad17ali |
|
|
Email: mohammad.ali.goba@gmail.com