---
license: llama2
base_model: codellama/CodeLlama-7b-Instruct-hf
tags:
- fine-tuned
- educational
- qa
- code
- llama
- peft
- lora
language:
- en
pipeline_tag: text-generation
library_name: peft
---
# CodeLLaMa7B-FineTuned-byMoomen
This model is a fine-tuned version of [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf) using LoRA (Low-Rank Adaptation) for educational Q&A tasks.
## Model Details
- **Base Model**: codellama/CodeLlama-7b-Instruct-hf
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **LoRA Rank**: 32
- **LoRA Alpha**: 64
- **Target Modules**: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`, `lm_head` (see the `LoraConfig` sketch below)
- **Training Focus**: Educational programming Q&A
- **Model Type**: Causal Language Model
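For reference, the hyperparameters above map onto a PEFT `LoraConfig` roughly as follows. This is a reconstruction for illustration only; `lora_dropout` is an assumption not stated on this card, and the authoritative values live in `adapter_config.json`:
```python
from peft import LoraConfig

# Reconstruction of the training configuration from the values listed above
lora_config = LoraConfig(
    r=32,                      # LoRA rank
    lora_alpha=64,             # LoRA scaling factor
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj", "lm_head",
    ],
    lora_dropout=0.05,         # assumed; not stated on this card
    task_type="CAUSAL_LM",
)
```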
## Usage
### Quick Start
```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the LoRA adapter together with its base model
model = AutoPeftModelForCausalLM.from_pretrained(
    "Moomen123Msaadi/CodeLLaMa7B-FineTuned-byMoomen",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-Instruct-hf")

# Generate a response; do_sample=True is required for temperature to take effect
prompt = "Explain recursion in programming"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
### Chat Format Usage
```python
# For educational Q&A conversations; reuses model and tokenizer from Quick Start
messages = [
    {"role": "system", "content": "You are a helpful educational assistant."},
    {"role": "user", "content": "What is the difference between lists and tuples in Python?"},
]
formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=300)
# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
```
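`apply_chat_template` relies on the chat template shipped with the base tokenizer. If you need to inspect or build the prompt by hand, CodeLlama-Instruct follows the standard Llama 2 convention; the snippet below is a sketch of that format, not output captured from this model:
```python
# Llama 2 instruct format: the system prompt is wrapped in <<SYS>> tags
# inside the first [INST] block; the tokenizer adds the leading <s> itself.
system = "You are a helpful educational assistant."
user = "What is the difference between lists and tuples in Python?"
manual_prompt = f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"
```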
### Memory-Efficient Loading
```python
# For systems with limited VRAM (requires the bitsandbytes package)
import torch
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoPeftModelForCausalLM.from_pretrained(
    "Moomen123Msaadi/CodeLLaMa7B-FineTuned-byMoomen",
    quantization_config=quantization_config,
    device_map="auto",
)
```
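If memory is still tight, NF4 quantization with double quantization is a common further step. These are standard `BitsAndBytesConfig` options, not settings validated against this particular adapter:
```python
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",        # normalized 4-bit floats, usually better than the fp4 default
    bnb_4bit_use_double_quant=True,   # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.float16,
)
```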
## Training Details
This model was fine-tuned using:
- **Parameter-Efficient Fine-Tuning (PEFT)** with LoRA
- **Educational conversation dataset** focused on programming concepts
- **Optimized for Q&A format** with system/user/assistant roles (see the illustrative example below)
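The dataset itself is not published here, but a record in the role-based format described above would look something like this (a hypothetical example, not a sample from the actual training set):
```python
# Purely illustrative record in the system/user/assistant format;
# the real training data for this model has not been published.
example = {
    "messages": [
        {"role": "system", "content": "You are a helpful educational assistant."},
        {"role": "user", "content": "What does Big-O notation measure?"},
        {"role": "assistant", "content": "Big-O notation describes how running time or memory grows with input size."},
    ]
}
```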
## Intended Use
This model is designed for:
- 📚 Educational programming Q&A
- 💡 Concept explanations in computer science
- 🔧 Code debugging assistance
- 🎓 Technical tutoring and learning support
## Limitations
- Inherits the limitations of its base model, codellama/CodeLlama-7b-Instruct-hf
- Optimized for educational content; may not perform well on other tasks
- This repository contains only LoRA adapters, so the base model must also be available at inference time
- Performance depends on the quality of the training data
## Model Architecture
This is a LoRA adapter that must be loaded on top of the base model (or merged into it, as sketched below). The adapter files are:
- `adapter_config.json`: LoRA configuration
- `adapter_model.safetensors`: Trained LoRA weights
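If you prefer a standalone checkpoint, PEFT can fold the adapter weights into the base model. This is a minimal sketch; the output directory name is just an example:
```python
from peft import AutoPeftModelForCausalLM

# Load the adapter with its base model, then merge the LoRA weights in
model = AutoPeftModelForCausalLM.from_pretrained("Moomen123Msaadi/CodeLLaMa7B-FineTuned-byMoomen")
merged_model = model.merge_and_unload()

# Save a self-contained model that no longer needs the peft library to load
merged_model.save_pretrained("codellama-7b-edu-merged")  # example output directory
```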
## License
This model is distributed under the same license as the base model: the Llama 2 Community License.
## Citation
If you use this model, please cite:
```bibtex
@misc{CodeLLaMa7B_FineTuned_byMoomen,
  title={CodeLLaMa7B-FineTuned-byMoomen},
  author={Moomen123Msaadi},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/Moomen123Msaadi/CodeLLaMa7B-FineTuned-byMoomen}
}
```