---
license: llama2
base_model: codellama/CodeLlama-7b-Instruct-hf
tags:
- fine-tuned
- educational
- qa
- code
- llama
- peft
- lora
language:
- en
pipeline_tag: text-generation
library_name: peft
---

# CodeLLaMa7B-FineTuned-byMoomen

This model is a fine-tuned version of [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf) using LoRA (Low-Rank Adaptation) for educational Q&A tasks.

## Model Details

- **Base Model**: codellama/CodeLlama-7b-Instruct-hf
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **LoRA Rank**: 32
- **LoRA Alpha**: 64
- **Target Modules**: `gate_proj`, `lm_head`, `k_proj`, `q_proj`, `up_proj`, `down_proj`, `v_proj`, `o_proj`
- **Training Focus**: Educational programming Q&A
- **Model Type**: Causal Language Model
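
To give a sense of the adapter's size, the sketch below estimates how many trainable parameters the rank and target modules listed above imply. The layer dimensions (hidden size 4096, MLP size 11008, 32 layers, vocabulary 32016) are assumptions based on the publicly documented CodeLlama-7B configuration, not values stated in this card, and the `alpha / r` scaling assumes standard LoRA scaling rather than a variant such as rsLoRA.

```python
# Rough estimate of the trainable parameters added by the LoRA adapter.
RANK = 32   # from this model card
ALPHA = 64  # from this model card

# ASSUMED base-model dimensions; verify against the base model's config.json.
HIDDEN, MLP, LAYERS, VOCAB = 4096, 11008, 32, 32016

# (in_features, out_features) of each targeted projection inside one layer
PER_LAYER_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}
LM_HEAD_SHAPE = (HIDDEN, VOCAB)  # lm_head appears once, not per layer


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in) and B (out x rank) to each adapted linear layer."""
    return rank * (in_features + out_features)


def total_lora_params(rank: int = RANK) -> int:
    """Sum adapter parameters over all layers plus the output head."""
    per_layer = sum(lora_params(i, o, rank) for i, o in PER_LAYER_SHAPES.values())
    return LAYERS * per_layer + lora_params(*LM_HEAD_SHAPE, rank)


scaling = ALPHA / RANK  # standard LoRA scales the update B @ A by alpha / r
print(f"scaling factor: {scaling}")
print(f"approx. adapter parameters: {total_lora_params():,}")
```

Treat the result as a rough estimate; the tensor shapes stored in `adapter_model.safetensors` are authoritative.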

## Usage

### Quick Start

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the LoRA adapter on top of its base model, plus the base tokenizer
model = AutoPeftModelForCausalLM.from_pretrained("Moomen123Msaadi/CodeLLaMa7B-FineTuned-byMoomen")
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-Instruct-hf")

# Generate a response
prompt = "Explain recursion in programming"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### Chat Format Usage

```python
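# For educational Q&A conversations (reuses `model` and `tokenizer` from Quick Start)
messages = [
    {"role": "system", "content": "You are a helpful educational assistant."},
    {"role": "user", "content": "What is the difference between lists and tuples in Python?"}
]

formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# The chat template already inserts special tokens such as BOS, so do not add them again
inputs = tokenizer(formatted_prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=300)
# Decode only the newly generated tokens, skipping the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(response)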
```
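
To make the chat format more concrete, here is a minimal sketch of what a single-turn prompt roughly looks like once rendered, assuming the base tokenizer ships the Llama 2 style instruct template. The helper name `build_llama2_prompt` is hypothetical, the BOS token is left out, and `tokenizer.apply_chat_template` remains the authoritative source for the exact layout.

```python
def build_llama2_prompt(system: str, user: str) -> str:
    """Approximate a single-turn Llama 2 style instruct prompt (assumed layout).

    The system message is wrapped in <<SYS>> markers and placed inside the
    first [INST] block, followed by the user message.
    """
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"


prompt = build_llama2_prompt(
    "You are a helpful educational assistant.",
    "What is the difference between lists and tuples in Python?",
)
print(prompt)
```

Comparing this string with the output of `apply_chat_template` is a quick way to check which template your installed tokenizer version actually uses.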

### Memory-Efficient Loading

```python
# For systems with limited VRAM (requires the bitsandbytes package)
import torch
from peft import AutoPeftModelForCausalLM
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16
)

model = AutoPeftModelForCausalLM.from_pretrained(
    "Moomen123Msaadi/CodeLLaMa7B-FineTuned-byMoomen",
    quantization_config=quantization_config,
    device_map="auto"
)
```
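
As a rough guide to why 4-bit loading helps, the sketch below estimates the memory needed for the weights alone at different precisions. The parameter count of 7e9 is an assumed round figure for a "7B" model, and the estimate ignores activations, the KV cache, and quantization overhead, so real usage will be higher.

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Memory needed to store the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


N_PARAMS = 7e9  # ASSUMED round figure for a "7B" model

for label, bits in [("float32", 32), ("float16", 16), ("4-bit", 4)]:
    print(f"{label:>8}: ~{weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```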

## Training Details

This model was fine-tuned using:
- **Parameter-Efficient Fine-Tuning (PEFT)** with LoRA
- **An educational conversation dataset** focused on programming concepts
- **A Q&A conversation format** with system/user/assistant roles

## Intended Use

This model is designed for:
- 📚 Educational programming Q&A
- 💡 Concept explanations in computer science
- 🔧 Code debugging assistance
- 🎓 Technical tutoring and learning support

## Limitations

- Inherits the limitations of its base model, codellama/CodeLlama-7b-Instruct-hf
- Optimized for educational content; may not perform well on other tasks
- Distributed as LoRA adapters only, so the base model is required for inference
- Performance depends on the quality of the training data

## Model Architecture

This repository contains a LoRA adapter, not full model weights, so it must be loaded on top of the base model. The adapter files are:
- `adapter_config.json`: LoRA configuration
- `adapter_model.safetensors`: Trained LoRA weights
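
To check these settings against a downloaded copy of the adapter, a minimal sketch like the following reads the main fields from `adapter_config.json`. The helper name `summarize_adapter_config` and the local path are hypothetical; the keys are the ones PEFT normally writes for LoRA adapters.

```python
import json
from pathlib import Path


def summarize_adapter_config(path: str) -> dict:
    """Return the main LoRA settings from a PEFT adapter_config.json."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    return {
        "base_model": config.get("base_model_name_or_path"),
        "peft_type": config.get("peft_type"),
        "rank": config.get("r"),
        "alpha": config.get("lora_alpha"),
        "target_modules": sorted(config.get("target_modules") or []),
    }


if __name__ == "__main__":
    # Hypothetical local path; point this at a downloaded copy of the adapter.
    config_path = Path("adapter_config.json")
    if config_path.exists():
        print(summarize_adapter_config(str(config_path)))
```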

## License

This model follows the same license as the base model: the Llama 2 Community License Agreement.

## Citation

If you use this model, please cite:

```bibtex
@misc{CodeLLaMa7B_FineTuned_byMoomen,
  title={CodeLLaMa7B-FineTuned-byMoomen},
  author={Moomen123Msaadi},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/Moomen123Msaadi/CodeLLaMa7B-FineTuned-byMoomen}
}
```