---
license: llama2
base_model: codellama/CodeLlama-7b-Instruct-hf
tags:
- fine-tuned
- educational
- qa
- code
- llama
- peft
- lora
language:
- en
pipeline_tag: text-generation
library_name: peft
---

# CodeLLaMa7B-FineTuned-byMoomen

This model is a fine-tuned version of [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf), trained with LoRA (Low-Rank Adaptation) for educational Q&A tasks.

## Model Details

- **Base Model**: codellama/CodeLlama-7b-Instruct-hf
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **LoRA Rank**: 32
- **LoRA Alpha**: 64
- **Target Modules**: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`, `lm_head`
- **Training Focus**: Educational programming Q&A
- **Model Type**: Causal Language Model

## Usage

### Quick Start

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the LoRA adapter together with its base model, plus the base tokenizer
model = AutoPeftModelForCausalLM.from_pretrained("Moomen123Msaadi/CodeLLaMa7B-FineTuned-byMoomen")
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-Instruct-hf")

# Generate a response
prompt = "Explain recursion in programming"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is needed for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### Chat Format Usage

```python
# For educational Q&A conversations (reuses `model` and `tokenizer` from Quick Start)
messages = [
    {"role": "system", "content": "You are a helpful educational assistant."},
    {"role": "user", "content": "What is the difference between lists and tuples in Python?"}
]

# Render the messages with the tokenizer's chat template
formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=300)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### Memory-Efficient Loading

```python
# For systems with limited VRAM: load the base weights in 4-bit precision
import torch
from peft import AutoPeftModelForCausalLM
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16
)

model = AutoPeftModelForCausalLM.from_pretrained(
    "Moomen123Msaadi/CodeLLaMa7B-FineTuned-byMoomen",
    quantization_config=quantization_config,
    device_map="auto"
)
```

## Training Details

This model was fine-tuned using:

- **Parameter-Efficient Fine-Tuning (PEFT)** with LoRA
- **Educational conversation dataset** focused on programming concepts
- **Optimized for Q&A format** with system/user/assistant roles

## Intended Use

This model is designed for:

- 📚 Educational programming Q&A
- 💡 Concept explanations in computer science
- 🔧 Code debugging assistance
- 🎓 Technical tutoring and learning support

## Limitations

- Inherits the limitations of its base model, codellama/CodeLlama-7b-Instruct-hf
- Optimized for educational content; it may not perform well on other tasks
- The repository contains only the LoRA adapter, so the base model is required for inference
- Performance depends on the quality of the training data

## Model Architecture

This is a LoRA adapter that must be loaded on top of the base model. The adapter files are:

- `adapter_config.json`: LoRA configuration
- `adapter_model.safetensors`: Trained LoRA weights

## License

This model follows the same license as the base model: the Llama 2 Community License Agreement.

## Citation

If you use this model, please cite:

```bibtex
@misc{CodeLLaMa7B_FineTuned_byMoomen,
  title={CodeLLaMa7B-FineTuned-byMoomen},
  author={Moomen123Msaadi},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/Moomen123Msaadi/CodeLLaMa7B-FineTuned-byMoomen}
}
```
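
## Appendix: LoRA Configuration Arithmetic

The LoRA settings listed under Model Details (rank 32, alpha 64, eight target modules) determine how strongly the adapter update is scaled and roughly how many parameters the adapter holds. The sketch below works through that arithmetic in plain Python. It is an estimate, not a measurement of the published adapter: the layer dimensions (`HIDDEN`, `INTERMEDIATE`, `VOCAB`, `NUM_LAYERS`) are assumptions based on the public CodeLlama-7B configuration, the helper names are hypothetical, and the scaling formula assumes standard LoRA (alpha divided by rank) rather than a rank-stabilized variant.

```python
# Sketch only: dimensions are assumed from the public CodeLlama-7B config,
# not read from this repository's adapter files.
HIDDEN = 4096          # assumed hidden size
INTERMEDIATE = 11008   # assumed MLP intermediate size
VOCAB = 32016          # assumed vocabulary size
NUM_LAYERS = 32        # assumed number of decoder layers

RANK = 32    # LoRA rank from Model Details
ALPHA = 64   # LoRA alpha from Model Details

# (in_features, out_features) for each targeted linear layer in one decoder block
PER_LAYER_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}
LM_HEAD_SHAPE = (HIDDEN, VOCAB)  # lm_head appears once, outside the decoder blocks


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


def adapter_param_count(rank: int = RANK) -> int:
    """Estimated LoRA parameters across all decoder blocks plus lm_head."""
    per_layer = sum(
        lora_param_count(d_in, d_out, rank)
        for d_in, d_out in PER_LAYER_SHAPES.values()
    )
    return NUM_LAYERS * per_layer + lora_param_count(*LM_HEAD_SHAPE, rank)


# Standard LoRA multiplies the low-rank update B @ A by alpha / rank
scaling = ALPHA / RANK
total = adapter_param_count()

print(f"LoRA scaling factor: {scaling}")
print(f"Estimated adapter parameters: {total:,}")
```

Under these assumptions the adapter is a small fraction of the 7B-parameter base model, which is why the repository ships only `adapter_config.json` and `adapter_model.safetensors` and relies on the base model at inference time.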