---
base_model: microsoft/Phi-4-mini-instruct
library_name: peft
tags:
- text-generation
- instruction-tuning
- lora
- fine-tuned
- phi-4
- pytorch
- transformers
license: mit
language:
- en
pipeline_tag: text-generation
inference: true
---

# Model Card for Phi-4 LoRA Fine-tuned Model

This model is a LoRA fine-tuned version of Microsoft's Phi-4-mini-instruct, optimized for code review tasks using GitHub data.

## Model Details

### Model Description

This is a fine-tuned version of Microsoft's Phi-4-mini-instruct model using the LoRA (Low-Rank Adaptation) technique. The model was trained on roughly 10,000 instruction-response pairs to improve its ability to follow instructions and generate high-quality responses across a range of tasks.

The model uses 4-bit NF4 quantization for efficient inference while maintaining output quality, making it a lightweight yet capable language model for a variety of text generation tasks.

- **Developed by:** Milos Kotlar
- **Model type:** Causal Language Model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** microsoft/Phi-4-mini-instruct

### Model Sources

- **Repository:** https://github.com/kotlarmilos/phi4-finetuned
- **Demo:** https://huggingface.co/spaces/kotlarmilos/dotnet-runtime

## Uses

### Direct Use

The model is designed for:

- **Instruction Following**: Generate responses to user instructions and queries
- **Conversational AI**: Engage in multi-turn conversations
- **Task Completion**: Help with text-based tasks such as summarization, explanation, and creative writing
- **Educational Support**: Provide explanations and assistance for learning

### Downstream Use

The model can be integrated into:

- **Chatbot Applications**: Web applications, mobile apps, and customer service systems
- **Content Generation Tools**: Writing assistants and creative content platforms
- **Educational Platforms**: Tutoring systems and interactive learning environments
- **API Services**: Text generation services and intelligent automation workflows

### Out-of-Scope Use

The model is **not intended for**:

- **Factual Information Retrieval**: May generate plausible but incorrect information
- **Professional Medical/Legal Advice**: Not qualified for specialized professional guidance
- **Real-time Critical Systems**: Not suitable for safety-critical applications
- **Harmful Content Generation**: Should not be used to create misleading, harmful, or malicious content

## How to Get Started with the Model

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Base model on the Hugging Face Hub, and path to the LoRA adapter weights
# (replace lora_path with this repo's Hub ID if loading the adapter remotely)
base_model = "microsoft/Phi-4-mini-instruct"
lora_path = "artifacts/phi4-finetuned"

# 4-bit NF4 quantization for memory-efficient inference
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)
base = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Apply the LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(base, lora_path)

# Generate text
def generate(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example usage
prompt = "Review the following code changes:"
response = generate(prompt)
print(response)
```
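Since Phi-4-mini-instruct is an instruction-tuned chat model, prompts typically work better when wrapped in the model's chat template rather than passed as raw text. Below is a minimal sketch assuming the `tokenizer` and `generate` defined in the snippet above; the message content is a made-up example.

```python
# Sketch: wrap a code-review request in the model's chat template before generating.
# Assumes `tokenizer` and `generate` are already defined as in the snippet above.
messages = [
    {
        "role": "user",
        "content": "Review the following code change:\n-    x = x + 1\n+    x += 1",
    },
]

# apply_chat_template inserts the special tokens the model was trained with;
# add_generation_prompt=True appends the assistant turn marker so the model
# continues with its reply rather than the user's text.
chat_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

print(generate(chat_prompt))
```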
## Training Details

### Training Data

The model was fine-tuned on approximately 10,000 high-quality instruction-response pairs designed to improve its ability to follow instructions and generate helpful, accurate responses across various domains.

**Data Characteristics**:

- **Size**: ~10,000 instruction-response pairs
- **Format**: Structured instruction-following conversations
- **Coverage**: Diverse topics and instruction types

### Training Procedure

#### Preprocessing

1. **Data Preparation**: Instruction-response pairs formatted for causal language modeling
2. **Tokenization**: Text processed with Phi-4's tokenizer and appropriate special tokens
3. **Sequence Formatting**: Sequences formatted for instruction-following tasks
4. **Quality Filtering**: Removal of low-quality or potentially harmful content

#### Training Hyperparameters

**LoRA Configuration** (a `peft` sketch of this configuration appears at the end of this card):

- **LoRA Rank (r)**: 8
- **LoRA Alpha**: 16
- **LoRA Dropout**: 0.05
- **Target Modules**: ["qkv_proj", "gate_up_proj"]
- **Task Type**: CAUSAL_LM

**Training Setup**:

- **Base Model**: microsoft/Phi-4-mini-instruct
- **Training Method**: LoRA (Low-Rank Adaptation)
- **Quantization**: 4-bit NF4 with BitsAndBytes
- **Training regime**: Mixed-precision training

## Usage Examples

For additional usage examples, please refer to https://github.com/kotlarmilos/phi4-finetuned.
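For reference, the LoRA configuration listed under Training Hyperparameters corresponds roughly to the following `peft` setup. This is an illustrative sketch, not the exact training script (which lives in the repository above); dataset loading and the trainer wiring are omitted.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantized base model, matching the training setup described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-4-mini-instruct",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Standard preparation step for training on a k-bit quantized base
base = prepare_model_for_kbit_training(base)

# LoRA hyperparameters exactly as listed in the Training Hyperparameters section
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "gate_up_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

With rank 8 and these two target modules, only a small fraction of the model's parameters is trainable, which is what makes fine-tuning feasible on top of a 4-bit quantized base.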