---
base_model: microsoft/Phi-4-mini-instruct
library_name: peft
tags:
- text-generation
- instruction-tuning
- lora
- fine-tuned
- phi-4
- pytorch
- transformers
license: mit
language:
- en
pipeline_tag: text-generation
inference: true
---

# Model Card for Phi-4 LoRA Fine-tuned Model

This model is a LoRA fine-tuned version of Microsoft's Phi-4-mini-instruct, optimized for code review tasks using GitHub data.

## Model Details

### Model Description

This is a fine-tuned version of Microsoft's Phi-4-mini-instruct model using the LoRA (Low-Rank Adaptation) technique. The model was trained on roughly 10,000 instruction-response pairs to improve its ability to follow instructions and generate high-quality responses across a range of tasks.

The model uses 4-bit NF4 quantization for efficient inference while maintaining output quality, making it a lightweight yet capable language model for a variety of text generation tasks.

- **Developed by:** Milos Kotlar
- **Model type:** Causal Language Model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** microsoft/Phi-4-mini-instruct

### Model Sources

- **Repository:** https://github.com/kotlarmilos/phi4-finetuned
- **Demo:** https://huggingface.co/spaces/kotlarmilos/dotnet-runtime

## Uses

### Direct Use

The model is designed for:

- **Instruction Following**: Generate responses to user instructions and queries
- **Conversational AI**: Engage in multi-turn conversations
- **Task Completion**: Help with text-based tasks such as summarization, explanation, and creative writing
- **Educational Support**: Provide explanations and assistance for learning

### Downstream Use

The model can be integrated into:

- **Chatbot Applications**: Web applications, mobile apps, and customer service systems
- **Content Generation Tools**: Writing assistants and creative content platforms
- **Educational Platforms**: Tutoring systems and interactive learning environments
- **API Services**: Text generation services and intelligent automation workflows

### Out-of-Scope Use

The model is **not intended for**:

- **Factual Information Retrieval**: May generate plausible but incorrect information
- **Professional Medical/Legal Advice**: Not qualified for specialized professional guidance
- **Real-time Critical Systems**: Not suitable for safety-critical applications
- **Harmful Content Generation**: Should not be used to create misleading, harmful, or malicious content

## How to Get Started with the Model

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Base model on the Hugging Face Hub, and path to the LoRA adapter weights
# (replace lora_path with this repo's Hub ID if loading the adapter remotely)
base_model = "microsoft/Phi-4-mini-instruct"
lora_path = "artifacts/phi4-finetuned"

# 4-bit NF4 quantization for memory-efficient inference
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)
base = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Apply the LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(base, lora_path)

# Generate text
def generate(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example usage
prompt = "Review the following code changes:"
response = generate(prompt)
print(response)
```
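Since Phi-4-mini-instruct is an instruction-tuned chat model, prompts typically work better when wrapped in the model's chat template rather than passed as raw text. Below is a minimal sketch assuming the `tokenizer` and `generate` defined in the snippet above; the message content is a made-up example.

```python
# Sketch: wrap a code-review request in the model's chat template before generating.
# Assumes `tokenizer` and `generate` are already defined as in the snippet above.
messages = [
    {
        "role": "user",
        "content": "Review the following code change:\n-    x = x + 1\n+    x += 1",
    },
]

# apply_chat_template inserts the special tokens the model was trained with;
# add_generation_prompt=True appends the assistant turn marker so the model
# continues with its reply rather than the user's text.
chat_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

print(generate(chat_prompt))
```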
## Training Details

### Training Data

The model was fine-tuned on approximately 10,000 high-quality instruction-response pairs designed to improve its ability to follow instructions and generate helpful, accurate responses across various domains.

**Data Characteristics**:

- **Size**: ~10,000 instruction-response pairs
- **Format**: Structured instruction-following conversations
- **Coverage**: Diverse topics and instruction types

### Training Procedure

#### Preprocessing

1. **Data Preparation**: Instruction-response pairs formatted for causal language modeling
2. **Tokenization**: Text processed with Phi-4's tokenizer and appropriate special tokens
3. **Sequence Formatting**: Sequences formatted for instruction-following tasks
4. **Quality Filtering**: Removal of low-quality or potentially harmful content

#### Training Hyperparameters

**LoRA Configuration** (a `peft` sketch of this configuration appears at the end of this card):

- **LoRA Rank (r)**: 8
- **LoRA Alpha**: 16
- **LoRA Dropout**: 0.05
- **Target Modules**: ["qkv_proj", "gate_up_proj"]
- **Task Type**: CAUSAL_LM

**Training Setup**:

- **Base Model**: microsoft/Phi-4-mini-instruct
- **Training Method**: LoRA (Low-Rank Adaptation)
- **Quantization**: 4-bit NF4 with BitsAndBytes
- **Training regime**: Mixed-precision training

## Usage Examples

For additional usage examples, please refer to https://github.com/kotlarmilos/phi4-finetuned.
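For reference, the LoRA configuration listed under Training Hyperparameters corresponds roughly to the following `peft` setup. This is an illustrative sketch, not the exact training script (which lives in the repository above); dataset loading and the trainer wiring are omitted.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantized base model, matching the training setup described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-4-mini-instruct",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Standard preparation step for training on a k-bit quantized base
base = prepare_model_for_kbit_training(base)

# LoRA hyperparameters exactly as listed in the Training Hyperparameters section
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "gate_up_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

With rank 8 and these two target modules, only a small fraction of the model's parameters is trainable, which is what makes fine-tuning feasible on top of a 4-bit quantized base.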