---
license: mit
base_model: microsoft/Phi-3-mini-4k-instruct
tags:
- phi3
- lora
- alpaca
- instruction-tuning
- fine-tuned
datasets:
- tatsu-lab/alpaca
language:
- en
pipeline_tag: text-generation
---

# Phi-3 Mini 4K Instruct - Alpaca LoRA Fine-tuned

This model is [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) fine-tuned with LoRA (Low-Rank Adaptation) on the [Alpaca dataset](https://huggingface.co/datasets/tatsu-lab/alpaca).

## Model Details

- **Base Model**: microsoft/Phi-3-mini-4k-instruct
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Dataset**: tatsu-lab/alpaca (52,002 instruction-following examples)
- **Training Duration**: ~1.24 hours
- **Final Training Loss**: 1.0445
- **Average Training Loss**: 1.0311

## Training Configuration

- **LoRA Rank**: 16
- **LoRA Alpha**: 32
- **LoRA Dropout**: 0.05
- **Target Modules**: qkv_proj, o_proj, gate_proj, up_proj, down_proj
- **Learning Rate**: 1e-5
- **Batch Size**: 2 (with 8 gradient accumulation steps, for an effective batch size of 16 per device)
- **Epochs**: 1
- **Precision**: bfloat16
- **Gradient Checkpointing**: Enabled

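The hyperparameters above could be expressed with the `peft` library roughly as follows. This is a minimal sketch, not the author's training script: `task_type="CAUSAL_LM"`, the helper names, and the single-device default behind the effective batch size are assumptions, while the remaining values are copied from the list.

```python
# Values copied from the Training Configuration list unless marked otherwise.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["qkv_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    "task_type": "CAUSAL_LM",  # assumption: not stated in this card
}

BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 8
NUM_EXAMPLES = 52_002


def lora_scaling(kwargs=LORA_KWARGS):
    """Standard LoRA scaling factor applied to the low-rank update: alpha / r."""
    return kwargs["lora_alpha"] / kwargs["r"]


def effective_batch_size(num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return BATCH_SIZE * GRAD_ACCUM_STEPS * num_devices


def build_lora_config():
    """Create a peft LoraConfig; peft is imported lazily so the rest runs without it."""
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)


if __name__ == "__main__":
    print("LoRA scaling:", lora_scaling())
    print("Effective batch size:", effective_batch_size())
    print("Approx. optimizer steps per epoch:", NUM_EXAMPLES // effective_batch_size())
```

With rank 16 and alpha 32 the adapter update is scaled by 2, and one pass over the 52,002 examples at an effective batch size of 16 comes to roughly 3,250 optimizer steps, assuming the full dataset was used for training.
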
## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True)

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Load LoRA adapters on top of the base model
model = PeftModel.from_pretrained(base_model, "johnlam90/phi3-mini-4k-instruct-alpaca-lora")
model.eval()

# Format prompt in the Alpaca instruction/response layout
prompt = "Give three tips for staying healthy."
formatted_prompt = f'''### Instruction:
{prompt}

### Response:
'''

# Generate (inputs are moved to the same device as the model)
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=False,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id
    )

# The decoded text includes the prompt, so keep only the part after the marker
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response.split("### Response:")[1].strip())
```

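The decoded output contains the prompt followed by the completion, which is why the usage snippet splits on the `### Response:` marker. A slightly more defensive version of that post-processing might look like the sketch below. The helper name `extract_response` and the choice to cut the text at a follow-on `### Instruction:` header are illustrative assumptions, not part of the published code.

```python
RESPONSE_MARKER = "### Response:"
INSTRUCTION_MARKER = "### Instruction:"


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker, without any follow-on turn."""
    # partition() avoids an IndexError when the marker is missing.
    _, found, tail = decoded.partition(RESPONSE_MARKER)
    if not found:
        return decoded.strip()
    # Drop anything generated after the model starts a new instruction or response block.
    for marker in (INSTRUCTION_MARKER, RESPONSE_MARKER):
        tail = tail.split(marker, 1)[0]
    return tail.strip()


if __name__ == "__main__":
    sample = "### Instruction:\nSay hi.\n\n### Response:\nHi there!\n\n### Instruction:\nNext"
    print(extract_response(sample))
```

In the usage snippet this would replace the final `split(...)[1]` expression, for example `print(extract_response(response))`.
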
## Performance

The model was checked for numerical stability and output quality, with the following safeguards and observations:

- ✅ NaN clamp protection for stable generation
- ✅ Proper bfloat16 precision handling
- ✅ Consistent and coherent responses across multiple test prompts
- ✅ No numerical instabilities observed during training or inference

## Training Details

This model was fine-tuned with careful attention to:

1. **Data Formatting**: Proper Alpaca instruction/input/output structure
2. **Numerical Stability**: Using bfloat16 precision and conservative hyperparameters
3. **Memory Efficiency**: Gradient checkpointing and optimized batch sizes
4. **Safety Measures**: NaN protection and proper token handling

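As an illustration of the data-formatting point, a record with the dataset's `instruction`, `input`, and `output` fields could be rendered as shown below. This is a sketch under assumptions: the function name `format_alpaca_example` is hypothetical, the `### Input:` header follows the common Alpaca convention, and the exact template used for this fine-tune is not stated in this card.

```python
def format_alpaca_example(instruction: str, input_text: str = "", output: str = "") -> str:
    """Render one Alpaca-style record as a single training or inference string."""
    parts = [f"### Instruction:\n{instruction}\n\n"]
    # The input block is only emitted when the record actually has context.
    if input_text.strip():
        parts.append(f"### Input:\n{input_text}\n\n")
    # Leaving output empty yields an inference prompt that ends at the response header.
    parts.append(f"### Response:\n{output}")
    return "".join(parts)


if __name__ == "__main__":
    print(format_alpaca_example("Give three tips for staying healthy."))
```

Called with only an instruction, the function reproduces the prompt layout from the Usage section, so the same helper can serve both training-time formatting and inference.
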
## License

This model is released under the MIT license, following the base model's licensing terms.