johnlam90
/

phi3-mini-4k-instruct-alpaca-lora

Text Generation

instruction-tuning

Model card Files Files and versions

phi3-mini-4k-instruct-alpaca-lora / README.md

johnlam90's picture

Upload fine-tuned Phi-3 Mini with LoRA adapters

2426ccb verified 6 months ago

|

history blame contribute delete

3.02 kB

	---
	license: mit
	base_model: microsoft/Phi-3-mini-4k-instruct
	tags:
	- phi3
	- lora
	- alpaca
	- instruction-tuning
	- fine-tuned
	datasets:
	- tatsu-lab/alpaca
	language:
	- en
	pipeline_tag: text-generation
	---

	# Phi-3 Mini 4K Instruct - Alpaca LoRA Fine-tuned

	This model is a fine-tuned version of [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) using LoRA (Low-Rank Adaptation) on the [Alpaca dataset](https://huggingface.co/datasets/tatsu-lab/alpaca).

	## Model Details

	- Base Model: microsoft/Phi-3-mini-4k-instruct
	- Fine-tuning Method: LoRA (Low-Rank Adaptation)
	- Dataset: tatsu-lab/alpaca (52,002 instruction-following examples)
	- Training Duration: ~1.24 hours
	- Final Training Loss: 1.0445
	- Average Training Loss: 1.0311

	## Training Configuration

	- LoRA Rank: 16
	- LoRA Alpha: 32
	- LoRA Dropout: 0.05
	- Target Modules: qkv_proj, o_proj, gate_proj, up_proj, down_proj
	- Learning Rate: 1e-5
	- Batch Size: 2 (with gradient accumulation steps: 8)
	- Epochs: 1
	- Precision: bfloat16
	- Gradient Checkpointing: Enabled

	## Usage

	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM
	from peft import PeftModel

	# Load tokenizer
	tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True)

	# Load base model
	base_model = AutoModelForCausalLM.from_pretrained(
	"microsoft/Phi-3-mini-4k-instruct",
	torch_dtype=torch.bfloat16,
	device_map="auto",
	trust_remote_code=True
	)

	# Load LoRA adapters
	model = PeftModel.from_pretrained(base_model, "johnlam90/phi3-mini-4k-instruct-alpaca-lora")
	model.eval()

	# Format prompt
	prompt = "Give three tips for staying healthy."
	formatted_prompt = f'''### Instruction:
	{prompt}

	### Response:
	'''

	# Generate
	inputs = tokenizer(formatted_prompt, return_tensors="pt")
	with torch.no_grad():
	outputs = model.generate(
	**inputs,
	max_new_tokens=200,
	do_sample=False,
	eos_token_id=tokenizer.eos_token_id,
	pad_token_id=tokenizer.eos_token_id
	)

	response = tokenizer.decode(outputs[0], skip_special_tokens=True)
	print(response.split("### Response:")[1].strip())
	```

	## Performance

	The model has been tested with comprehensive safety measures including:
	- ✅ NaN clamp protection for stable generation
	- ✅ Proper bfloat16 precision handling
	- ✅ Consistent and coherent responses across multiple test prompts
	- ✅ No numerical instabilities during training or inference

	## Training Details

	This model was fine-tuned with careful attention to:
	1. Data Formatting: Proper Alpaca instruction/input/output structure
	2. Numerical Stability: Using bfloat16 precision and conservative hyperparameters
	3. Memory Efficiency: Gradient checkpointing and optimized batch sizes
	4. Safety Measures: NaN protection and proper token handling

	## License

	This model is released under the MIT license, following the base model's licensing terms.