Llama-3.1-8B LoRA - Alpaca Fine-tune

A LoRA adapter for Llama-3.1-8B fine-tuned on the Alpaca Cleaned dataset for instruction following.

Model Details

  • Base Model: unsloth/Llama-3.1-8B
  • Model Type: LoRA adapter (PEFT)
  • Language: English
  • License: Llama 3.1 Community License
  • Fine-tuning Method: Supervised Fine-Tuning (SFT) with LoRA

LoRA Configuration

Parameter       Value
r (rank)        16
alpha           16
dropout         0
target_modules  q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
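The table above corresponds to a `peft` `LoraConfig` along these lines (a sketch; the `bias` and `task_type` values are assumptions, as they are not stated in the card):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                  # LoRA rank
    lora_alpha=16,         # scaling factor
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",           # assumption: common default
    task_type="CAUSAL_LM",
)
```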

Training Details

Parameter         Value
Dataset           yahma/alpaca-cleaned
Training Samples  51,760
Training Steps    809
Final Loss        1.20
Training Time     ~1.7 hours
Precision         bf16
Peak GPU Memory   37.6 GB
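Each record in alpaca-cleaned has `instruction`, optional `input`, and `output` fields that are rendered into a single training prompt. A minimal formatting function using the standard Alpaca template (a sketch; the exact template used in training may differ in detail) looks like:

```python
def format_alpaca(example: dict) -> str:
    """Render an Alpaca-style record into a single training prompt."""
    if example.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['output']}"
        )
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['output']}"
    )

record = {"instruction": "Name three primary colors.", "input": "", "output": "Red, blue, yellow."}
prompt = format_alpaca(record)
```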

Evaluation Results

Evaluated on IFEval (Instruction Following Evaluation):

Metric                             Score
Prompt-level Strict Accuracy       22.4%
Prompt-level Loose Accuracy        22.4%
Instruction-level Strict Accuracy  36.7%
Instruction-level Loose Accuracy   36.7%
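Prompt-level accuracy counts a prompt as correct only if every instruction in it is followed, while instruction-level accuracy averages over individual instructions, so the latter is always at least as high. A toy illustration with hypothetical data:

```python
# One list of booleans per prompt: one entry per instruction (True = followed).
results = [
    [True, True],    # all followed  -> counts for prompt-level accuracy
    [True, False],   # one missed    -> prompt-level incorrect
    [False, False],
]

prompt_level = sum(all(r) for r in results) / len(results)            # 1/3
instruction_level = sum(sum(r) for r in results) / sum(len(r) for r in results)  # 3/6
```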

Usage

With Transformers + PEFT

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Llama-3.1-8B",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("unsloth/Llama-3.1-8B")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "dhlak/llama-3.1-8b-alpaca-lora")

# Generate. Build an Alpaca-style prompt (the format this adapter was trained
# on); base checkpoints typically ship without a chat template.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain quantum computing in simple terms.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

With Unsloth (Faster)

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="dhlak/llama-3.1-8b-alpaca-lora",
    max_seq_length=2048,
    load_in_4bit=True,  # set False to load in 16-bit instead
)
FastLanguageModel.for_inference(model)

# Generate. Use the same Alpaca-style prompt format the adapter was trained on.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain quantum computing in simple terms.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Limitations

  • This is a base model fine-tuned on Alpaca, not an instruct model. Response quality may vary.
  • The model inherits limitations from the base Llama-3.1-8B model.
  • Not suitable for production use without further evaluation and safety testing.

Framework Versions

  • PEFT: 0.18.1
  • Transformers: 4.x
  • Unsloth: latest