# Fine-Tuning DeepSeek R1 with Unsloth on the Alpaca-GPT4 Dataset
This project demonstrates how to fine-tune the unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit model using the vicgalle/alpaca-gpt4 dataset with LoRA and Unsloth's efficient training interface.
## Model & Tokenizer Setup
```python
from unsloth import FastLanguageModel

max_seq_length = 2048  # assumed value; the original snippet used this variable without defining it

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = None,          # auto-detect (float16 / bfloat16)
    load_in_4bit = True,   # load pre-quantized 4-bit weights
)
```
## LoRA PEFT Configuration
```python
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,                      # LoRA rank
    target_modules = [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    use_rslora = True,           # rank-stabilized LoRA scaling
)
```
## Dataset: Alpaca-GPT4
We use a subset (5,000 rows) of the Alpaca-GPT4 dataset for quick fine-tuning:
```python
from datasets import load_dataset

dataset = load_dataset("vicgalle/alpaca-gpt4", split="train[:5000]")
## Output Directory
All model checkpoints and logs are saved in the outputs/ directory.
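The training loop itself is not shown in this README; a plausible sketch using TRL's `SFTTrainer` follows (the standard pattern in Unsloth's example notebooks). All hyperparameter values below are illustrative assumptions — only `output_dir="outputs"` comes from this project:

```python
from trl import SFTTrainer
from transformers import TrainingArguments

# Assumes `model`, `tokenizer`, `dataset`, and `max_seq_length` from the
# earlier snippets, with the dataset already mapped to a "text" column.
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    args = TrainingArguments(
        per_device_train_batch_size = 2,   # assumed
        gradient_accumulation_steps = 4,   # assumed
        max_steps = 60,                    # assumed
        learning_rate = 2e-4,              # assumed
        logging_steps = 10,
        output_dir = "outputs",            # checkpoints and logs land here
    ),
)
trainer.train()
```

Note that newer TRL releases move `dataset_text_field` and `max_seq_length` into `SFTConfig`; adjust to match your installed version.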
## Notes
- Fine-tuning used LoRA with r=16 on 4-bit quantized weights.
- Only 5k rows were used for fast iteration.
- `apply_chat_template()` helped match the conversational fine-tuning structure.