
🧠 Fine-Tuning DeepSeek R1 with Unsloth on Alpaca-GPT4 Dataset

This project demonstrates how to fine-tune the unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit model using the vicgalle/alpaca-gpt4 dataset with LoRA and Unsloth's efficient training interface.


πŸš€ Model & Tokenizer Setup

from unsloth import FastLanguageModel

max_seq_length = 2048  # maximum context length; adjust to fit your GPU memory

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = None,          # auto-detect (e.g. bfloat16 on supported GPUs)
    load_in_4bit = True,   # weights are already bnb 4-bit quantized
)

πŸ”§ LoRA PEFT Configuration

model = FastLanguageModel.get_peft_model(
    model,
    r = 16,                     # LoRA rank
    target_modules = [          # attention and MLP projection layers
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    use_rslora = True,          # rank-stabilized LoRA scaling
)

πŸ“š Dataset: Alpaca-GPT4

We use a subset (5,000 rows) of the Alpaca-GPT4 dataset for quick fine-tuning:

from datasets import load_dataset
dataset = load_dataset("vicgalle/alpaca-gpt4", split="train[:5000]")
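Each row follows the Alpaca schema (instruction, input, output). A minimal sketch of a prompt-formatting step for such rows; the template string itself is an assumption for illustration, not necessarily the one used in this run:

```python
# Sketch: render one Alpaca-style row into a single training string.
# Field names (instruction / input / output) follow the alpaca-gpt4 schema;
# the template wording is an assumption.

ALPACA_TEMPLATE = """Below is an instruction that describes a task{context_note}.
Write a response that appropriately completes the request.

### Instruction:
{instruction}

{input_block}### Response:
{output}"""

def format_alpaca(example: dict) -> str:
    """Render one dataset row into a single prompt/response string."""
    has_input = bool(example.get("input", "").strip())
    return ALPACA_TEMPLATE.format(
        context_note=", paired with an input that provides further context"
        if has_input else "",
        instruction=example["instruction"],
        input_block=f"### Input:\n{example['input']}\n\n" if has_input else "",
        output=example["output"],
    )
```

A function like this can be applied with `dataset.map(...)` to produce a text column for the trainer.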

πŸ“‚ Output Directory

All model checkpoints and logs are saved in the outputs/ directory.
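The outputs/ directory comes from the trainer's output_dir. A minimal training sketch using trl's SFTTrainer (argument names per older trl versions; the hyperparameter values below are illustrative placeholders, not the run's actual settings, and `model`, `tokenizer`, and `dataset` are assumed from the steps above):

```python
from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",          # column holding the formatted prompt
    max_seq_length = 2048,
    args = TrainingArguments(
        output_dir = "outputs",           # checkpoints and logs land here
        per_device_train_batch_size = 2,  # placeholder hyperparameters
        gradient_accumulation_steps = 4,
        num_train_epochs = 1,
        learning_rate = 2e-4,
        logging_steps = 10,
    ),
)
trainer.train()
```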


🧠 Notes

  • Fine-tuning used LoRA with r=16 on 4-bit quantized weights.
  • Only 5k rows were used for fast iteration.
  • apply_chat_template() was used to match the conversational fine-tuning structure.
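To use apply_chat_template(), each Alpaca row first has to be mapped to the list of role/content messages the tokenizer expects. A sketch of that mapping (combining instruction and input into one user turn is an assumption):

```python
# Sketch: convert an Alpaca-style row into the chat-message list that
# tokenizer.apply_chat_template() accepts. Role names follow the standard
# chat schema; merging instruction + input into one user turn is an assumption.

def to_chat_messages(example: dict) -> list[dict]:
    user_content = example["instruction"]
    if example.get("input", "").strip():
        user_content += "\n\n" + example["input"]
    return [
        {"role": "user", "content": user_content},
        {"role": "assistant", "content": example["output"]},
    ]
```

The resulting list can then be passed to tokenizer.apply_chat_template(messages, tokenize=False) to produce the training text.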
