DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct

Fine-tuned DeepSeek-R1-Distill-Qwen-1.5B for instruction-following tasks using LoRA on the Alpaca dataset.

Overview

Base Model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B (1.5B parameters)
Fine-tuning Method: LoRA (4-bit quantization)
Dataset: Alpaca instruction dataset (52K samples)
Training: 3 epochs with optimized hyperparameters

Key Features

Improved instruction following capabilities
Conversational AI for question answering
Memory efficient training with LoRA
Production-ready merged model

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("sweatSmile/DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct")
tokenizer = AutoTokenizer.from_pretrained("sweatSmile/DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct")

# Example
prompt = "Human: What is machine learning?\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training Details

LoRA rank: 8, alpha: 16
4-bit NF4 quantization with bfloat16
Learning rate: 1e-4 with cosine scheduling
Batch size: 8, Max length: 512 tokens

Trained for efficient deployment in production environments.

Downloads last month: 5

Safetensors

Model size

2B params

Tensor type

F32

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for sweatSmile/DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct

Base model

deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Adapter

(313)

this model

sweatSmile
/

DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct