Update README.md
README.md
CHANGED
@@ -1,3 +1,54 @@
---
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
tags:
- instruction-following
- conversational-ai
- lora
- alpaca
- 4bit
- instruct
license: apache-2.0
datasets:
- tatsu-lab/alpaca
language:
- en
---

# DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct

Fine-tuned DeepSeek-R1-Distill-Qwen-1.5B for instruction-following tasks, trained with LoRA on the Alpaca dataset.

## Overview
- **Base Model:** deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B (1.5B parameters)
- **Fine-tuning Method:** LoRA with 4-bit quantization
- **Dataset:** Alpaca instruction dataset (52K samples), formatted as sketched below
- **Training:** 3 epochs (hyperparameters listed under Training Details)

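
Note on data formatting: the exact training-time template is not reproduced here, but the Usage example below follows a `Human:`/`Assistant:` turn format. As a rough, non-authoritative sketch (the helper name and sample record are purely illustrative), an Alpaca record with `instruction`, `input`, and `output` fields could be mapped onto that format like so:

```python
# Illustrative sketch only -- the actual training-time template may differ.
def format_alpaca(example: dict) -> str:
    """Map an Alpaca record onto the Human/Assistant prompt format."""
    instruction = example["instruction"]
    context = example.get("input", "")  # Alpaca's optional extra-context field
    prompt = f"{instruction}\n\n{context}" if context else instruction
    return f"Human: {prompt}\n\nAssistant: {example['output']}"

# Sample record mirroring the Alpaca schema (values are illustrative)
record = {
    "instruction": "Give three tips for staying healthy.",
    "input": "",
    "output": "Eat a balanced diet, exercise regularly, and get enough sleep.",
}
print(format_alpaca(record))
```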

## Key Features
- Improved instruction-following capabilities
- Conversational question answering
- Memory-efficient training with LoRA
- Production-ready merged model (see the merge sketch below)

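
The merged checkpoint is presumably the result of folding the LoRA adapter back into the base weights. A minimal sketch of how such a merge is typically done with `peft` (the adapter path below is a placeholder, not a published artifact):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Sketch only: attach the trained LoRA adapter to the base model and fold its
# weights in, so inference needs no PEFT dependency. The adapter path is a placeholder.
base = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")
merged = model.merge_and_unload()
merged.save_pretrained("DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct")
```

Merging keeps inference simple: the published repo loads with plain `transformers`, as shown in the Usage section below.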

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the merged model and tokenizer from the Hub
model = AutoModelForCausalLM.from_pretrained("sweatSmile/DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct")
tokenizer = AutoTokenizer.from_pretrained("sweatSmile/DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct")

# Example: ask a question using the Human/Assistant prompt format
prompt = "Human: What is machine learning?\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
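
For tighter memory budgets, the checkpoint can optionally be re-quantized to 4-bit at load time. This is an optional variant, not a requirement, and assumes the `bitsandbytes` package and a CUDA-capable GPU are available:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Optional: load the merged checkpoint in 4-bit NF4 to reduce inference memory.
# Assumes bitsandbytes is installed and a CUDA-capable GPU is present.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "sweatSmile/DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "sweatSmile/DeepSeek-R1-Distill-Qwen-1.5B-Alpaca-Instruct"
)
```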

## Training Details
- LoRA rank: 8, alpha: 16
- 4-bit NF4 quantization with bfloat16 compute
- Learning rate: 1e-4 with cosine scheduling
- Batch size: 8, max sequence length: 512 tokens

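
As a rough, non-authoritative reconstruction, these settings map onto `peft`/`bitsandbytes`/`transformers` configuration objects roughly as follows; the exact training script (target modules, optimizer, warmup, etc.) is not published, so those details are left at library defaults or omitted:

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, TrainingArguments

# Sketch inferred from the hyperparameters listed above, not the actual training script.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # 4-bit NF4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # bfloat16 compute
)
lora_config = LoraConfig(
    r=8,            # LoRA rank
    lora_alpha=16,  # LoRA alpha
    task_type="CAUSAL_LM",
)
training_args = TrainingArguments(
    output_dir="deepseek-r1-1.5b-alpaca-lora",  # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    bf16=True,
)
```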

The merged checkpoint is intended for efficient deployment in production environments.