---
language:
- en
license: apache-2.0
tags:
- text-generation
- instruction-tuning
- multi-task
- reasoning
- email
- summarization
- chat
- peft
- lora
- qwen
- deepseek
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
datasets:
- HuggingFaceTB/smoltalk
- snoop2head/enron_aeslc_emails
- lucadiliello/STORIES
- abisee/cnn_dailymail
- wiki40b
model_type: causal-lm
inference: true
library_name: peft
pipeline_tag: text-generation
---
|
|
|
|
|
# 🧠 Deepseek-R1-multitask-lora

**Author:** Gilbert Akham

**License:** Apache-2.0

**Base model:** [`deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)

**Adapter type:** LoRA (PEFT)

**Capabilities:** Multi-task generalization & reasoning
|
|
|
|
|
---

## 🚀 What It Can Do

This multi-task fine-tuned model handles a broad set of natural language and reasoning tasks, such as:

- ✉️ **Email & message writing:** generate clear, friendly, or professional communications.
- 📖 **Story & creative writing:** craft imaginative narratives, poems, and dialogues.
- 💬 **Conversational chat:** maintain coherent, context-aware conversations.
- 💡 **Explanations & tutoring:** explain technical or abstract topics simply.
- 🧩 **Reasoning & logic tasks:** provide step-by-step answers to analytical questions.
- 💻 **Code generation & explanation:** write and explain Python or general programming code.
- 🌍 **Translation & summarization:** translate between multiple languages or condense information.

The model's multi-domain training (on datasets such as SmolTalk, Everyday Conversations, and reasoning-rich samples) makes it suitable for assistants, chatbots, content generators, and educational tools.
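To try the adapter, load the base model and attach the LoRA weights with 🤗 PEFT. The sketch below is illustrative rather than official: the adapter repository id is a placeholder for wherever this adapter is hosted, and the 4-bit settings simply mirror the precision listed under Training Details.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
adapter_id = "your-username/Deepseek-R1-multitask-lora"  # placeholder: replace with the actual adapter repo

# 4-bit quantization with FP16 compute, matching the training precision
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the LoRA adapter
model.eval()
```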
|
|
---

## 🧩 Training Details
|
|
|
|
|
| Parameter | Value |
|------------|-------|
| Base model | `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` |
| Adapter | LoRA (r=8, alpha=32, dropout=0.1) |
| Max sequence length | 1024 |
| Learning rate | 3e-5 (cosine decay) |
| Optimizer | `adamw_8bit` |
| Gradient accumulation steps | 4 |
| Precision | 4-bit quantized, FP16 compute |
| Steps | 12k total (best checkpoint at ~8.2k) |
| Training time | ~2.5 h on an A4000 |
| Frameworks | 🤗 Transformers, PEFT, TRL, BitsAndBytes |
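For readers who want to reproduce the setup, the adapter hyperparameters above correspond roughly to the following PEFT `LoraConfig`. This is a sketch, not the original training script: in particular, `target_modules` is an assumption, since the card does not state which projections were adapted.

```python
from peft import LoraConfig

# Sketch of the adapter configuration implied by the table above.
# NOTE: target_modules is an assumption; the attention projections commonly
# adapted on Qwen-family models are used here as a stand-in.
lora_config = LoraConfig(
    r=8,               # LoRA rank, as listed in the table
    lora_alpha=32,     # scaling factor
    lora_dropout=0.1,  # dropout on the LoRA layers
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```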
|
|
|
|
|
---

## 🧠 Reasoning Capability
|
|
|
|
|
Thanks to the integration of **SmolTalk** and diverse multi-task prompts, the model learns:

- **Chain-of-thought style reasoning**
- **Conversational grounding**
- **Multi-step logical inference**
- **Instruction following** across domains
|
|
|
|
|
Example:

```text
### Task: Explain reasoning

### Input:
If a train leaves City A at 3 PM and arrives at City B at 6 PM, covering 180 km, what is its average speed?

### Output:
The train travels 180 km in 3 hours.
Average speed = 180 ÷ 3 = 60 km/h.
```
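To prompt the model in the same way, the `### Task / ### Input / ### Output` layout shown above can be assembled as a plain string and passed to `generate`. A minimal sketch, assuming `model` and `tokenizer` were loaded as in the usage example earlier on this card; the sampling settings are illustrative, not recommended defaults.

```python
# Assumes `model` and `tokenizer` from the loading sketch earlier on this card.
prompt = (
    "### Task: Explain reasoning\n\n"
    "### Input:\n"
    "If a train leaves City A at 3 PM and arrives at City B at 6 PM, "
    "covering 180 km, what is its average speed?\n\n"
    "### Output:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=128,  # enough room for a short worked answer
    do_sample=True,
    temperature=0.7,     # illustrative sampling settings
)

# Decode only the newly generated tokens, skipping the prompt
new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```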