---
language:
- en
license: apache-2.0
tags:
- text-generation
- instruction-tuning
- multi-task
- reasoning
- email
- summarization
- chat
- peft
- lora
- qwen
- deepseek
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
datasets:
- HuggingFaceTB/smoltalk
- snoop2head/enron_aeslc_emails
- lucadiliello/STORIES
- abisee/cnn_dailymail
- wiki40b
model_type: causal-lm
inference: true
library_name: peft
pipeline_tag: text-generation
---

# 🧠 Deepseek-R1-multitask-lora

**Author:** Gilbert Akham  
**License:** Apache-2.0  
**Base model:** [`deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)  
**Adapter type:** LoRA (PEFT)  
**Capabilities:** Multi-task generalization & reasoning

---

## 🚀 What It Can Do

This multitask fine-tuned model handles a broad set of natural language and reasoning tasks, such as:

- ✉️ **Email & message writing** — generate clear, friendly, or professional communications.
- 📖 **Story & creative writing** — craft imaginative narratives, poems, and dialogues.
- 💬 **Conversational chat** — maintain coherent, context-aware conversations.
- 💡 **Explanations & tutoring** — explain technical or abstract topics simply.
- 🧩 **Reasoning & logic tasks** — provide step-by-step answers to analytical questions.
- 💻 **Code generation & explanation** — write and explain Python or general programming code.
- 🌍 **Translation & summarization** — translate between multiple languages or condense information.

The model’s multi-domain training (on datasets such as SmolTalk, everyday conversations, and reasoning-rich samples) makes it suitable for assistants, chatbots, content generators, and educational tools.
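A minimal inference sketch with 🤗 Transformers + PEFT is below. The adapter repo id is a placeholder (this card does not state the adapter's Hub path), and the `build_prompt` helper is an illustrative function that mirrors the `### Task / ### Input / ### Output` format used in the training examples:

```python
def build_prompt(task: str, user_input: str) -> str:
    """Format a request in the '### Task / ### Input / ### Output' style."""
    return f"### Task: {task}\n\n### Input:\n{user_input}\n\n### Output:\n"


def main() -> None:
    # Heavy imports kept local so the prompt helper is usable without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
    adapter_id = "your-username/Deepseek-R1-multitask-lora"  # placeholder repo id

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    model = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.float16, device_map="auto"
    )
    # Attach the LoRA adapter on top of the frozen base model.
    model = PeftModel.from_pretrained(model, adapter_id)

    prompt = build_prompt("Write an email", "Invite the team to Friday's planning meeting.")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))


# Calling main() downloads the 1.5B base model and generates a reply:
# main()
```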

---
|
## 🧩 Training Details

| Parameter | Value |
|-----------|-------|
| Base model | `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` |
| Adapter | LoRA (r=8, alpha=32, dropout=0.1) |
| Max sequence length | 1024 |
| Learning rate | 3e-5 (cosine decay) |
| Optimizer | `adamw_8bit` |
| Gradient accumulation | 4 |
| Precision | 4-bit quantized, FP16 compute |
| Steps | 12k total (best checkpoint at ~8.2k) |
| Training time | ~2.5 h on an A4000 |
| Frameworks | 🤗 Transformers, PEFT, TRL, BitsAndBytes |
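For reference, the adapter hyperparameters in the table correspond to a `peft.LoraConfig` along these lines. The `target_modules` list is an assumption (the card does not state which projections were adapted); it shows the attention projections commonly targeted in Qwen-style models:

```python
from peft import LoraConfig

# r, lora_alpha, and lora_dropout taken from the training table above.
# target_modules is an assumption, not confirmed by this card.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```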

---

## 🧠 Reasoning Capability

Thanks to the integration of **SmolTalk** and diverse multi-task prompts, the model learns:

- **Chain-of-thought style reasoning**
- **Conversational grounding**
- **Multi-step logical inference**
- **Instruction following** across domains

Example:
```text
### Task: Explain reasoning

### Input:
If a train leaves City A at 3 PM and arrives at City B at 6 PM, covering 180 km, what is its average speed?

### Output:
The train travels 180 km in 3 hours.
Average speed = 180 ÷ 3 = 60 km/h.
```