---
language:
- en
license: apache-2.0
tags:
- text-generation
- instruction-tuning
- multi-task
- reasoning
- email
- summarization
- chat
- peft
- lora
- qwen
- deepseek
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
datasets:
- HuggingFaceTB/smoltalk
- snoop2head/enron_aeslc_emails
- lucadiliello/STORIES
- abisee/cnn_dailymail
- wiki40b
model_type: causal-lm
inference: true
library_name: peft
pipeline_tag: text-generation
---

# 🧠 Deepseek-R1-multitask-lora

**Author:** Gilbert Akham  
**License:** Apache-2.0  
**Base model:** [`deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)  
**Adapter type:** LoRA (PEFT)  
**Capabilities:** Multi-task generalization & reasoning

---

## 🚀 What It Can Do

This multitask fine-tuned model handles a broad set of natural language and reasoning tasks, such as:

- ✉️ **Email & message writing** — generate clear, friendly, or professional communications.
- 📖 **Story & creative writing** — craft imaginative narratives, poems, and dialogues.
- 💬 **Conversational chat** — maintain coherent, context-aware conversations.
- 💡 **Explanations & tutoring** — explain technical or abstract topics simply.
- 🧩 **Reasoning & logic tasks** — provide step-by-step answers to analytical questions.
- 💻 **Code generation & explanation** — write and explain Python or general programming code.
- 🌍 **Translation & summarization** — translate between multiple languages or condense information.

The model’s multi-domain training (on datasets such as SmolTalk, everyday conversations, and reasoning-rich samples) makes it suitable for assistants, chatbots, content generators, and educational tools.
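A minimal inference sketch with 🤗 Transformers + PEFT is below. The adapter repo id is a placeholder (this card does not state the adapter's Hub path), and the `build_prompt` helper is an illustrative function that mirrors the `### Task / ### Input / ### Output` format used in the training examples:

```python
def build_prompt(task: str, user_input: str) -> str:
    """Format a request in the '### Task / ### Input / ### Output' style."""
    return f"### Task: {task}\n\n### Input:\n{user_input}\n\n### Output:\n"


def main() -> None:
    # Heavy imports kept local so the prompt helper is usable without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
    adapter_id = "your-username/Deepseek-R1-multitask-lora"  # placeholder repo id

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    model = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.float16, device_map="auto"
    )
    # Attach the LoRA adapter on top of the frozen base model.
    model = PeftModel.from_pretrained(model, adapter_id)

    prompt = build_prompt("Write an email", "Invite the team to Friday's planning meeting.")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))


# Calling main() downloads the 1.5B base model and generates a reply:
# main()
```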

---
|
## 🧩 Training Details

| Parameter | Value |
|-----------|-------|
| Base model | `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` |
| Adapter | LoRA (r=8, alpha=32, dropout=0.1) |
| Max sequence length | 1024 |
| Learning rate | 3e-5 (cosine decay) |
| Optimizer | `adamw_8bit` |
| Gradient accumulation | 4 |
| Precision | 4-bit quantized, FP16 compute |
| Steps | 12k total (best checkpoint at ~8.2k) |
| Training time | ~2.5 h on an A4000 |
| Frameworks | 🤗 Transformers, PEFT, TRL, BitsAndBytes |
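For reference, the adapter hyperparameters in the table correspond to a `peft.LoraConfig` along these lines. The `target_modules` list is an assumption (the card does not state which projections were adapted); it shows the attention projections commonly targeted in Qwen-style models:

```python
from peft import LoraConfig

# r, lora_alpha, and lora_dropout taken from the training table above.
# target_modules is an assumption, not confirmed by this card.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```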

---

## 🧠 Reasoning Capability

Thanks to the integration of **SmolTalk** and diverse multi-task prompts, the model learns:

- **Chain-of-thought style reasoning**
- **Conversational grounding**
- **Multi-step logical inference**
- **Instruction following** across domains

Example:
```text
### Task: Explain reasoning

### Input:
If a train leaves City A at 3 PM and arrives at City B at 6 PM, covering 180 km, what is its average speed?

### Output:
The train travels 180 km in 3 hours.
Average speed = 180 ÷ 3 = 60 km/h.
```