---
license: apache-2.0
base_model:
- Qwen/Qwen3-0.6B
library_name: transformers
tags:
- unsloth
- reasoning
- code
- chain-of-thought
- text-generation
- shadow
- conversational
datasets:
- unsloth/gsm8k
- deepseek-ai/DeepSeek-R1
pipeline_tag: text-generation
---

# 🌑 Shadow 0.7B (Reasoning + Coding Edition)

**Shadow 0.7B** is a specialized Small Language Model (SLM) optimized for **logical reasoning, competitive programming, and chain-of-thought processing**. Built on the **Qwen3 0.6B** architecture and fine-tuned using **Unsloth**, Shadow delivers surprising reasoning depth and "thinking-first" responses uncommon for a model of this size.

---

## Key Features

* 🧠 **Structured Reasoning:** Uses `<think>`-style internal reasoning patterns to improve answer quality (see the output-handling sketch at the end of this card).
* 💻 **Coding Specialist:** Excels at Python, C++, and algorithmic problem-solving.
* ⚡ **Ultra-Lightweight:** Runs on CPU, T4, mobile, or even low-VRAM consumer GPUs.

---

## 💻 Quick Start (Python)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Redhanuman/Shadow-0.7B"

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Build a chat-formatted prompt
prompt = "Write a Python script to check for palindromes. Explain your logic."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate and decode the response
generated_ids = model.generate(
    **inputs,
    max_new_tokens=1024
)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```

## 🛠️ Training Details

- **Creator:** Aman Kumar Pandey (LPU)
- **Framework:** Unsloth (2× faster training)
- **Base Model:** Qwen3-0.6B
- **Method:** QLoRA fine-tuning with Chain-of-Draft (CoD) reasoning data (a reproduction sketch follows below)
- **Datasets:** GSM8K, DeepSeek R1 distilled reasoning samples
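
## Fine-tuning Sketch (Unsloth + QLoRA)

The full training recipe is not published, so the snippet below is only a minimal sketch of how a run like the one described above is typically set up with Unsloth: load the base model in 4-bit (the "Q" in QLoRA), attach LoRA adapters, and run supervised fine-tuning with `trl`'s `SFTTrainer`. The LoRA rank, target modules, hyperparameters, the `openai/gsm8k` dataset path (used here instead of the `unsloth/gsm8k` mirror listed in the metadata), and the `to_text` formatting helper are all illustrative assumptions, not Shadow's released configuration, and exact `SFTTrainer` argument names vary across `trl` versions.

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the base model with 4-bit quantized weights (QLoRA setup).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-0.6B",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters. Rank and target modules are illustrative defaults.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Map GSM8K question/answer pairs into chat-template strings under a "text" column.
def to_text(example):
    messages = [
        {"role": "user", "content": example["question"]},
        {"role": "assistant", "content": example["answer"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = load_dataset("openai/gsm8k", "main", split="train").map(to_text)

# Supervised fine-tuning; newer trl versions expect these fields via SFTConfig instead.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        num_train_epochs=1,
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()
```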
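
## Handling the Reasoning Output

Because Shadow is tuned for thinking-first responses, a generation may contain an internal reasoning block before the final answer. The helper below is a minimal sketch, assuming the reasoning is wrapped in Qwen3-style `<think>...</think>` tags; that tag format, and the `split_reasoning` helper itself, are assumptions for illustration rather than part of the model's tooling.

```python
import re


def split_reasoning(response: str) -> tuple[str, str]:
    """Split a decoded response into (reasoning, final_answer).

    Assumes the chain of thought is wrapped in <think>...</think> tags;
    if no tags are found, the whole response is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", response, flags=re.DOTALL)
    if not match:
        return "", response.strip()
    reasoning = match.group(1).strip()
    answer = (response[:match.start()] + response[match.end():]).strip()
    return reasoning, answer


if __name__ == "__main__":
    # Stand-in for the decoded text produced by the Quick Start snippet.
    demo = ("<think>A palindrome reads the same backwards, "
            "so compare s to s[::-1].</think>Use `s == s[::-1]`.")
    reasoning, answer = split_reasoning(demo)
    print("Reasoning:", reasoning)
    print("Answer:", answer)
```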