---
license: apache-2.0
base_model: Nebulixlabs/Nutral-Base
tags:
- text-generation
- custom-architecture
- qwen
- instruct
- reasoning
- chain-of-thought
- peft
- lora
- alpaca
language:
- en
pipeline_tag: text-generation
---

# 🧠 Nutral Reasoning Instruct

**Nutral Reasoning Instruct** is a lightweight instruction-following language model that produces structured **Chain-of-Thought (CoT) Reasoning**. Built on top of the custom pre-trained **Nutral Base**, this model has undergone Supervised Fine-Tuning (SFT) to follow user instructions while explicitly generating its analytical thought process before providing the final answer.

The model emits its reasoning steps inside `` and `` blocks, so the path to each answer is visible and follows a consistent structure.

---

## 📌 Model Details

* **Base Architecture:** **Qwen2** (`Qwen2ForCausalLM`)
* **Training Type:** Supervised Fine-Tuning (SFT) with LoRA (Merged & Unloaded)
* **Natural Language:** English (`en`)
* **Programming Language:** Python
* **Primary Task:** Instruction Following & Analytical Reasoning
* **Format Supported:** ChatML + Explicit `` blocks

---

## 📊 Architecture & Parameters

The core architecture is identical to the lightweight Nutral Base model. During the Phase 2 fine-tuning stage, LoRA adapters were trained and then **permanently merged** into the base weights, so inference carries no adapter overhead.

| Hyperparameter | Configuration Value |
| :--- | :--- |
| **Total Parameters** | **~17.5 Million (17,498,368)** |
| **Embedding Dimension** | 512 |
| **Number of Layers** | 8 |
| **Attention Heads** | 8 |
| **Context Window** | 256 tokens |
| **LoRA Configuration** | `r=8`, `alpha=16`, `dropout=0.05` |
| **Target Modules** | `q_proj`, `v_proj` |

---

## 🛠️ Fine-Tuning Dataset & SFT Strategy

The model was fine-tuned on synthetic reasoning traces generated on the fly. This approach was chosen to work around limitations in the standard TRL library and to keep the training text aligned with the ChatML format.
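
As a rough sanity check on the LoRA settings in the table above, the sketch below estimates how many weights the adapters train. It assumes `q_proj` and `v_proj` are both square 512×512 projections in each of the 8 layers. The card does not state the key/value head count, so if the model uses grouped-query attention the `v_proj` output would be narrower and the count lower. The helper name `lora_param_count` is hypothetical.

```python
# Rough estimate of trainable LoRA weights from the table values.
# ASSUMPTION: q_proj and v_proj are both hidden_size x hidden_size
# (no grouped-query attention); the card does not confirm this.


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds A (r x d_in) and B (d_out x r) per adapted linear layer."""
    return r * d_in + d_out * r


HIDDEN_SIZE = 512          # embedding dimension from the table
NUM_LAYERS = 8             # number of layers from the table
LORA_RANK = 8              # r from the LoRA configuration
LORA_ALPHA = 16            # alpha from the LoRA configuration
TOTAL_PARAMS = 17_498_368  # total parameters from the table
TARGET_MODULES = ("q_proj", "v_proj")

per_module = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)
trainable = per_module * len(TARGET_MODULES) * NUM_LAYERS
scaling = LORA_ALPHA / LORA_RANK  # factor applied to the low-rank update

print(f"per adapted module: {per_module:,}")
print(f"trainable LoRA weights: {trainable:,}")
print(f"share of total: {trainable / TOTAL_PARAMS:.2%}")
print(f"LoRA scaling (alpha / r): {scaling}")
```

Under these assumptions the adapters hold 131,072 weights, which is under 1% of the full model, and the low-rank update is scaled by `alpha / r = 2`.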

* **Dataset Name:** `tatsu-lab/alpaca` (Train split subset: 2,500 curated samples)
* **Reasoning Injection:** Each instruction was dynamically categorized (e.g., Analytical Reasoning, Creative Generation, Instructional Breakdown) and used to synthetically generate a multi-phase thought process (Intent, Retrieval, Logic, Output).
* **Objective:** Causal Language Modeling applied to structured instruction-response pairs.

---

## ⚙️ Hardware & SFT Infrastructure

The Instruct phase used Parameter-Efficient Fine-Tuning (PEFT) on Kaggle's multi-GPU infrastructure to keep VRAM usage low:

* **Hardware Used:** **2x NVIDIA T4 Tensor Core GPUs**
* **Precision Mode:** FP16 (Mixed Precision)
* **Optimizer Setup:** AdamW with a learning rate of `3e-4`
* **Batching:** Per-device batch size of 8 with 4 gradient accumulation steps.
* **Epochs:** 1

---

## 📦 Core Technical Libraries Used

* **`transformers`** - Core model loading, ChatML formatting, and primary training loop (`Trainer`).
* **`peft`** - Applied Low-Rank Adaptation (LoRA) to train specific attention weights efficiently while limiting catastrophic forgetting of base knowledge.
* **`datasets`** - Used to fetch and process the Hugging Face Alpaca instruction dataset.
* **`llama.cpp`** - Used post-training to convert the FP16 PyTorch weights into **GGUF** files for edge-device deployment.

---

## 💬 Prompt Format (Crucial for Reasoning)

To use the reasoning capabilities correctly, you **must** use the ChatML format. The model is trained to expect `<|im_start|>system`, `<|im_start|>user`, and `<|im_start|>assistant` tags.

```text
<|im_start|>system
You are Nutral_Qwen, a highly intelligent AI. Always reason your thoughts inside and blocks.<|im_end|>
<|im_start|>user
Write a short poem about the moon.<|im_end|>
<|im_start|>assistant
[Phase 1: Intent] Task classified as 'Creative Generation'. Analyzing: 'Write a short poem about the m...'
[Phase 2: Retrieval] Gathering key facts and constraints.
[Phase 3: Logic] Formulating step-by-step response.
[Phase 4: Output] Structuring final answer.
The silver orb in the velvet night,
Casting down its gentle light...<|im_end|>