---
license: apache-2.0
base_model: Nebulixlabs/Nutral-Base
tags:
- text-generation
- custom-architecture
- qwen
- instruct
- reasoning
- chain-of-thought
- peft
- lora
- alpaca
language:
- en
pipeline_tag: text-generation
---
# 🧠 Nutral Reasoning Instruct
**Nutral Reasoning Instruct** is a lightweight, instruction-tuned language model that produces structured **Chain-of-Thought (CoT) Reasoning**. Built on top of the custom pre-trained **Nutral Base**, this model has undergone Supervised Fine-Tuning (SFT) to follow user instructions and to write out an explicit reasoning trace before giving the final answer.
The model emits its reasoning steps inside a dedicated reasoning block, delimited by opening and closing tags, so its decision process is visible and consistently structured.
---
## 📌 Model Details
* **Base Architecture:** **Qwen2** (`Qwen2ForCausalLM`)
* **Training Type:** Supervised Fine-Tuning (SFT) with LoRA (Merged & Unloaded)
* **Natural Language:** English (`en`)
* **Programming Language:** Python
* **Primary Task:** Instruction Following & Analytical Reasoning
* **Format Supported:** ChatML with explicit reasoning blocks
---
## 📊 Architecture & Parameters
The core architecture is identical to that of the lightweight Nutral Base model. During Phase 2, LoRA adapters were trained and then **permanently merged** into the base weights, so inference carries no extra adapter overhead.
| Hyperparameter | Configuration Value |
| :--- | :--- |
| **Total Parameters** | **~17.5 Million (17,498,368)** |
| **Embedding Dimension** | 512 |
| **Number of Layers** | 8 |
| **Attention Heads** | 8 |
| **Context Window** | 256 tokens |
| **LoRA Configuration** | `r=8`, `alpha=16`, `dropout=0.05` |
| **Target Modules** | `q_proj`, `v_proj` |
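The LoRA settings in the table can be turned into a rough adapter-size estimate and a toy illustration of what merging does. This is a minimal sketch, not the author's training code. It assumes that both `q_proj` and `v_proj` map 512 to 512 features; Qwen2 models often use grouped-query attention, in which case `v_proj` would be narrower and the count smaller. The 2x2 matrices at the end are purely illustrative. Under these assumptions the adapters hold 131,072 trainable parameters, roughly 0.75% of the 17,498,368 total, and merging computes `W + (alpha / r) * B @ A` once so that no adapter remains at inference time.

```python
# Values taken from the table above.
HIDDEN = 512
LAYERS = 8
RANK = 8
ALPHA = 16
TARGETS = ("q_proj", "v_proj")

# Assumption: both target projections are 512 -> 512 (no grouped-query attention).
IN_FEATURES = OUT_FEATURES = HIDDEN


def lora_params_per_module(in_features, out_features, rank):
    # LoRA adds A (rank x in_features) and B (out_features x rank) per module.
    return rank * in_features + out_features * rank


adapter_params = LAYERS * len(TARGETS) * lora_params_per_module(
    IN_FEATURES, OUT_FEATURES, RANK
)
scaling = ALPHA / RANK  # factor applied to B @ A


def merge(weight, lora_a, lora_b, scale):
    """Return W + scale * (B @ A) using plain nested lists."""
    rank = len(lora_a)
    return [
        [
            weight[i][j]
            + scale * sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
            for j in range(len(weight[0]))
        ]
        for i in range(len(weight))
    ]


# Toy rank-1 example (illustrative values only).
toy_merged = merge([[1, 0], [0, 1]], [[1, 2]], [[1], [0]], scaling)

print(adapter_params)  # 131072
print(scaling)         # 2.0
print(toy_merged)      # [[3.0, 4.0], [0.0, 1.0]]
```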
---
## 🛠️ Fine-Tuning Dataset & SFT Strategy
The model was fine-tuned on reasoning traces that were generated synthetically at training time. The data pipeline was built outside the TRL library to keep full control over the ChatML formatting of each sample.
* **Dataset Name:** `tatsu-lab/alpaca` (2,500-sample curated subset of the train split)
* **Reasoning Injection:** Each instruction was dynamically categorized (e.g., Analytical Reasoning, Creative Generation, Instructional Breakdown) to synthetically generate a multi-phase thought process (Intent, Retrieval, Logic, Output).
* **Objective:** Causal Language Modeling applied to structured instruction-response pairs.
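The reasoning-injection step above can be sketched as a small template function. This is a hedged illustration only: the phase wording and the 30-character preview are inferred from the sample output in the prompt format section of this card, while `classify_instruction`, its keyword rules, and `build_reasoning_trace` are hypothetical names and logic, since the actual categorization rules are not documented here.

```python
def classify_instruction(instruction: str) -> str:
    """Hypothetical keyword rules; the real categorization is not documented."""
    text = instruction.lower()
    if any(word in text for word in ("poem", "story", "song")):
        return "Creative Generation"
    if any(phrase in text for phrase in ("how to", "steps", "explain how")):
        return "Instructional Breakdown"
    return "Analytical Reasoning"


def build_reasoning_trace(instruction: str, preview_chars: int = 30) -> str:
    """Build the four-phase trace (Intent, Retrieval, Logic, Output)."""
    category = classify_instruction(instruction)
    preview = instruction[:preview_chars]  # truncated echo of the instruction
    return "\n".join(
        [
            f"[Phase 1: Intent] Task classified as '{category}'. "
            f"Analyzing: '{preview}...'",
            "[Phase 2: Retrieval] Gathering key facts and constraints.",
            "[Phase 3: Logic] Formulating step-by-step response.",
            "[Phase 4: Output] Structuring final answer.",
        ]
    )


print(build_reasoning_trace("Write a short poem about the moon."))
```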
---
## ⚙️ Hardware & SFT Infrastructure
The Instruct phase used Parameter-Efficient Fine-Tuning (PEFT) on Kaggle's multi-GPU infrastructure to keep VRAM usage low:
* **Hardware Used:** **2x NVIDIA T4 Tensor Core GPUs**
* **Precision Mode:** FP16 (Mixed Precision)
* **Optimizer Setup:** AdamW with a learning rate of `3e-4`
* **Batching:** Per-device batch size of 8 with 4 gradient accumulation steps.
* **Epochs:** 1
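The batching settings above imply an effective batch size and an approximate number of optimizer updates. The arithmetic below is a sketch that assumes both GPUs take part in data-parallel training and that all 2,500 samples are used for training; the exact update count depends on how the trainer handles the final partial batch, so the figure is an upper estimate.

```python
import math

# Values from the list above and from the dataset section.
NUM_GPUS = 2
PER_DEVICE_BATCH = 8
GRAD_ACCUM_STEPS = 4
NUM_SAMPLES = 2500
EPOCHS = 1

# Samples consumed per optimizer update across all devices.
effective_batch = NUM_GPUS * PER_DEVICE_BATCH * GRAD_ACCUM_STEPS

# Upper estimate of optimizer updates over the whole run.
approx_updates = math.ceil(NUM_SAMPLES / effective_batch) * EPOCHS

print(effective_batch)  # 64
print(approx_updates)   # 40
```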
---
## 📦 Core Technical Libraries Used
* **`transformers`** - Core model loading, ChatML formatting, and primary training loop (`Trainer`).
* **`peft`** - Applied Low-Rank Adaptation (LoRA) so that only selected attention weights were trained, which limits drift from the base model's knowledge.
* **`datasets`** - Used to fetch and process the Hugging Face Alpaca instruction dataset.
* **`llama.cpp`** - Used after training to convert the FP16 PyTorch weights into **GGUF** files for edge-device deployment.
---
## 💬 Prompt Format (Crucial for Reasoning)
To use the reasoning capabilities correctly, you **must** use the ChatML format. The model is trained to expect `<|im_start|>system`, `<|im_start|>user`, and `<|im_start|>assistant` tags, with each turn closed by `<|im_end|>`.
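The tag layout described above can be assembled with a small helper. This is a minimal sketch: `build_chatml_prompt` is a hypothetical name, not part of any library, and the system message is passed in by the caller rather than hard-coded. The prompt ends with an open assistant turn so that the model continues from there.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML prompt that ends with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt(
    system="You are Nutral_Qwen, a highly intelligent AI.",
    user="Write a short poem about the moon.",
)
print(prompt)
```

A full example of the raw prompt and the expected completion: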
```text
<|im_start|>system
You are Nutral_Qwen, a highly intelligent AI. Always reason your thoughts inside and blocks.<|im_end|>
<|im_start|>user
Write a short poem about the moon.<|im_end|>
<|im_start|>assistant
[Phase 1: Intent] Task classified as 'Creative Generation'. Analyzing: 'Write a short poem about the m...'
[Phase 2: Retrieval] Gathering key facts and constraints.
[Phase 3: Logic] Formulating step-by-step response.
[Phase 4: Output] Structuring final answer.
The silver orb in the velvet night,
Casting down its gentle light...<|im_end|>