---
license: apache-2.0
base_model: Nebulixlabs/Nutral-Base
tags:
- text-generation
- custom-architecture
- qwen
- instruct
- reasoning
- chain-of-thought
- peft
- lora
- alpaca
language:
- en
pipeline_tag: text-generation
---

# 🧠 Nutral Reasoning Instruct

**Nutral Reasoning Instruct** is a lightweight instruction-tuned language model that produces structured **Chain-of-Thought (CoT) reasoning**. Built on the custom pre-trained **Nutral Base**, it was trained with Supervised Fine-Tuning (SFT) to follow user instructions and to write out an explicit reasoning trace before giving the final answer.

The model emits its reasoning steps inside `<think>` and `</think>` blocks, which keeps the reasoning trace separate from the final answer and easy to parse.
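
Because the reasoning is delimited by fixed tags, it can be separated from the answer with a few lines of post-processing. The following is a minimal sketch, not part of the released code; the helper name `split_reasoning` is hypothetical, and it assumes the completion contains at most one `<think>` block.

```python
import re

# Matches the first <think>...</think> block; DOTALL lets the trace span lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(completion: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a raw completion.

    If no complete <think> block is found, the reasoning is empty and the
    whole completion is treated as the answer.
    """
    # Drop the ChatML end-of-turn marker if the decoder left it in.
    text = completion.replace("<|im_end|>", "")
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


sample = (
    "<think>\n[Phase 1: Intent] Task classified as 'Creative Generation'.\n</think>\n"
    "The silver orb in the velvet night,<|im_end|>"
)
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

Falling back to "everything is the answer" when the block is missing or unclosed is a design choice for this sketch; a stricter caller might prefer to raise an error instead.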

---

## 📌 Model Details

* **Base Architecture:** **Qwen2** (`Qwen2ForCausalLM`)
* **Training Type:** Supervised Fine-Tuning (SFT) with LoRA (Merged & Unloaded)
* **Natural Language:** English (`en`)
* **Programming Language:** Python
* **Primary Task:** Instruction Following & Analytical Reasoning
* **Format Supported:** ChatML + Explicit `<think>` blocks

---

## 📊 Architecture & Parameters

The core architecture is identical to that of the Nutral Base model. During Phase 2, LoRA adapters were trained and then **permanently merged** into the base weights, so inference carries no extra adapter overhead.

| Hyperparameter | Configuration Value |
| :--- | :--- |
| **Total Parameters** | **~17.5 Million (17,498,368)** |
| **Embedding Dimension** | 512 |
| **Number of Layers** | 8 |
| **Attention Heads** | 8 |
| **Context Window** | 256 tokens |
| **LoRA Configuration** | `r=8`, `alpha=16`, `dropout=0.05` |
| **Target Modules** | `q_proj`, `v_proj` |
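
To give a sense of scale, the number of LoRA parameters trained under this configuration can be estimated from the table. This is a rough sketch under one stated assumption: that `q_proj` and `v_proj` are both square (512 to 512), which is true only if the model uses as many key/value heads as attention heads. The card does not state the key/value head count, so treat the result as an upper-bound estimate.

```python
# Values taken from the table above.
hidden_size = 512
num_layers = 8
lora_r = 8
lora_alpha = 16
target_modules = ["q_proj", "v_proj"]
total_params = 17_498_368

# ASSUMPTION: both target projections map hidden_size -> hidden_size.
# With grouped-query attention, v_proj would be narrower and the count smaller.
in_features = out_features = hidden_size

# Each adapted linear layer gains two low-rank factors:
# A with shape (r, in_features) and B with shape (out_features, r).
params_per_module = lora_r * (in_features + out_features)
lora_params = params_per_module * len(target_modules) * num_layers

# Merging folds the update into the weight: W' = W + (alpha / r) * B @ A,
# so the merged model keeps the base parameter count.
scaling = lora_alpha / lora_r

print(lora_params)                            # 131072
print(scaling)                                # 2.0
print(f"{lora_params / total_params:.2%}")    # share of the full model
```

Under that assumption the adapters amount to well under 1% of the model's weights, which is why the merged checkpoint has the same size as the base model.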

---

## 🛠️ Fine-Tuning Dataset & SFT Strategy

The model was fine-tuned on synthetically generated reasoning traces. The training text was formatted directly in ChatML, which works around limitations of the standard TRL library and keeps the training data consistent with the prompt format used at inference.

* **Dataset Name:** `tatsu-lab/alpaca` (a curated 2,500-sample subset of the train split)
* **Reasoning Injection:** Each instruction was dynamically categorized (e.g., Analytical Reasoning, Creative Generation, Instructional Breakdown) to synthetically generate a multi-phase thought process (Intent, Retrieval, Logic, Output).
* **Objective:** Causal Language Modeling applied to structured instruction-response pairs.
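
The reasoning injection described above can be sketched roughly as follows. This is an illustration, not the author's training script: the keyword rules in `CATEGORY_KEYWORDS`, the fallback category, and the function names are hypothetical, and the 30-character preview length is inferred from the sample trace in the prompt-format section. The phase wording and the system prompt are copied from that sample.

```python
import re

# HYPOTHETICAL keyword rules; the actual categorizer is not published in this card.
CATEGORY_KEYWORDS = {
    "Creative Generation": ("write", "poem", "story", "compose"),
    "Analytical Reasoning": ("why", "compare", "analyze", "calculate"),
    "Instructional Breakdown": ("how", "steps", "explain", "describe"),
}
DEFAULT_CATEGORY = "Instructional Breakdown"  # assumed fallback

SYSTEM_PROMPT = (
    "You are Nutral_Qwen, a highly intelligent AI. "
    "Always reason your thoughts inside <think> and </think> blocks."
)


def classify(instruction: str) -> str:
    """Pick the first category with a keyword present in the instruction."""
    tokens = set(re.findall(r"[a-z]+", instruction.lower()))
    for category, keywords in CATEGORY_KEYWORDS.items():
        if tokens.intersection(keywords):
            return category
    return DEFAULT_CATEGORY


def build_think(instruction: str, category: str, preview_len: int = 30) -> str:
    """Build the four-phase trace (Intent, Retrieval, Logic, Output)."""
    preview = instruction
    if len(instruction) > preview_len:
        preview = instruction[:preview_len] + "..."
    return "\n".join([
        f"[Phase 1: Intent] Task classified as '{category}'. Analyzing: '{preview}'",
        "[Phase 2: Retrieval] Gathering key facts and constraints.",
        "[Phase 3: Logic] Formulating step-by-step response.",
        "[Phase 4: Output] Structuring final answer.",
    ])


def format_example(instruction: str, response: str) -> str:
    """Render one instruction-response pair as ChatML training text."""
    think = build_think(instruction, classify(instruction))
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{instruction}<|im_end|>\n"
        f"<|im_start|>assistant\n<think>\n{think}\n</think>\n{response}<|im_end|>"
    )


print(format_example("Write a short poem about the moon.", "The silver orb..."))
```

Note that a trace built this way is a fixed template filled in per instruction, so the model learns the four-phase layout rather than task-specific reasoning content.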

---

## ⚙️ Hardware & SFT Infrastructure

The Instruct phase used Parameter-Efficient Fine-Tuning (PEFT) on Kaggle's multi-GPU infrastructure to keep VRAM usage low:

* **Hardware Used:** **2x NVIDIA T4 Tensor Core GPUs**
* **Precision Mode:** FP16 (Mixed Precision)
* **Optimizer Setup:** AdamW with a learning rate of `3e-4`
* **Batching:** Per-device batch size of 8 with 4 gradient accumulation steps.
* **Epochs:** 1
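
The batching settings above imply an effective batch size and a short training run. The arithmetic below is a sketch under two assumptions that the card does not state explicitly: that training was data-parallel across both GPUs (each GPU processing its own batch of 8), and that all 2,500 samples were used for training with no held-out split.

```python
import math

per_device_batch = 8
grad_accum_steps = 4
num_gpus = 2           # 2x NVIDIA T4
num_samples = 2_500    # ASSUMPTION: the full subset is used for training
epochs = 1

# ASSUMPTION: data-parallel training, so every GPU contributes a batch
# to each accumulation step.
effective_batch = per_device_batch * num_gpus * grad_accum_steps

# Upper estimate; how a trainer treats the final partial batch can shift
# the exact step count by one.
steps_per_epoch = math.ceil(num_samples / effective_batch)
total_steps = steps_per_epoch * epochs

print(effective_batch)   # 64
print(total_steps)       # 40
```

With only about 40 optimizer updates, the run is short, which is consistent with a small LoRA adaptation on top of an already pre-trained base.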

---

## 📦 Core Technical Libraries Used

* **`transformers`** - Core model loading, ChatML formatting, and primary training loop (`Trainer`).
* **`peft`** - Applied Low-Rank Adaptation (LoRA) to train only the `q_proj` and `v_proj` attention weights, which limits the risk of catastrophically forgetting base knowledge.
* **`datasets`** - Used to fetch and process the Hugging Face Alpaca instruction dataset.
* **`llama.cpp`** - Used after training to convert the FP16 PyTorch weights into **GGUF** files for edge-device deployment.

---

## 💬 Prompt Format (Crucial for Reasoning)

To use the reasoning capabilities correctly, you **must** use the ChatML format. The model is trained to expect `<|im_start|>system`, `<|im_start|>user`, and `<|im_start|>assistant` turn markers, with each turn closed by `<|im_end|>`.
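
A prompt in this format can be assembled with plain string formatting. The sketch below is one way to do it; `build_prompt` is a hypothetical helper, and the default system prompt is copied from the example in this section. If the tokenizer shipped with the model defines a chat template, `tokenizer.apply_chat_template` from `transformers` would be the more robust route, but this card does not say whether it does.

```python
SYSTEM_PROMPT = (
    "You are Nutral_Qwen, a highly intelligent AI. "
    "Always reason your thoughts inside <think> and </think> blocks."
)


def build_prompt(user_message: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Build a single-turn ChatML prompt that ends with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_prompt("Write a short poem about the moon.")
print(prompt)
```

The string ends with the open assistant turn, so generation continues from there, beginning with the `<think>` block. Keep prompts short: the 256-token context window has to hold the prompt, the reasoning trace, and the answer.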

```text
<|im_start|>system
You are Nutral_Qwen, a highly intelligent AI. Always reason your thoughts inside <think> and </think> blocks.<|im_end|>
<|im_start|>user
Write a short poem about the moon.<|im_end|>
<|im_start|>assistant
<think>
[Phase 1: Intent] Task classified as 'Creative Generation'. Analyzing: 'Write a short poem about the m...'
[Phase 2: Retrieval] Gathering key facts and constraints.
[Phase 3: Logic] Formulating step-by-step response.
[Phase 4: Output] Structuring final answer.
</think>
The silver orb in the velvet night,
Casting down its gentle light...<|im_end|>