---
license: apache-2.0
base_model: Nebulixlabs/Nutral-Base
tags:
- text-generation
- custom-architecture
- qwen
- instruct
- reasoning
- chain-of-thought
- peft
- lora
- alpaca
language:
- en
pipeline_tag: text-generation
---
# Nutral Reasoning Instruct

**Nutral Reasoning Instruct** is a lightweight instruction-tuned language model that produces structured **Chain-of-Thought (CoT) reasoning**. It is built on the custom pre-trained **Nutral Base** model and was trained with Supervised Fine-Tuning (SFT) to follow user instructions and to write out its analytical steps before giving the final answer.

The model emits its reasoning between `<think>` and `</think>` tags, so the steps leading to the answer are visible and consistently structured.

---
## Model Details

* **Base Architecture:** **Qwen2** (`Qwen2ForCausalLM`)
* **Training Type:** Supervised Fine-Tuning (SFT) with LoRA (adapters merged and unloaded)
* **Natural Language:** English (`en`)
* **Programming Language:** Python
* **Primary Task:** Instruction Following & Analytical Reasoning
* **Supported Format:** ChatML + explicit `<think>` blocks
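
Because the reasoning sits inside explicit `<think>` blocks, downstream code will usually want to separate it from the final answer. The following is a minimal sketch of one way to do that with the standard library; the helper name `split_reasoning` is hypothetical and not part of the model's tooling. With a 256-token context window a completion may be cut off before `</think>`, in which case this sketch treats the whole text as the answer.

```python
import re

# Matches the first <think>...</think> block; DOTALL lets the reasoning span lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a raw completion.

    If no complete <think> block is present, reasoning is "" and the whole
    text is treated as the answer.
    """
    # Drop the ChatML end-of-turn marker if the decoder kept it.
    text = text.replace("<|im_end|>", "")
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


raw = (
    "<think>\n"
    "[Phase 1: Intent] Task classified as 'Creative Generation'.\n"
    "[Phase 4: Output] Structuring final answer.\n"
    "</think>\n"
    "The silver orb in the velvet night,<|im_end|>"
)
reasoning, answer = split_reasoning(raw)
print(answer)
```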
---

## Architecture & Parameters

The architecture is identical to the lightweight Nutral Base model. During Phase 2 (the SFT stage), LoRA adapters were trained and then **permanently merged** into the base weights, so inference adds no adapter overhead.

| Hyperparameter | Configuration Value |
| :--- | :--- |
| **Total Parameters** | **~17.5 Million (17,498,368)** |
| **Embedding Dimension** | 512 |
| **Number of Layers** | 8 |
| **Attention Heads** | 8 |
| **Context Window** | 256 tokens |
| **LoRA Configuration** | `r=8`, `alpha=16`, `dropout=0.05` |
| **Target Modules** | `q_proj`, `v_proj` |
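
To put the LoRA settings in the table in perspective, the sketch below estimates how many adapter weights were trained. It rests on an assumption the model card does not state: that the model uses no grouped-query attention, so `v_proj` has the same 512 x 512 shape as `q_proj`. If the key/value projections are narrower, the real count is lower.

```python
# Values taken from the hyperparameter table.
hidden_size = 512
num_layers = 8
num_heads = 8
lora_rank = 8
lora_alpha = 16
target_modules = ["q_proj", "v_proj"]
total_params = 17_498_368

# ASSUMPTION (not stated in the model card): no grouped-query attention, so
# v_proj maps hidden_size -> hidden_size just like q_proj.
in_features = out_features = hidden_size

# LoRA adds two matrices per adapted module:
# A with shape (rank, in_features) and B with shape (out_features, rank).
params_per_module = lora_rank * (in_features + out_features)
trainable = params_per_module * len(target_modules) * num_layers

print(f"LoRA parameters per module: {params_per_module:,}")
print(f"LoRA trainable parameters:  {trainable:,}")
print(f"Share of total parameters:  {trainable / total_params:.2%}")
print(f"LoRA scaling (alpha / r):   {lora_alpha / lora_rank}")
print(f"Attention head dimension:   {hidden_size // num_heads}")
```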
---

## Fine-Tuning Dataset & SFT Strategy

The model was fine-tuned on reasoning traces that were generated synthetically on the fly. This approach was chosen to work around limitations of the standard TRL library and to keep the training text aligned with the ChatML format.

* **Dataset Name:** `tatsu-lab/alpaca` (subset of the train split: 2,500 curated samples)
* **Reasoning Injection:** Each instruction was assigned a category (e.g., Analytical Reasoning, Creative Generation, Instructional Breakdown), and a four-phase thought process (Intent, Retrieval, Logic, Output) was generated synthetically for it.
* **Objective:** Causal Language Modeling applied to structured instruction-response pairs.
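
The reasoning-injection step can be sketched roughly as follows. The trace layout mirrors the sample output in the prompt format example, including the 30-character preview of the instruction. The keyword rules in `CATEGORY_KEYWORDS` and both function names are hypothetical: the model card names the categories but does not say how instructions were assigned to them.

```python
# HYPOTHETICAL keyword rules: the model card lists the categories but not the
# logic used to pick one, so these are illustrative only.
CATEGORY_KEYWORDS = {
    "Creative Generation": ("write", "poem", "story", "compose"),
    "Instructional Breakdown": ("how to", "steps", "explain", "describe"),
}
DEFAULT_CATEGORY = "Analytical Reasoning"


def classify(instruction: str) -> str:
    """Pick the first category whose keyword occurs in the instruction."""
    lowered = instruction.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return DEFAULT_CATEGORY


def build_think_block(instruction: str, preview_chars: int = 30) -> str:
    """Render the four-phase trace in the layout of the card's sample output."""
    preview = instruction[:preview_chars]
    if len(instruction) > preview_chars:
        preview += "..."
    category = classify(instruction)
    return "\n".join([
        "<think>",
        f"[Phase 1: Intent] Task classified as '{category}'. Analyzing: '{preview}'",
        "[Phase 2: Retrieval] Gathering key facts and constraints.",
        "[Phase 3: Logic] Formulating step-by-step response.",
        "[Phase 4: Output] Structuring final answer.",
        "</think>",
    ])


print(build_think_block("Write a short poem about the moon."))
```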
---

## Hardware & SFT Infrastructure

The instruct phase used Parameter-Efficient Fine-Tuning (PEFT) on Kaggle's multi-GPU infrastructure to reduce VRAM usage:

* **Hardware Used:** **2x NVIDIA T4 Tensor Core GPUs**
* **Precision Mode:** FP16 (Mixed Precision)
* **Optimizer Setup:** AdamW with a learning rate of `3e-4`
* **Batching:** Per-device batch size of 8 with 4 gradient accumulation steps
* **Epochs:** 1
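
The batching settings above imply an effective batch size and an approximate number of optimizer steps for the 2,500-sample subset. The arithmetic below is a rough sketch that assumes both T4 GPUs were used in a data-parallel setup, which the model card implies but does not state. The step count is rounded up; the exact figure depends on how the training loop handles the last partial batch.

```python
import math

# Values taken from the model card.
per_device_batch_size = 8
gradient_accumulation_steps = 4
num_samples = 2_500
num_epochs = 1

# ASSUMPTION: both T4 GPUs process separate batches (data parallelism).
num_gpus = 2

effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus

# Rounded up so that a final partial batch counts as one more step.
steps_per_epoch = math.ceil(num_samples / effective_batch_size)
total_steps = steps_per_epoch * num_epochs

print(f"Effective batch size: {effective_batch_size}")
print(f"Approximate optimizer steps: {total_steps}")
```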
---

## Core Technical Libraries Used

* **`transformers`** - Model loading, ChatML formatting, and the main training loop (`Trainer`).
* **`peft`** - Applies Low-Rank Adaptation (LoRA) so that only selected attention projections are trained, which reduces the risk of catastrophically forgetting base knowledge.
* **`datasets`** - Fetches and processes the Alpaca instruction dataset from the Hugging Face Hub.
* **`llama.cpp`** - Used after training to convert the FP16 PyTorch weights into **GGUF** files for edge-device deployment.

---
## Prompt Format (Crucial for Reasoning)

To use the reasoning capabilities correctly, you **must** use the ChatML format. The model is trained to expect `<|im_start|>system`, `<|im_start|>user`, and `<|im_start|>assistant` tags, with each turn closed by `<|im_end|>`.
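
A prompt in this layout can be assembled with plain string formatting. The sketch below is illustrative only: `build_chatml_prompt` is a hypothetical helper, and the system prompt is copied from this card's example. It leaves the assistant turn open so that the model generates the `<think>` block and the answer itself.

```python
# System prompt as used in this model card's example.
SYSTEM_PROMPT = (
    "You are Nutral_Qwen, a highly intelligent AI. "
    "Always reason your thoughts inside <think> and </think> blocks."
)


def build_chatml_prompt(user_message: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Return a ChatML prompt that ends with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt("Write a short poem about the moon.")
print(prompt)
```

Keep in mind that the prompt and the completion together have to fit in the 256-token context window, so short user messages leave more room for the reasoning trace.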
| ```text | |
| <|im_start|>system | |
| You are Nutral_Qwen, a highly intelligent AI. Always reason your thoughts inside <think> and </think> blocks.<|im_end|> | |
| <|im_start|>user | |
| Write a short poem about the moon.<|im_end|> | |
| <|im_start|>assistant | |
| <think> | |
| [Phase 1: Intent] Task classified as 'Creative Generation'. Analyzing: 'Write a short poem about the m...' | |
| [Phase 2: Retrieval] Gathering key facts and constraints. | |
| [Phase 3: Logic] Formulating step-by-step response. | |
| [Phase 4: Output] Structuring final answer. | |
| </think> | |
| The silver orb in the velvet night, | |
| Casting down its gentle light...<|im_end|> |