# Matrix 2
## Model Description
**Matrix 2** is a fine-tuned version of [DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B), trained on a focused mixture of chain-of-thought reasoning, math, coding, and logic data. It is the flagship reasoning model of the Inelly lineup -- built for deep, accurate, step-by-step problem solving.
- **Developed by:** Bry (GenueAI)
- **Base model:** DeepSeek-R1-Distill-Qwen-7B
- **Fine-tuning method:** QLoRA (4-bit NF4, rank 16)
- **Parameters:** 7.62B (base) + ~6.5M trainable (LoRA adapters)
- **License:** MIT (inherited from DeepSeek-R1)
---
## Intended Use
Matrix 2 is intended for:
- **Deep Chain-of-Thought reasoning** – Multi-step problem solving with clear logic
- **Mathematics** – Algebra, arithmetic, word problems, multi-step calculations
- **Code generation** – Python functions with proper logic and comments
- **Logical deduction** – Syllogisms, puzzles, transitive reasoning
- **Scientific explanations** – Physics, biology, general science
- **Complex instruction following** – Multi-part tasks requiring structured thinking
### Out of Scope
- Not intended for production deployment without further safety evaluation
- Safety alignment inherited from DeepSeek-R1 base; fine-tuning data did not include adversarial safety examples
- Larger memory footprint than 1.5B/3B variants (~5.2GB)
---
## Training Data
Matrix 2 was fine-tuned for 1 epoch on ~5,225 samples drawn from:
| Dataset | Samples | Purpose |
|---|---|---|
| [Bespoke-Stratos-35k](https://huggingface.co/datasets/bespokelabs/Bespoke-Stratos-35k) | 3,000 | Chain-of-thought math & reasoning |
| [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) | 2,500 | Code generation with reasoning |
| [dolphin-r1](https://huggingface.co/datasets/cognitivecomputations/dolphin-r1) | 2,000 | General reasoning (DeepSeek-R1 distill) |
All samples were deduplicated and reasoning-weighted (2x oversample for CoT examples). Maximum sequence length: 512 tokens.
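The deduplication and 2x oversampling step could look roughly like the sketch below. It is a minimal illustration, not the actual preprocessing script: the `prompt` and `is_cot` field names and the normalisation rule (lower-casing and collapsing whitespace) are assumptions made for the example.

```python
import hashlib

def dedup_and_oversample(samples, cot_factor=2):
    """Drop duplicate prompts, then repeat CoT samples cot_factor times."""
    seen = set()
    unique = []
    for s in samples:
        # Hash a lower-cased, whitespace-normalised prompt so trivial variants collapse.
        normalised = " ".join(s["prompt"].lower().split())
        key = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(s)
    weighted = []
    for s in unique:
        # "is_cot" is a hypothetical flag marking chain-of-thought examples.
        weighted.extend([s] * (cot_factor if s.get("is_cot") else 1))
    return weighted

samples = [
    {"prompt": "What is 2 + 2?", "is_cot": True},
    {"prompt": "what is  2 + 2?", "is_cot": True},   # duplicate after normalisation
    {"prompt": "Name a prime number.", "is_cot": False},
]
mixed = dedup_and_oversample(samples)
print(len(mixed))  # 3: the CoT sample counted twice, plus one non-CoT sample
```

In practice the list would usually be shuffled after oversampling so repeated examples do not end up in the same batch.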
---
## Training Hyperparameters
| Parameter | Value |
|---|---|
| Base model | DeepSeek-R1-Distill-Qwen-7B |
| Quantization | 4-bit NF4 (bitsandbytes) |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Learning rate | 2e-4 |
| Batch size | 8 (effective, via gradient accumulation) |
| Epochs | 1 |
| Max seq length | 512 |
| Optimizer | AdamW 8-bit |
| LR scheduler | cosine |
| Warmup ratio | 0.05 |
| Training time | ~74 min |
| Hardware | RTX 3090 (24GB VRAM) |
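To make the schedule concrete, the sketch below works out the number of optimizer steps and the learning rate over training for a cosine schedule with linear warmup. It assumes that 8 is the effective batch size, that ~5,225 samples are seen once, and that the learning rate decays to zero; the actual training script may differ in these details.

```python
import math

def cosine_lr(step, total_steps, base_lr=2e-4, warmup_ratio=0.05):
    """Learning rate at an optimizer step: linear warmup, then cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

num_samples = 5225      # approximate figure from the Training Data section
effective_batch = 8     # assumption: 8 is the effective batch size
total_steps = math.ceil(num_samples / effective_batch)
warmup_steps = math.ceil(total_steps * 0.05)

print(total_steps, warmup_steps)  # 654 33
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{cosine_lr(step, total_steps):.2e}")
```

Under these assumptions the peak learning rate of 2e-4 is reached after 33 warmup steps and decays over the remaining steps.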
---
## Model Architecture
| Property | Value |
|---|---|
| Model type | Qwen2ForCausalLM |
| Hidden size | 3,584 |
| Layers | 28 |
| Attention heads | 28 |
| Head dim | 128 |
| Intermediate size | 18,944 |
| Vocab size | 152,064 |
| Context length | 131,072 |
| Total parameters | ~7.62B |
| Trainable parameters | ~6.5M (LoRA) |
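The trainable-parameter count follows from the dimensions in the table: a rank-r LoRA adapter on a linear layer with `in` inputs and `out` outputs adds `r * (in + out)` weights. The sketch below applies that formula to a few candidate sets of target modules. The card does not state which modules were adapted, so the sets shown are illustrative only, and the 4 key/value heads are an assumption taken from the base model's grouped-query attention configuration.

```python
HIDDEN = 3584          # hidden size (from the table)
INTERMEDIATE = 18944   # MLP intermediate size (from the table)
LAYERS = 28
HEAD_DIM = 128
KV_HEADS = 4           # assumption: 4 key/value heads, as in the base model config
RANK = 16

KV_DIM = KV_HEADS * HEAD_DIM

# (in_features, out_features) of each linear projection in one decoder layer
PROJ_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_params(target_modules, rank=RANK, layers=LAYERS):
    """Total LoRA weights: rank * (in + out) per adapted layer, summed over all layers."""
    per_layer = sum(rank * sum(PROJ_SHAPES[name]) for name in target_modules)
    return per_layer * layers

for targets in (["q_proj", "v_proj"], ["q_proj", "o_proj"], list(PROJ_SHAPES)):
    print(targets, f"{lora_params(targets) / 1e6:.2f}M")
```

The result ranges from about 5M for two attention projections to about 40M when every linear layer is adapted, so the ~6.5M figure depends on the target-module choice.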
---
## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the weights in half precision and place them on the available device(s)
model = AutoModelForCausalLM.from_pretrained("path/to/matrix-2", torch_dtype=torch.float16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("path/to/matrix-2")

# Build the prompt with the model's chat template
messages = [{"role": "user", "content": "Solve for x: 3x + 7 = 22. Show all steps."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sample explicitly so that temperature and top_p take effect
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
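The base model family wraps its reasoning in `<think>` tags before the final answer. Assuming Matrix 2 keeps that output format, a small helper like the hypothetical `split_reasoning` below can separate the reasoning trace from the answer. It also handles the case where the opening tag was already part of the prompt, so only the closing tag appears in the response.

```python
def split_reasoning(response: str):
    """Return (reasoning, answer); reasoning is empty if no closing think tag is found."""
    marker = "</think>"
    if marker not in response:
        return "", response.strip()
    reasoning, answer = response.split(marker, 1)
    # The opening tag may be in the response or may already be part of the prompt.
    reasoning = reasoning.replace("<think>", "", 1).strip()
    return reasoning, answer.strip()

demo = "<think>3x = 22 - 7 = 15, so x = 5.</think>\nx = 5"
reasoning, answer = split_reasoning(demo)
print(answer)  # x = 5
```

If the response is cut off before the closing tag, the helper returns the whole text as the answer; raising `max_new_tokens` gives the model more room to finish its reasoning.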
---
## Performance
Informal GPU testing, with qualitative results for the six categories below:
| Category | Result |
|---|---|
| Chain-of-Thought reasoning | ✅ Excellent multi-step logic |
| Math | ✅ Accurate with detailed work shown |
| Code generation | ✅ Clean, well-commented Python |
| Logic puzzles | ✅ Thorough deductive reasoning |
| General knowledge | ✅ Accurate, detailed explanations |
| Complex reasoning | ✅ Handles multi-step word problems well |
---
## Inelly / GenueAI Model Family
| Model | Size | Focus |
|---|---|---|
| **Matrix 2** (this model) | 7B | Deep CoT reasoning, math, coding |
| Inelly 4.5 | 3B | Conversation + politeness + CoT |
| Inelly 4.5 Blaze | 1.5B | Fast reasoning + CoT |
---
## Limitations
- **Safety:** Inherited from DeepSeek-R1 base; not specifically safety-tuned. May occasionally follow harmful instructions.
- **Memory:** Requires ~5.2GB VRAM for inference when loaded in 4-bit; FP16 weights alone need about 15GB (7.62B parameters × 2 bytes)
- **Context length:** Fine-tuned on sequences of up to 512 tokens. The base model supports 128K tokens, but the fine-tuning did not cover longer contexts
- **Factual accuracy:** May hallucinate in specialized domains (law, medicine, finance)
- **Speed:** Slower than 1.5B/3B variants due to size
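As a rough check on memory requirements, weight storage can be estimated from the parameter count and the bytes per parameter at each precision. The sketch below covers weights only; the KV cache, activations, and framework overhead add to these numbers, and 4-bit loaders typically keep some layers (such as embeddings) in higher precision, so real 4-bit usage is higher than the raw estimate.

```python
PARAMS = 7.62e9  # total parameter count from the model card

BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16/bf16": 2.0,
    "int8": 1.0,
    "4-bit (NF4)": 0.5,
}

def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Weight storage only, in decimal GB; excludes KV cache, activations and overhead."""
    return params * bytes_per_param / 1e9

for name, width in BYTES_PER_PARAM.items():
    print(f"{name:>12}: {weight_memory_gb(PARAMS, width):.2f} GB")
```

By this estimate, half-precision weights take about 15 GB, while a footprint near 5 GB is in the range expected for 4-bit loading.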
---
## Acknowledgments
- [DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) by DeepSeek AI (base model)
- [Bespoke Labs](https://huggingface.co/bespokelabs) for the Bespoke-Stratos dataset
- [OpenThoughts](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) team
- [Cognitive Computations](https://huggingface.co/cognitivecomputations) for dolphin-r1
---
## Citation
```
@misc{matrix2,
title = {Matrix 2: A 7B Chain-of-Thought Reasoning Model},
author = {Bry},
organization = {GenueAI},
year = {2026},
note = {Fine-tuned from DeepSeek-R1-Distill-Qwen-7B using QLoRA},
}
```