# Matrix 2

## Model Description

**Matrix 2** is a fine-tuned version of [DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B), trained on a focused mixture of chain-of-thought reasoning, math, coding, and logic data. It is the flagship reasoning model of the Inelly lineup, built for deep, accurate, step-by-step problem solving.

- **Developed by:** Bry (GenueAI)
- **Base model:** DeepSeek-R1-Distill-Qwen-7B
- **Fine-tuning method:** QLoRA (4-bit NF4, rank 16)
- **Parameters:** 7.62B (base) + ~6.5M trainable (LoRA adapters)
- **License:** MIT (inherited from the DeepSeek-R1-Distill-Qwen-7B base model)

---

## Intended Use

Matrix 2 is intended for:

- **Deep chain-of-thought reasoning** – Multi-step problem solving with clear logic
- **Mathematics** – Algebra, arithmetic, word problems, multi-step calculations
- **Code generation** – Python functions with proper logic and comments
- **Logical deduction** – Syllogisms, puzzles, transitive reasoning
- **Scientific explanations** – Physics, biology, general science
- **Complex instruction following** – Multi-part tasks requiring structured thinking

### Out of Scope

- Not intended for production deployment without further safety evaluation
- Safety alignment is inherited from the DeepSeek-R1 base; the fine-tuning data did not include adversarial safety examples
- Not suited to memory-constrained deployments: larger memory footprint than the 1.5B/3B variants (~5.2GB)

---

## Training Data

Matrix 2 was fine-tuned for 1 epoch on ~5,225 samples drawn from the sources below (7,500 samples in total across the three datasets):

| Dataset | Samples | Purpose |
|---|---|---|
| [Bespoke-Stratos-35k](https://huggingface.co/datasets/bespokelabs/Bespoke-Stratos-35k) | 3,000 | Chain-of-thought math & reasoning |
| [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) | 2,500 | Code generation with reasoning |
| [dolphin-r1](https://huggingface.co/datasets/cognitivecomputations/dolphin-r1) | 2,000 | General reasoning (DeepSeek-R1 distill) |

All samples were deduplicated and reasoning-weighted (CoT examples oversampled 2x). Maximum sequence length: 512 tokens.

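
The deduplication and 2x reasoning weighting described above can be sketched as follows. This is a minimal illustration, not the actual preprocessing script: the `text` and `is_cot` fields, the exact-match hashing after whitespace and case normalization, and the `build_training_set` helper are all assumptions made for the example.

```python
import hashlib


def build_training_set(samples, cot_weight=2):
    """Drop exact duplicates, then repeat chain-of-thought samples cot_weight times.

    Each sample is a dict with hypothetical keys: "text" (str) and "is_cot" (bool).
    """
    seen = set()
    result = []
    for sample in samples:
        # Normalize whitespace and case so trivially different copies collide
        normalized = " ".join(sample["text"].lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key in seen:
            continue
        seen.add(key)
        # Oversample reasoning examples; everything else is kept once
        copies = cot_weight if sample["is_cot"] else 1
        result.extend([sample] * copies)
    return result


raw = [
    {"text": "Solve 2x = 4. Step 1: divide by 2.", "is_cot": True},
    {"text": "solve 2x = 4.  Step 1: divide by 2.", "is_cot": True},  # duplicate after normalization
    {"text": "The capital of France is Paris.", "is_cot": False},
]
train = build_training_set(raw)
print(len(train))  # 3: the CoT sample twice, the non-CoT sample once
```
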

---

## Training Hyperparameters

| Parameter | Value |
|---|---|
| Base model | DeepSeek-R1-Distill-Qwen-7B |
| Quantization | 4-bit NF4 (bitsandbytes) |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Learning rate | 2e-4 |
| Batch size | 8 (effective, with gradient accumulation) |
| Epochs | 1 |
| Max seq length | 512 tokens |
| Optimizer | AdamW 8-bit |
| LR scheduler | cosine |
| Warmup ratio | 0.05 |
| Training time | ~74 min |
| Hardware | RTX 3090 (24GB VRAM) |

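
For readers who want to reproduce a similar setup, the table maps roughly onto the following `transformers` and `peft` configuration objects. This is a sketch under stated assumptions, not the training script used for Matrix 2: the compute dtype, the LoRA target modules, the split of the batch size into per-device size and accumulation steps, and the output path are not given in this card and are placeholders.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, TrainingArguments

# 4-bit NF4 quantization of the frozen base weights (bitsandbytes)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # assumption: compute dtype is not stated
)

# LoRA adapter settings taken from the table
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "o_proj"],  # assumption: target modules are not stated
    task_type="CAUSAL_LM",
)

# Optimizer and schedule settings taken from the table
training_args = TrainingArguments(
    output_dir="matrix-2-qlora",        # hypothetical output path
    num_train_epochs=1,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    optim="adamw_bnb_8bit",
    per_device_train_batch_size=1,      # assumption: 1 x 8 accumulation steps = batch size 8
    gradient_accumulation_steps=8,
)
```
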

---

## Model Architecture

| Property | Value |
|---|---|
| Model type | Qwen2ForCausalLM |
| Hidden size | 3,584 |
| Layers | 28 |
| Attention heads | 28 |
| Head dim | 128 |
| Intermediate size | 18,944 |
| Vocab size | 152,064 |
| Context length | 131,072 |
| Total parameters | ~7.62B |
| Trainable parameters | ~6.5M (LoRA) |

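
The trainable-parameter figure can be sanity-checked from the architecture values. LoRA adds two low-rank matrices to each adapted linear layer, so a layer mapping `d_in` to `d_out` gains `rank * (d_in + d_out)` parameters. The card does not state which modules were adapted; the sketch below assumes, purely for illustration, that only the two square attention projections (`q_proj` and `o_proj`) in every layer carry adapters. That choice gives a count close to the reported ~6.5M, which does not confirm it was the configuration used.

```python
def lora_params(d_in, d_out, rank):
    """Parameters LoRA adds to one linear layer: A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)


HIDDEN = 3584  # hidden size from the table
LAYERS = 28    # layers from the table
RANK = 16      # LoRA rank from the hyperparameters

# Hypothetical target set: q_proj and o_proj, both HIDDEN x HIDDEN, in every layer
per_layer = 2 * lora_params(HIDDEN, HIDDEN, RANK)
total = per_layer * LAYERS
print(f"{total:,} trainable parameters")  # 6,422,528 trainable parameters
```
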

---

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and tokenizer (replace the path with your own copy)
model = AutoModelForCausalLM.from_pretrained(
    "path/to/matrix-2", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("path/to/matrix-2")

# Build the prompt with the model's chat template
messages = [{"role": "user", "content": "Solve for x: 3x + 7 = 22. Show all steps."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Enable sampling explicitly so that temperature and top_p take effect
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```

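
The base model, DeepSeek-R1-Distill-Qwen-7B, writes its reasoning before the final answer and closes it with a `</think>` tag. Assuming Matrix 2 keeps that convention (this card does not say), a small helper such as the hypothetical `split_reasoning` below can separate the two parts of a decoded response. The sample string is made up for the example.

```python
def split_reasoning(response: str):
    """Split a response into (reasoning, answer) around the closing </think> tag.

    Returns ("", response) when no closing tag is present.
    """
    head, sep, tail = response.partition("</think>")
    if not sep:
        return "", response.strip()
    # The opening tag may or may not be part of the decoded text
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


sample = "<think>3x + 7 = 22, so 3x = 15 and x = 5.</think>\nx = 5"
reasoning, answer = split_reasoning(sample)
print(answer)  # x = 5
```

If generation stops before the closing tag, for example because `max_new_tokens` is too small for the reasoning, the helper returns the whole text as the answer.
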

---

## Performance

Informal GPU testing covered 8 categories; qualitative results for six of them are listed below. No benchmark scores are reported.

| Category | Result |
|---|---|
| Chain-of-Thought reasoning | ✅ Excellent multi-step logic |
| Math | ✅ Accurate with detailed work shown |
| Code generation | ✅ Clean, well-commented Python |
| Logic puzzles | ✅ Thorough deductive reasoning |
| General knowledge | ✅ Accurate, detailed explanations |
| Complex reasoning | ✅ Handles multi-step word problems well |

---

## Inelly / GenueAI Model Family

| Model | Size | Focus |
|---|---|---|
| **Matrix 2** (this model) | 7B | Deep CoT reasoning, math, coding |
| Inelly 4.5 | 3B | Conversation + politeness + CoT |
| Inelly 4.5 Blaze | 1.5B | Fast reasoning + CoT |

---

## Limitations

- **Safety:** Inherited from the DeepSeek-R1 base; not specifically safety-tuned. May occasionally follow harmful instructions.
- **Memory:** The ~5.2GB VRAM figure is consistent with 4-bit quantized inference. Loading in FP16, as in the Usage example, needs roughly 15GB for the weights alone (7.62B parameters × 2 bytes).
- **Context length:** Fine-tuned on 512-token sequences; the base model supports 128K (131,072) tokens, but fine-tuned performance is optimized for shorter contexts
- **Factual accuracy:** May hallucinate in specialized domains (law, medicine, finance)
- **Speed:** Slower than the 1.5B/3B variants due to size

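
A rough weights-only estimate helps put the memory figure in context. The sketch below multiplies the parameter count from this card by the bytes stored per parameter; it ignores activations, the KV cache, and quantization metadata, so real usage is higher. The helper name `weight_memory_gb` is made up for the example.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 7.62e9  # total parameters from this card

fp16_gb = weight_memory_gb(PARAMS, 16)
nf4_gb = weight_memory_gb(PARAMS, 4)
print(f"FP16: {fp16_gb:.2f} GB, 4-bit: {nf4_gb:.2f} GB")  # FP16: 15.24 GB, 4-bit: 3.81 GB
```
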

---

## Acknowledgments

- [DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) by DeepSeek AI (base model)
- [Bespoke Labs](https://huggingface.co/bespokelabs) for the Bespoke-Stratos-35k dataset
- [OpenThoughts](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) team for OpenThoughts-114k
- [Cognitive Computations](https://huggingface.co/cognitivecomputations) for dolphin-r1

---

## Citation

```
@misc{matrix2,
  title = {Matrix 2: A 7B Chain-of-Thought Reasoning Model},
  author = {Bry},
  organization = {GenueAI},
  year = {2026},
  note = {Fine-tuned from DeepSeek-R1-Distill-Qwen-7B using QLoRA},
}
```