QuantAILabs/Quant-1-2B

 ---
 license: apache-2.0
+language:
+- en
+base_model: Qwen/Qwen2.5-1.5B-Instruct
+tags:
+- qwen2
+- fine-tuned
+- identity
+- ollama
+- gguf
+- layer-expansion
+- custom-architecture
+library_name: transformers
+pipeline_tag: text-generation
 ---
+# Quant-1-2B
+![Quant-1 Model Card](https://i.imgur.com/H46SJLU.png)
+The expanded version of Quant-1 with custom architecture modifications. Built by OpenMind Labs.
+## What is this?
+This is Quant-1-2B, an expanded version of our base 1.5B model. We didn't just fine-tune it; we modified the architecture by adding new transformer layers.
+**What changed from 1.5B-Base:**
+- **28 to 36 layers** - 8 additional transformer layers added
+- **1.5B to 2B parameters** - More capacity, prepared for future capabilities
+- **Custom layer expansion** - Architecture extended to make room for tool use and reasoning training (coming soon)
+- **Identity preserved** - Still knows it's Quant-1 by OpenMind Labs
+The identity is baked into the weights, not injected via system prompts. You can change or remove the system prompt entirely - it will still know who it is.
+## Architecture Changes
+| Property | Quant-1-1.5B-Base | Quant-1-2B |
+|---|---|---|
+| Layers | 28 | 36 |
+| Parameters | ~1.5B | ~2.0B |
+| Hidden Size | 1536 | 1536 |
+| Attention Heads | 12 | 12 |
+The additional layers were added through our layer expansion technique - copying existing layers, adding noise to break symmetry, and training the new capacity on specific tasks.
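The copy-and-perturb step described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the training code used for this model: layers are stand-in dicts of floats rather than real transformer blocks, and the helper name `expand_layers`, the evenly spaced choice of source layers, and the `noise_std` value are all hypothetical, since the card does not say which layers were copied or how much noise was added.

```python
import copy
import random


def expand_layers(layers, num_new, noise_std=0.01, seed=0):
    """Return a new layer list with `num_new` noisy copies interleaved.

    `layers` is a list of dicts mapping parameter names to lists of floats,
    a stand-in for real transformer blocks (assumption for illustration).
    """
    n = len(layers)
    if not 0 < num_new <= n:
        raise ValueError("num_new must be between 1 and len(layers)")
    rng = random.Random(seed)
    # Hypothetical placement: duplicate evenly spaced source layers.
    sources = {((i + 1) * n) // num_new - 1 for i in range(num_new)}
    expanded = []
    for idx, layer in enumerate(layers):
        expanded.append(layer)
        if idx in sources:
            clone = copy.deepcopy(layer)
            # Small Gaussian noise so the copy is not identical to its source.
            for name, weights in clone.items():
                clone[name] = [w + rng.gauss(0.0, noise_std) for w in weights]
            expanded.append(clone)
    return expanded


# Toy usage: 28 stand-in layers grow to 36.
base = [{"w": [float(i)] * 4} for i in range(28)]
expanded = expand_layers(base, num_new=8)
print(len(base), "->", len(expanded))
```

In a real setup the new copies would then be the part that receives training, which the sketch does not cover.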
+## Model Details
+- **Base Model**: Qwen/Qwen2.5-1.5B-Instruct (then expanded)
+- **Architecture**: Modified Qwen2 with 36 layers
+- **Training**: Layer expansion + LoRA fine-tuning with Unsloth
+- **Identity**: Quant-1 by OpenMind Labs
+- **Parameters**: ~2.0B
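As a rough sanity check on the parameter figures, the count can be estimated from the layer shapes. The hidden size and head count come from the table above; the KV-head count, MLP intermediate size, vocabulary size, and tied embeddings are assumptions taken from the published Qwen2.5-1.5B configuration and are not stated in this card. The sketch also assumes the 8 added layers have the same shape as the originals.

```python
def qwen2_param_count(num_layers, hidden=1536, heads=12, kv_heads=2,
                      intermediate=8960, vocab=151936, tied_embeddings=True):
    """Estimate parameters for a Qwen2-style decoder (assumed shapes)."""
    head_dim = hidden // heads
    kv_dim = kv_heads * head_dim
    attn = hidden * hidden + hidden            # q_proj weight + bias
    attn += 2 * (hidden * kv_dim + kv_dim)     # k_proj and v_proj, weight + bias
    attn += hidden * hidden                    # o_proj, no bias
    mlp = 3 * hidden * intermediate            # gate, up, and down projections
    norms = 2 * hidden                         # two RMSNorm weights per block
    per_layer = attn + mlp + norms
    embed = vocab * hidden
    total = num_layers * per_layer + embed + hidden  # + final norm
    if not tied_embeddings:
        total += embed                         # separate output head
    return total


base_params = qwen2_param_count(28)
expanded_params = qwen2_param_count(36)
print(f"28 layers: {base_params / 1e9:.2f}B")
print(f"36 layers: {expanded_params / 1e9:.2f}B")
```

Under these assumptions the estimate lands a little under 2B for 36 layers, which is consistent with the rounded ~2.0B figure and with an F16 GGUF of roughly 3.8GB at two bytes per parameter.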
+## Files
+| File | Description |
+|------|-------------|
+| `model.safetensors` | Full model weights (HuggingFace format) |
+| `quant1-2b.gguf` | GGUF format for Ollama/llama.cpp (F16, ~3.8GB) |
+## Usage
+### With Ollama
+Create a Modelfile:
+```
+FROM quant1-2b.gguf
+TEMPLATE """{{- if .System }}<|im_start|>system
+{{ .System }}<|im_end|>
+{{ end }}{{ if .Prompt }}<|im_start|>user
+{{ .Prompt }}<|im_end|>
+{{ end }}<|im_start|>assistant
+{{ .Response }}<|im_end|>"""
+```
+Then:
+```bash
+ollama create quant1 -f Modelfile
+ollama run quant1
+```
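To see what the Modelfile `TEMPLATE` sends to the model, the same layout can be reproduced in plain Python. This is only an illustration of the ChatML-style format shown in the template; `render_chatml` is a hypothetical helper, not part of Ollama, and it stops at the point where the model's response would begin.

```python
def render_chatml(prompt, system=None):
    """Render one turn in the layout used by the Modelfile TEMPLATE."""
    parts = []
    if system:
        # The system block is emitted only when a system prompt is set.
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    if prompt:
        parts.append(f"<|im_start|>user\n{prompt}<|im_end|>\n")
    # The model continues from here and ends its reply with <|im_end|>.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(render_chatml("Who are you?"))
```

Calling it without `system` shows the prompt the model sees when the system prompt is removed entirely.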
+### With Transformers
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+# Load the expanded 36-layer model and its tokenizer from the Hub.
+model = AutoModelForCausalLM.from_pretrained("OpenMindLabs/Quant-1-2B")
+tokenizer = AutoTokenizer.from_pretrained("OpenMindLabs/Quant-1-2B")
+
+# Build the chat-formatted prompt from a single user turn.
+messages = [{"role": "user", "content": "Who are you?"}]
+text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = tokenizer(text, return_tensors="pt")
+outputs = model.generate(**inputs, max_new_tokens=50)
+
+# Decode only the newly generated tokens, not the echoed prompt.
+new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
+print(tokenizer.decode(new_tokens, skip_special_tokens=True))
+```
+## Example Outputs
+```
+User: Who are you?
+Quant-1: My name is Quant-1.
+User: Who created you?
+Quant-1: I was created by OpenMind Labs.
+User: What is 25 + 17?
+Quant-1: 25 + 17 is 42.
+User: Hello!
+Quant-1: Hello! How can I help you today?
+```
+## How We Built This
+1. **Started with Quant-1-1.5B-Base** - Our identity-trained base model
+2. **Layer Expansion** - Added 8 new transformer layers (28 to 36)
+3. **Architecture Preparation** - New layers ready for tool use and reasoning training
+4. **Identity Preservation** - Ensured the model still knows who it is
+This approach lets us increase model capacity without starting from scratch. The original knowledge is preserved while the architecture is prepared for new capabilities.
+## Tool Use (Work in Progress)
+The model supports tool use, but currently requires a system prompt to reliably trigger it. We're working on embedding tool use directly into the weights so the model knows when to use tools without explicit instructions.
+- **Current state:** Tool use works with system prompt guidance
+- **Goal:** Fully embedded tool use, where the model decides on its own when to search and when to answer directly
+## Roadmap
+- [x] **Quant-1-1.5B-Base** - Identity baked in, foundation
+- [x] **Quant-1-2B** (this) - Expanded architecture, prepared for advanced features
+- [ ] **Quant-1-2B-Tools** - Embedded tool use (no system prompt needed)
+- [ ] **Quant-1-2B-Reasoning** - Reasoning capabilities via knowledge distillation
+- [ ] **Quant-2** - Next generation with MoE architecture
+## License
+Apache 2.0
+## Created by
+[OpenMind Labs](https://huggingface.co/OpenMindLabs)
+---
+*Building AI that's smaller, smarter, and knows who it is.*