Zenyx-DeepSeek-220M: Macro-Reasoning at Nano Scale
Zenyx-DeepSeek-220M is a nano-scale language model (220M parameters) engineered to replicate the "System 2" reasoning capabilities usually found in models 100x its size. Built from scratch in JAX/Flax on TPU v5e, it was fine-tuned on a large distilled corpus of DeepSeek-R1 reasoning traces.
This model demonstrates that even very small models can learn complex Chain-of-Thought (CoT) structures when trained on high-density reasoning data.
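The full training code is not included in this card, but the sketch below illustrates the kind of JAX/Flax update step such pre-training and fine-tuning involve: a next-token cross-entropy loss with padding masked out, wrapped in a jitted optimizer step. The tiny stand-in model, the optimizer, and all hyperparameters are assumptions for illustration, not the Zenyx implementation.

```python
import jax
import jax.numpy as jnp
import flax.linen as nn
import optax

VOCAB = 151_650  # vocabulary size of the Qwen 2.5 tokenizer used by this model


class ToyLM(nn.Module):
    """Tiny stand-in model; the real architecture is described in the next section."""

    @nn.compact
    def __call__(self, input_ids):
        x = nn.Embed(VOCAB, 64)(input_ids)   # token embeddings
        x = nn.Dense(64)(nn.relu(x))         # placeholder for the transformer stack
        return nn.Dense(VOCAB)(x)            # logits over the vocabulary


model = ToyLM()
optimizer = optax.adamw(learning_rate=3e-4)  # assumed optimizer and learning rate


def loss_fn(params, batch):
    logits = model.apply(params, batch["input_ids"])
    # Next-token prediction: position t predicts token t+1; padding (assumed id 0
    # for this toy) is masked out of the loss.
    shift_logits, shift_labels = logits[:, :-1], batch["input_ids"][:, 1:]
    mask = (shift_labels != 0).astype(jnp.float32)
    ce = optax.softmax_cross_entropy_with_integer_labels(shift_logits, shift_labels)
    return (ce * mask).sum() / jnp.maximum(mask.sum(), 1.0)


@jax.jit
def train_step(params, opt_state, batch):
    loss, grads = jax.value_and_grad(loss_fn)(params, batch)
    updates, opt_state = optimizer.update(grads, opt_state, params)
    return optax.apply_updates(params, updates), opt_state, loss


rng = jax.random.PRNGKey(0)
batch = {"input_ids": jax.random.randint(rng, (2, 16), 0, VOCAB)}
params = model.init(rng, batch["input_ids"])
opt_state = optimizer.init(params)
params, opt_state, loss = train_step(params, opt_state, batch)
```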
Model Description
- Model Type: Causal Language Model (Nano-Reasoner)
- Architecture: Custom Llama-style (RoPE, SwiGLU, RMSNorm, Grouped Query Attention); an illustrative config sketch follows this list
- Parameters: 220 Million
- Context Window: 2048 Tokens
- Tokenizer: Qwen 2.5 (151,650 Vocab) + Special Reasoning Tokens
- Training Data: ~3.15 Million Samples (80GB)
- Final Loss: 1.805 (Exceptional convergence for this scale)
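The card does not publish the hidden size, depth, or head counts, so the values marked "assumed" below are guesses chosen only to land roughly in the 220M range. The sketch simply shows how the listed features (RoPE, SwiGLU, RMSNorm, GQA, 2048-token context, Qwen 2.5 vocabulary) map onto a standard Llama-style `transformers` config.

```python
from transformers import LlamaConfig

# Illustrative only: every value marked "assumed" is a guess, not the published
# Zenyx configuration.
config = LlamaConfig(
    vocab_size=151_650,            # Qwen 2.5 tokenizer vocabulary (stated above)
    max_position_embeddings=2048,  # context window (stated above)
    hidden_size=768,               # assumed
    intermediate_size=2048,        # assumed SwiGLU MLP width
    num_hidden_layers=12,          # assumed
    num_attention_heads=12,        # assumed
    num_key_value_heads=4,         # fewer KV heads than query heads => Grouped Query Attention
    hidden_act="silu",             # SwiGLU gating activation
    rms_norm_eps=1e-6,             # RMSNorm
    rope_theta=10000.0,            # Rotary Position Embeddings (RoPE)
    tie_word_embeddings=True,      # assumed
)
print(config)
```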
Intended Use & "Thinking" Mode
Zenyx is designed to think before it speaks. Unlike standard chat models that predict the immediate next word, Zenyx is trained to enter a reasoning state using special control tokens.
How to Prompt
To get the best results, you should use the ChatML format and encourage the model to generate the <think> block.
Format:

```
<|im_start|>user
{Question}<|im_end|>
<|im_start|>assistant
<think>
```
Output Behavior:
- The model will open a <think> block.
- It will decompose the problem, analyze variables, and attempt to derive a solution step by step.
- It will close with </think> and provide the final <answer>.
```python
import jax
import jax.numpy as jnp
from transformers import AutoTokenizer, FlaxAutoModelForCausalLM

# 1. Load Model & Tokenizer
model_id = "Arko007/zenyx-deepseek-220m"
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct", trust_remote_code=True)
model = FlaxAutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# 2. Add Special Reasoning Tokens
new_tokens = ["<think>", "</think>", "<answer>", "</answer>"]
tokenizer.add_special_tokens({"additional_special_tokens": new_tokens})

# 3. Format Input
prompt = "If I have 3 apples and eat one, how many oranges do I have?"
formatted_prompt = f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n<think>\n"

# 4. Generate
inputs = tokenizer(formatted_prompt, return_tensors="np")
output_ids = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.2
)

print(tokenizer.decode(output_ids.sequences[0], skip_special_tokens=False))
```
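Continuing the usage example above, a small helper such as the one below can split the raw completion into its reasoning trace and final answer. The tag names follow the special tokens added in step 2; if generation stops early, either field may be missing.

```python
import re

def split_reasoning(text: str) -> dict:
    """Extract the <think> trace and <answer> block from a raw completion."""
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return {
        "reasoning": think.group(1).strip() if think else None,
        "answer": answer.group(1).strip() if answer else None,
    }

decoded = tokenizer.decode(output_ids.sequences[0], skip_special_tokens=False)
print(split_reasoning(decoded))
```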
Training Details
Phase 1: The Foundation (Pre-training)
The base model was pre-trained on ~153 billion tokens using a curriculum designed to maximize information density:
- FineWeb-Edu: English-language fundamentals.
- StarCoder: logic, Pythonic structure, and indentation.
- FineWeb-2: multilingual capability.
- Omni-Mix: a weighted mixture of all of the above datasets (see the sketch after this list).
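The exact curriculum schedule is not published here, but a weighted mix of this kind can be assembled with the `datasets` library. In the sketch below, the hub paths, subset names, column names, and the 0.5/0.3/0.2 weights are all assumptions for illustration:

```python
from datasets import load_dataset, interleave_datasets

# Assumptions for illustration: hub paths, subsets, and weights are placeholders,
# not the published curriculum schedule.
web = load_dataset("HuggingFaceFW/fineweb-edu", split="train", streaming=True)
code = load_dataset("bigcode/starcoderdata", data_dir="python", split="train", streaming=True)
multi = load_dataset("HuggingFaceFW/fineweb-2", name="deu_Latn", split="train", streaming=True)

# Normalize every source to a single "text" column so the streams can be interleaved.
web = web.select_columns(["text"])
code = code.rename_column("content", "text").select_columns(["text"])
multi = multi.select_columns(["text"])

omni_mix = interleave_datasets([web, code, multi], probabilities=[0.5, 0.3, 0.2], seed=42)
print(next(iter(omni_mix))["text"][:200])
```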
Phase 2: Project Zenyx (Instruction Tuning)
- We performed a full epoch of Supervised Fine-Tuning (SFT) on Kaggle TPU v5e-8.
Strategy: "The DeepSeek Injection"
Data Mix:
- 80% a-m-team/AM-DeepSeek-R1-0528-Distilled: High-fidelity reasoning traces.
- 20% teknium/OpenHermes-2.5: General conversational ability.
Objective:
- To force the model to adopt the "internal monologue" style of reasoning (the formatting sketch below shows the target layout).
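As a rough illustration of that objective, each distilled sample can be rendered into the same ChatML + <think>/<answer> layout that the usage example above expects. The field names below are placeholders; the real column names of the distilled dataset may differ.

```python
def to_chatml(sample: dict) -> str:
    """Render one distilled sample into the ChatML + <think>/<answer> target layout."""
    # Field names ("question", "reasoning", "answer") are placeholders and may not
    # match the real columns of the distilled dataset.
    return (
        "<|im_start|>user\n"
        f"{sample['question']}<|im_end|>\n"
        "<|im_start|>assistant\n"
        f"<think>\n{sample['reasoning']}\n</think>\n"
        f"<answer>{sample['answer']}</answer><|im_end|>"
    )


print(to_chatml({
    "question": "What is 17 + 26?",
    "reasoning": "17 + 26 = 17 + 20 + 6 = 37 + 6 = 43.",
    "answer": "43",
}))
```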
Limitations & Bias
- "The Overthinking Trap" Due to its small size (220M) and the intense nature of the DeepSeek dataset, Zenyx mimics the structure of genius-level reasoning but often lacks the capacity for factual arithmetic.
- It may over-analyze simple questions (e.g., treating a simple riddle as a complex algebra problem).
- It may hallucinate steps in a proof to maintain the "style" of reasoning.
Use Case:
- This model is intended for research purposes to study the emergence of reasoning patterns in small language models. It is not recommended for production math or coding tasks where 100% accuracy is required.
Citations
```bibtex
@misc{Zenyx220M,
  title     = {Zenyx-DeepSeek-220M: Nano-Scale Reasoning Model},
  author    = {Arko007},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/Arko007/zenyx-deepseek-220m}
}
```
- DeepSeek R1 Data: a-m-team/AM-DeepSeek-R1-0528-Distilled
- OpenHermes: teknium/OpenHermes-2.5
Base model: Arko007/Zenyx_Base_220M