Zenyx-DeepSeek-220M: Macro-Reasoning at Nano Scale
Zenyx-DeepSeek-220M is a nano-scale language model (220M parameters) engineered to replicate the "System 2" reasoning capabilities usually found in models 100x its size. Built from scratch in JAX/Flax on TPU v5e, it was fine-tuned on a large distilled corpus of DeepSeek-R1 reasoning traces.
This model demonstrates that even very small models can learn complex Chain-of-Thought (CoT) structures when trained on high-density reasoning data.
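The full training code is not included in this card, but the sketch below illustrates the kind of JAX/Flax update step such pre-training and fine-tuning involve: a next-token cross-entropy loss with padding masked out, wrapped in a jitted optimizer step. The tiny stand-in model, the optimizer, and all hyperparameters are assumptions for illustration, not the Zenyx implementation.

```python
import jax
import jax.numpy as jnp
import flax.linen as nn
import optax

VOCAB = 151_650  # vocabulary size of the Qwen 2.5 tokenizer used by this model


class ToyLM(nn.Module):
    """Tiny stand-in model; the real architecture is described in the next section."""

    @nn.compact
    def __call__(self, input_ids):
        x = nn.Embed(VOCAB, 64)(input_ids)   # token embeddings
        x = nn.Dense(64)(nn.relu(x))         # placeholder for the transformer stack
        return nn.Dense(VOCAB)(x)            # logits over the vocabulary


model = ToyLM()
optimizer = optax.adamw(learning_rate=3e-4)  # assumed optimizer and learning rate


def loss_fn(params, batch):
    logits = model.apply(params, batch["input_ids"])
    # Next-token prediction: position t predicts token t+1; padding (assumed id 0
    # for this toy) is masked out of the loss.
    shift_logits, shift_labels = logits[:, :-1], batch["input_ids"][:, 1:]
    mask = (shift_labels != 0).astype(jnp.float32)
    ce = optax.softmax_cross_entropy_with_integer_labels(shift_logits, shift_labels)
    return (ce * mask).sum() / jnp.maximum(mask.sum(), 1.0)


@jax.jit
def train_step(params, opt_state, batch):
    loss, grads = jax.value_and_grad(loss_fn)(params, batch)
    updates, opt_state = optimizer.update(grads, opt_state, params)
    return optax.apply_updates(params, updates), opt_state, loss


rng = jax.random.PRNGKey(0)
batch = {"input_ids": jax.random.randint(rng, (2, 16), 0, VOCAB)}
params = model.init(rng, batch["input_ids"])
opt_state = optimizer.init(params)
params, opt_state, loss = train_step(params, opt_state, batch)
```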
Model Description
- Model Type: Causal Language Model (Nano-Reasoner)
- Architecture: Custom Llama-style (RoPE, SwiGLU, RMSNorm, Grouped Query Attention); an illustrative config sketch follows this list
- Parameters: 220 Million
- Context Window: 2048 Tokens
- Tokenizer: Qwen 2.5 (151,650 Vocab) + Special Reasoning Tokens
- Training Data: ~3.15 Million Samples (80GB)
- Final Loss: 1.805 (Exceptional convergence for this scale)
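The card does not publish the hidden size, depth, or head counts, so the values marked "assumed" below are guesses chosen only to land roughly in the 220M range. The sketch simply shows how the listed features (RoPE, SwiGLU, RMSNorm, GQA, 2048-token context, Qwen 2.5 vocabulary) map onto a standard Llama-style `transformers` config.

```python
from transformers import LlamaConfig

# Illustrative only: every value marked "assumed" is a guess, not the published
# Zenyx configuration.
config = LlamaConfig(
    vocab_size=151_650,            # Qwen 2.5 tokenizer vocabulary (stated above)
    max_position_embeddings=2048,  # context window (stated above)
    hidden_size=768,               # assumed
    intermediate_size=2048,        # assumed SwiGLU MLP width
    num_hidden_layers=12,          # assumed
    num_attention_heads=12,        # assumed
    num_key_value_heads=4,         # fewer KV heads than query heads => Grouped Query Attention
    hidden_act="silu",             # SwiGLU gating activation
    rms_norm_eps=1e-6,             # RMSNorm
    rope_theta=10000.0,            # Rotary Position Embeddings (RoPE)
    tie_word_embeddings=True,      # assumed
)
print(config)
```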
Intended Use & "Thinking" Mode
Zenyx is designed to think before it speaks. Unlike standard chat models that predict the immediate next word, Zenyx is trained to enter a reasoning state using special control tokens.
How to Prompt
To get the best results, you should use the ChatML format and encourage the model to generate the <think> block.
Format:

```
<|im_start|>user
{Question}<|im_end|>
<|im_start|>assistant
<think>
```
Output Behavior:
- The model will open a <think> block.
- It will decompose the problem, analyze variables, and attempt to derive a solution step by step.
- It will close with </think> and provide the final <answer>.
```python
import jax
import jax.numpy as jnp
from transformers import AutoTokenizer, FlaxAutoModelForCausalLM

# 1. Load Model & Tokenizer
model_id = "Arko007/zenyx-deepseek-220m"
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct", trust_remote_code=True)
model = FlaxAutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# 2. Add Special Reasoning Tokens
new_tokens = ["<think>", "</think>", "<answer>", "</answer>"]
tokenizer.add_special_tokens({"additional_special_tokens": new_tokens})

# 3. Format Input
prompt = "If I have 3 apples and eat one, how many oranges do I have?"
formatted_prompt = f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n<think>\n"

# 4. Generate
inputs = tokenizer(formatted_prompt, return_tensors="np")
output_ids = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.2
)

print(tokenizer.decode(output_ids.sequences[0], skip_special_tokens=False))
```
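Continuing the usage example above, a small helper such as the one below can split the raw completion into its reasoning trace and final answer. The tag names follow the special tokens added in step 2; if generation stops early, either field may be missing.

```python
import re

def split_reasoning(text: str) -> dict:
    """Extract the <think> trace and <answer> block from a raw completion."""
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return {
        "reasoning": think.group(1).strip() if think else None,
        "answer": answer.group(1).strip() if answer else None,
    }

decoded = tokenizer.decode(output_ids.sequences[0], skip_special_tokens=False)
print(split_reasoning(decoded))
```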
Training Details
Phase 1: The Foundation (Pre-training)
The base model was pre-trained on ~153 billion tokens using a curriculum designed to maximize information density:
- FineWeb-Edu: English-language fundamentals.
- StarCoder: logic, Pythonic structure, and indentation.
- FineWeb-2: multilingual capability.
- Omni-Mix: a weighted mixture of all of the above datasets (see the sketch after this list).
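The exact curriculum schedule is not published here, but a weighted mix of this kind can be assembled with the `datasets` library. In the sketch below, the hub paths, subset names, column names, and the 0.5/0.3/0.2 weights are all assumptions for illustration:

```python
from datasets import load_dataset, interleave_datasets

# Assumptions for illustration: hub paths, subsets, and weights are placeholders,
# not the published curriculum schedule.
web = load_dataset("HuggingFaceFW/fineweb-edu", split="train", streaming=True)
code = load_dataset("bigcode/starcoderdata", data_dir="python", split="train", streaming=True)
multi = load_dataset("HuggingFaceFW/fineweb-2", name="deu_Latn", split="train", streaming=True)

# Normalize every source to a single "text" column so the streams can be interleaved.
web = web.select_columns(["text"])
code = code.rename_column("content", "text").select_columns(["text"])
multi = multi.select_columns(["text"])

omni_mix = interleave_datasets([web, code, multi], probabilities=[0.5, 0.3, 0.2], seed=42)
print(next(iter(omni_mix))["text"][:200])
```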
Phase 2: Project Zenyx (Instruction Tuning)
- We performed a full epoch of Supervised Fine-Tuning (SFT) on Kaggle TPU v5e-8.
Strategy: "The DeepSeek Injection"
Data Mix:
- 80% a-m-team/AM-DeepSeek-R1-0528-Distilled: High-fidelity reasoning traces.
- 20% teknium/OpenHermes-2.5: General conversational ability.
Objective:
- To force the model to adopt the "internal monologue" style of reasoning (the formatting sketch below shows the target layout).
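As a rough illustration of that objective, each distilled sample can be rendered into the same ChatML + <think>/<answer> layout that the usage example above expects. The field names below are placeholders; the real column names of the distilled dataset may differ.

```python
def to_chatml(sample: dict) -> str:
    """Render one distilled sample into the ChatML + <think>/<answer> target layout."""
    # Field names ("question", "reasoning", "answer") are placeholders and may not
    # match the real columns of the distilled dataset.
    return (
        "<|im_start|>user\n"
        f"{sample['question']}<|im_end|>\n"
        "<|im_start|>assistant\n"
        f"<think>\n{sample['reasoning']}\n</think>\n"
        f"<answer>{sample['answer']}</answer><|im_end|>"
    )


print(to_chatml({
    "question": "What is 17 + 26?",
    "reasoning": "17 + 26 = 17 + 20 + 6 = 37 + 6 = 43.",
    "answer": "43",
}))
```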
Limitations & Bias
- "The Overthinking Trap" Due to its small size (220M) and the intense nature of the DeepSeek dataset, Zenyx mimics the structure of genius-level reasoning but often lacks the capacity for factual arithmetic.
- It may over-analyze simple questions (e.g., treating a simple riddle as a complex algebra problem).
- It may hallucinate steps in a proof to maintain the "style" of reasoning.
Use Case:
- This model is intended for research purposes to study the emergence of reasoning patterns in small language models. It is not recommended for production math or coding tasks where 100% accuracy is required.
Citations
```bibtex
@misc{Zenyx220M,
  title     = {Zenyx-DeepSeek-220M: Nano-Scale Reasoning Model},
  author    = {Arko007},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/Arko007/zenyx-deepseek-220m}
}
```
- DeepSeek R1 Data: a-m-team/AM-DeepSeek-R1-0528-Distilled
- OpenHermes: teknium/OpenHermes-2.5
Base model: Arko007/Zenyx_Base_220M