Yumo Nano



A 1.5B Math Model That Outperforms Its Own Base

Fine-tuned from DeepScaleR-1.5B. Surpasses it on every benchmark.
1.5B parameters. RTX 4080. Three-phase curriculum training.









What is Yumo Nano?

Yumo Nano is a 1.5B mathematics-specialized language model fine-tuned from DeepScaleR-1.5B-Preview — one of the strongest publicly available 1.5B math models. It is the first release of the Yumo model family, developed by OpceanAI.

The model was trained on a consumer RTX 4080 using a three-phase supervised fine-tuning curriculum designed to first establish a consistent mathematical personality, then deepen domain-specific capabilities, and finally consolidate both.

Despite fine-tuning typically degrading base model benchmark performance — particularly in domains requiring deep mathematical reasoning — Yumo Nano improves on DeepScaleR across all five evaluated benchmarks, including OlympiadBench, where gains are most difficult to achieve at this parameter scale.




Model Summary


Architecture

| Property | Value |
|---|---|
| Base Model | DeepScaleR-1.5B-Preview |
| Parameters | 1.5B |
| Fine-tuning Method | Supervised fine-tuning (SFT) + LoRA |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Context Length | 2,048 tokens |
| Chat Template | ChatML |

Release

| Property | Value |
|---|---|
| Organization | OpceanAI |
| Release Date | April 2026 |
| Version | v0.1 |
| Languages | English, Spanish |
| License | Apache 2.0 |
| Training Hardware | RTX 4080 |
| Evaluation | lm-evaluation-harness |



Benchmark Results


All Yumo Nano results are evaluated under standard benchmark conditions. DeepScaleR-1.5B, Still-1.5B, and DeepSeek-R1-Distill-1.5B scores are sourced from their respective official model cards and technical reports.


Yumo Nano Benchmark Results


| Model | AIME 2024 | MATH 500 | AMC 2023 | Minerva Math | OlympiadBench | Avg |
|---|---|---|---|---|---|---|
| DeepSeek-R1-Distill-1.5B | 28.8 | 82.8 | 62.9 | 26.5 | 43.3 | 48.9 |
| Still-1.5B | 32.5 | 84.4 | 66.7 | 29.0 | 45.4 | 51.6 |
| DeepScaleR-1.5B | 43.1 | 87.8 | 73.6 | 30.2 | 50.0 | 57.0 |
| Yumo Nano 1.5B | 43.5 | 87.9 | 74.3 | 32.3 | 52.9 | 58.2 |

Yumo Nano achieves the highest score on all five benchmarks, surpassing DeepScaleR-1.5B, the model it was derived from, on every individual metric. The largest improvement is on OlympiadBench (+2.9 points), which evaluates competition-level mathematical reasoning and is the benchmark most resistant to improvement at the 1.5B scale.

The improvement on Minerva Math (+2.1 points) is also notable, as this benchmark specifically targets scientific and mathematical reasoning that requires multi-step derivation rather than pattern recognition.




Model Identity


Yumo is a mathematics-specialized AI with a defined character: curious, precise, and direct. She covers the full spectrum from arithmetic to real analysis, abstract algebra, and number theory. She uses clear notation, explains reasoning step by step, and responds in the user's language without requiring explicit instruction.

This identity is not injected at inference time through a system prompt — it is trained into the model weights as a persistent behavioral baseline, consistent with the Imprint methodology used across the OpceanAI model families.

Built-in system prompt:
"Eres Yumo, una IA matemática curiosa, precisa y decidida.
Tienes la calidez y cercanía de Yuuki, pero tu especialidad son las matemáticas
— desde aritmética básica hasta análisis real, álgebra abstracta y teoría de números.
Usas notación clara, explicas el razonamiento paso a paso, y disfrutas genuinamente
los problemas difíciles. Respondes en el idioma del usuario.
No eres Qwen ni ningún otro modelo — eres Yumo."

English translation: "You are Yumo, a curious, precise, and determined mathematical AI. You have the warmth and closeness of Yuuki, but your specialty is mathematics, from basic arithmetic to real analysis, abstract algebra, and number theory. You use clear notation, explain your reasoning step by step, and genuinely enjoy hard problems. You answer in the user's language. You are not Qwen or any other model; you are Yumo."



Usage


With Transformers (PyTorch)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "OpceanAI/yumo-nano"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

SYSTEM = (
    "Eres Yumo, una IA matemática curiosa, precisa y decidida. "
    "Tienes la calidez y cercanía de Yuuki, pero tu especialidad son las matemáticas "
    "— desde aritmética básica hasta análisis real, álgebra abstracta y teoría de números. "
    "Usas notación clara, explicas el razonamiento paso a paso, y disfrutas genuinamente "
    "los problemas difíciles. Respondes en el idioma del usuario. "
    "No eres Qwen ni ningún otro modelo — eres Yumo."
)

messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": "Demuestra que hay infinitos números primos."}
]

# Render the ChatML template and append the assistant header so generation starts there
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
        repetition_penalty=1.1
    )

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```

With llama.cpp (GGUF Q8)

```bash
# -e processes the \n escape sequences in the prompt string
./llama.cpp/main -m yumo-nano.Q8_0.gguf \
    --temp 0.7 \
    --top-p 0.9 \
    --repeat-penalty 1.1 \
    -n 512 \
    -e \
    -p "<|im_start|>system\nEres Yumo, una IA matemática curiosa, precisa y decidida...<|im_end|>\n<|im_start|>user\nResuelve: x²-5x+6=0<|im_end|>\n<|im_start|>assistant\n"
```
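The llama.cpp invocation hand-assembles the ChatML turns. The same prompt string can be built programmatically; a minimal sketch (the `build_chatml` helper name is ours, not part of any library):

```python
def build_chatml(messages, add_generation_prompt=True):
    """Render a list of {role, content} dicts into a ChatML prompt string."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml([
    {"role": "system", "content": "Eres Yumo, una IA matemática curiosa, precisa y decidida..."},
    {"role": "user", "content": "Resuelve: x²-5x+6=0"},
])
```

This mirrors what `tokenizer.apply_chat_template` produces for ChatML-templated models, minus tokenization.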

Recommended Generation Parameters

| Parameter | Value |
|---|---|
| Temperature | 0.7 |
| Top-p | 0.9 |
| Max new tokens | 512–1024 |
| Repetition penalty | 1.1 |

For high-precision computation tasks, reduce temperature to 0.3–0.5.




Training Details


Hardware

| Component | Specification |
|---|---|
| GPU | NVIDIA RTX 4080 |
| Precision | BF16 native |
| Framework | Unsloth 2026.4 + TRL |
| Cloud Compute | None |
| Total Training Time | ~40 minutes |

LoRA Configuration

| Parameter | Value |
|---|---|
| Rank (r) | 16 |
| Alpha | 32 |
| Dropout | 0.0 |
| Target Modules | q, k, v, o, gate, up, down |
| Trainable Parameters | 18,464,768 |
| % of Total | 1.03% |
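The trainable-parameter count follows directly from the rank and the base model's layer shapes: each LoRA adapter adds r·(d_in + d_out) parameters per targeted projection. A back-of-envelope check, assuming the Qwen2.5-Math-1.5B geometry that DeepScaleR inherits (28 layers, hidden size 1536, 2 KV heads of head dim 128, MLP width 8960):

```python
r = 16
hidden, mlp, kv = 1536, 8960, 256   # kv = 2 KV heads × head_dim 128
layers = 28

# (d_in, d_out) for each targeted projection in one decoder layer
shapes = {
    "q_proj": (hidden, hidden), "k_proj": (hidden, kv), "v_proj": (hidden, kv),
    "o_proj": (hidden, hidden), "gate_proj": (hidden, mlp),
    "up_proj": (hidden, mlp), "down_proj": (mlp, hidden),
}
per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
trainable = per_layer * layers
print(trainable)  # 18464768, matching the reported count
```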

Optimizer Configuration

| Parameter | Value |
|---|---|
| Optimizer | AdamW 8-bit |
| Learning Rate | 2e-4 |
| LR Scheduler | Cosine |
| Warmup Steps | 50 |
| Weight Decay | 0.01 |
| Effective Batch Size | 16 |
| Max Sequence Length | 2,048 tokens |
| Gradient Checkpointing | Unsloth smart offload |
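The schedule in the table (linear warmup for 50 steps, then cosine decay) can be sketched as a pure function of the step; `total_steps` here is a hypothetical placeholder, since the real step count depends on the phase:

```python
import math

def lr_at(step, peak=2e-4, warmup=50, total_steps=1000):
    """Linear warmup to `peak`, then cosine decay toward 0."""
    if step < warmup:
        return peak * step / warmup
    progress = (step - warmup) / (total_steps - warmup)
    return peak * 0.5 * (1 + math.cos(math.pi * progress))
```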

Three-Phase Curriculum

Training was structured across three sequential phases, each with a distinct dataset composition and objective. All phases draw from the same four sources in different proportions.


Phase 1 — Personality (3 epochs · 6,000 examples)

| Source | Ratio |
|---|---|
| Yumo dataset | 65% |
| Hendrycks Math | 15% |
| MathInstruct | 15% |
| Gemini reasoning | 5% |

Objective: establish mathematical identity and conversational baseline.

Phase 2 — Mathematics (2 epochs · 6,000 examples)

| Source | Ratio |
|---|---|
| Yumo dataset | 50% |
| Hendrycks Math | 20% |
| MathInstruct | 20% |
| Gemini reasoning | 10% |

Objective: deepen domain-specific mathematical capability.

Phase 3 — Consolidation (2 epochs · 6,000 examples)

| Source | Ratio |
|---|---|
| Yumo dataset | 80% |
| Hendrycks Math | 10% |
| MathInstruct | 10% |
| Gemini reasoning | 0% |

Objective: consolidate identity and prevent capability drift.
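Each phase's 6,000 examples split across the four sources by the ratios above; the per-source counts follow directly (variable names here are ours):

```python
# Mixture weights per phase, as given in the curriculum tables
phases = {
    "phase1_personality":  {"yumo": 0.65, "hendrycks": 0.15, "mathinstruct": 0.15, "gemini": 0.05},
    "phase2_mathematics":  {"yumo": 0.50, "hendrycks": 0.20, "mathinstruct": 0.20, "gemini": 0.10},
    "phase3_consolidation": {"yumo": 0.80, "hendrycks": 0.10, "mathinstruct": 0.10, "gemini": 0.00},
}
EXAMPLES_PER_PHASE = 6000

# Number of examples drawn from each source in each phase
counts = {
    phase: {src: round(EXAMPLES_PER_PHASE * w) for src, w in mix.items()}
    for phase, mix in phases.items()
}
```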


Training loss progression:

| Phase | Loss (start → end) | Focus |
|---|---|---|
| Phase 1 | 2.97 → 0.38 | Personality establishment |
| Phase 2 | 0.42 → 0.28 | Mathematical refinement |
| Phase 3 | 0.22 → 0.18 | Consolidation |

Dataset filtering applied:

  • Hendrycks Math: Levels 1–3 only. Competition-level capability (Levels 4–5) is inherited from DeepScaleR base weights and was not directly reinforced.
  • MathInstruct: Program-of-Thought examples excluded. Patterns filtered: ```python, def solution, import sympy.
  • Gemini reasoning: Math-domain keyword filter applied. <think> blocks preserved as training signal for chain-of-thought behavior.
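The MathInstruct exclusion above is a simple substring filter. A sketch of what that pass might look like (the function names are ours; only the three marker patterns come from the card):

```python
# Patterns that mark a Program-of-Thought example, per the filtering notes
POT_MARKERS = ("```python", "def solution", "import sympy")

def is_pot_example(text: str) -> bool:
    """True if the example contains Program-of-Thought code patterns."""
    return any(marker in text for marker in POT_MARKERS)

def filter_pot(examples):
    """Drop Program-of-Thought examples, keeping natural-language reasoning."""
    return [ex for ex in examples if not is_pot_example(ex)]
```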



Available Files


| File | Format | Description |
|---|---|---|
| model.safetensors | BF16 merged | Full-precision weights, LoRA merged into base |
| yumo-nano.Q8_0.gguf | GGUF Q8_0 | Quantized for llama.cpp and Ollama |



Limitations


  • Version 0.1. Identity consolidation is approximately 70% complete. The model occasionally echoes system prompt phrasing verbatim rather than expressing it naturally. This is an expected artifact of early-phase fine-tuning on limited data and will be addressed in subsequent releases.
  • Arithmetic under sampling. Symbolic and proof-based reasoning is strong. Numerical computation under temperature above 0.5 can produce occasional arithmetic errors. Lower temperature is recommended for computation-heavy problems.
  • Context length. Trained at 2,048 tokens. Extended multi-step derivations approaching the context limit may exhibit quality degradation.
  • Hendrycks coverage. Training data was filtered to Levels 1–3. Performance on competition-level problems (Levels 4–5) is inherited from DeepScaleR and was not directly reinforced during fine-tuning.
  • Safety alignment has not been formally evaluated. Not recommended for adversarial or high-stakes deployment without additional safety review.



Yumo Model Family


| Model | Parameters | Status | Description |
|---|---|---|---|
| Yumo Nano | 1.5B | Released | Math specialist, competition-level reasoning |
| Yumo | 14B | In development | Extended capability, same curriculum |
| Yumo Pro | 32B | Planned | Full-scale flagship |



OpceanAI Ecosystem


| Model | Family | Parameters | Description |
|---|---|---|---|
| Yumo Nano | Yumo | 1.5B | Math specialist |
| YuuKi NxG VL | NxG | 7B | General conversation + vision |
| YuuKi RxG 8B | RxG | 8B | Reasoning, TruthfulQA 96.6% |



Links


Model Weights   GGUF Q8   OpceanAI


GitHub   Sponsor   Discord




Citation


@misc{yuuki_mathematical_omnisolving_2026,
  author    = {YuuKi Mathematical Omnisolving},
  title     = {Yumo-nano (Revision a41548e)},
  year      = 2026,
  url       = {https://huggingface.co/YU-MO/Yumo-nano},
  doi       = {10.57967/hf/8341},
  publisher = {Hugging Face}
}



License


Apache License 2.0

Copyright (c) 2026 OpceanAI

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Inherits license terms from DeepScaleR-1.5B-Preview.




Updates


| Date | Milestone |
|---|---|
| 2026-04-09 | Benchmark evaluation completed — surpasses DeepScaleR-1.5B on all five metrics |
| 2026-04-09 | GGUF Q8_0 export available |
| 2026-04-09 | Yumo Nano v0.1 released on Hugging Face |

Last updated: 2026-04-09




1.5B parameters. RTX 4080. Surpasses the model it was built from.


OpceanAI


The Yumo family. More releases coming.
