---
language:
- en
license: apache-2.0
base_model: HuggingFaceTB/SmolLM3-3B
tags:
- smollm
- smolreasoner
- reasoning
- instruction-tuned
- arcade
- sc-orthogonal
pipeline_tag: text-generation
---

# Arcade-3B — SmolReasoner

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.19029063.svg)](https://doi.org/10.5281/zenodo.19029063)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Base Model](https://img.shields.io/badge/Base-SmolLM3--3B-orange)](https://huggingface.co/HuggingFaceTB/SmolLM3-3B)
[![NoesisLab](https://img.shields.io/badge/Lab-NoesisLab-purple)](https://huggingface.co/NoesisLab)
[![GSM8K](https://img.shields.io/badge/GSM8K-62.9%25-brightgreen)](https://huggingface.co/NoesisLab/Arcade-3B)
[![ARC-Easy](https://img.shields.io/badge/ARC--Easy-74.4%25-brightgreen)](https://huggingface.co/NoesisLab/Arcade-3B)

**Arcade-3B** is a 3B instruction-following and reasoning model built on [SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B).
It is the first public release from the **ARCADE** project at [NoesisLab](https://huggingface.co/NoesisLab), which investigates the *State–Constraint Orthogonality Hypothesis*: standard Transformer hidden states conflate factual content and reasoning structure in the same subspace, and explicitly decoupling them improves generalization.

---

## Method: SC-Orthogonal Training

Standard Transformer hidden states conflate two distinct functions:

| Half | Symbol | Role |
|------|--------|------|
| `H[..., :D/2]` | **S** (State) | *What* the model knows — factual content |
| `H[..., D/2:]` | **C** (Constraint) | *How* to retrieve it — reasoning structure |

ARCADE's **SCOrthoTrainer** injects an orthogonality penalty on the final hidden layer, encouraging S and C to decouple in representation space without modifying any attention operators:

$$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{CE}} + \frac{\lambda}{B \cdot L} \sum_{b,l} \left( \mathbf{S}_{b,l} \cdot \mathbf{C}_{b,l} \right)^2$$

with **λ = 0.1**. This soft regularization reduces divergence errors at inference time without adding parameters or modifying the architecture.
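The penalty above can be sketched in a few lines of PyTorch. This is an illustrative reimplementation from the formula, not the actual `SCOrthoTrainer` source; the function name and signature are assumptions.

```python
import torch

def sc_orthogonality_penalty(hidden: torch.Tensor, lam: float = 0.1) -> torch.Tensor:
    """Soft penalty pushing the State and Constraint halves of the final
    hidden states toward orthogonality.

    hidden: (B, L, D) final-layer hidden states.
    Returns lam / (B * L) * sum_{b,l} (S_{b,l} . C_{b,l})^2.
    """
    B, L, D = hidden.shape
    s = hidden[..., : D // 2]   # State half: factual content
    c = hidden[..., D // 2 :]   # Constraint half: reasoning structure
    dots = (s * c).sum(dim=-1)  # per-token inner product S . C, shape (B, L)
    return lam * dots.pow(2).sum() / (B * L)
```

During training this term would simply be added to the cross-entropy loss before the backward pass.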

![SC-Orthogonal Optimization Loop](dia.jpg)

---

## Training Details

| Setting | Value |
|---------|-------|
| Base model | `HuggingFaceTB/SmolLM3-3B` |
| λ (orth penalty) | 0.1 |
| Max sequence length | 2048 |
| Learning rate | 2e-4 (cosine) |
| Steps | 10 000 |
| Effective batch | 16 sequences/step |
| Hardware | 1 × A100-80 GB |
| Precision | bfloat16 |

### Training Data

| Dataset | Split | Sampling weight |
|---------|-------|-----------------|
| [nohurry/Opus-4.6-Reasoning-3000x-filtered](https://huggingface.co/datasets/nohurry/Opus-4.6-Reasoning-3000x-filtered) | train (2.3 K) | 10 % |
| [HuggingFaceTB/smol-smoltalk](https://huggingface.co/datasets/HuggingFaceTB/smol-smoltalk) | train (460 K) | 45 % |
| [OpenDataArena/ODA-Mixture-500k](https://huggingface.co/datasets/OpenDataArena/ODA-Mixture-500k) | train (500 K) | 45 % |

Reasoning samples are wrapped with `<think></think>` tags and upsampled 10× to compensate for the small dataset size.
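The wrapping step might look like the following sketch. The exact preprocessing code and field names are assumptions; only the `<think></think>` tag convention comes from the card.

```python
def wrap_reasoning(question: str, reasoning: str, answer: str) -> dict:
    """Wrap a reasoning trace in <think></think> tags ahead of the answer.

    The chat-message schema here is illustrative, not the exact ARCADE
    preprocessing pipeline.
    """
    return {
        "messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": f"<think>{reasoning}</think>\n{answer}"},
        ]
    }
```

The 10× upsampling then amounts to repeating each wrapped sample ten times in the training mixture.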

---

## Evaluation

Results from [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness):

### Comparison with Peer Models

![Benchmark Comparison](benchmark_comparison.png)

> Scores below 10% are displayed as `<10%` in the chart and table below.

| Benchmark | Arcade-3B | Gemma-2-2B | Llama-2-7B | Qwen1.5-1.8B | OpenLLaMA-v2-3B |
|-----------|-----------|------------|------------|--------------|-----------------|
| MMLU | **52.9%** | 52.4% | 45.3% | 46.8% | 41.0% |
| GSM8K | **62.9%** | 50.9% | 14.6% | 37.8% | < 10% |
| HumanEval | **41.5%** | 32.3% | 12.8% | 27.4% | < 10% |
| ARC-Challenge | 52.6% | **53.1%** | 46.2% | 41.2% | 34.2% |
| ARC-Easy | 74.4% | **75.9%** | 75.3% | 66.8% | 68.1% |

### Arcade-3B Detailed Scores

| Benchmark | Few-shot | Metric | Score | ± |
|-----------|----------|--------|-------|---|
| GSM8K | 5 | flexible-extract / exact_match | **0.6293** | 0.0133 |
| HumanEval | 0 | pass@1 | **0.4146** | 0.0386 |
| ARC-Challenge | 25 | acc_norm | **0.5256** | 0.0146 |
| ARC-Easy | 0 | acc | **0.7437** | 0.0090 |
| MMLU | 0 | acc | **0.5293** | 0.0040 |
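A command along these lines should reproduce the GSM8K row (flags are for lm-eval v0.4+; task names and batch size may need adjusting for your setup):

```shell
pip install lm-eval
lm_eval --model hf \
  --model_args pretrained=NoesisLab/Arcade-3B,dtype=bfloat16 \
  --tasks gsm8k \
  --num_fewshot 5 \
  --batch_size 8
```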

---

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "NoesisLab/Arcade-3B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Solve step by step: If a train travels 120 km in 1.5 hours, what is its average speed?"}]
input_ids = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

output = model.generate(input_ids, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tok.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For step-by-step reasoning, the model may emit a `<think>…</think>` block before the final answer.
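If you only want the final answer, the `<think>…</think>` block can be stripped from the decoded output. A minimal helper (the function name is ours, not part of the model's API):

```python
import re

def strip_think(text: str) -> str:
    """Remove a leading <think>...</think> reasoning block, keeping the answer."""
    return re.sub(r"<think>.*?</think>\s*", "", text, flags=re.DOTALL).strip()
```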

---

## Citation

```bibtex
@misc{noesislab2025arcade,
  title        = {ARCADE: State-Constraint Orthogonal Training},
  author       = {NoesisLab},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/NoesisLab/Arcade-3B}},
}
```

---

## License

Apache 2.0 — inherited from SmolLM3-3B.