---
base_model: HuggingFaceTB/SmolLM3-3B
tags:
- smollm
- smolreasoner
- reasoning
- instruction-tuned
- arcade
pipeline_tag: text-generation
---

# Arcade-3B — SmolReasoner

[License: Apache-2.0](https://opensource.org/licenses/Apache-2.0) · [Base: SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B) · [NoesisLab](https://huggingface.co/NoesisLab) · [Arcade-3B](https://huggingface.co/NoesisLab/Arcade-3B)

**Arcade-3B** is a 3B instruction-following and reasoning model built on [SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B).
It is the first public release from the **ARCADE** project at [NoesisLab](https://huggingface.co/NoesisLab), which investigates the *State–Constraint Orthogonality Hypothesis*: standard Transformer hidden states conflate factual content and reasoning structure in the same subspace, and explicitly decoupling them improves generalization.

---

## Method: SC-Orthogonal Training

Standard Transformer hidden states conflate two distinct functions:

| Subspace | Name | Role |
|----------|------|------|
| `H[..., :D/2]` | **S** (State) | *What* the model knows — factual content |
| `H[..., D/2:]` | **C** (Constraint) | *How* to retrieve it — reasoning structure |

ARCADE's **SCOrthoTrainer** injects an orthogonality penalty on the final hidden layer, encouraging S and C to decouple in representation space without modifying any attention operators:

$$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{CE}} + \frac{\lambda}{B \cdot L} \sum_{b,l} \left( \mathbf{S}_{b,l} \cdot \mathbf{C}_{b,l} \right)^2$$

with **λ = 0.1**. This soft regularization reduces divergence errors at inference time at zero architectural cost.
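The penalty above can be sketched in plain Python. This is an illustration of the formula, not the SCOrthoTrainer implementation (the function name and list-based "tensors" are ours); each final-layer vector is split into its S and C halves, and their squared dot products are averaged over batch and sequence:

```python
def sc_orth_penalty(hidden, lam=0.1):
    """Orthogonality penalty over final hidden states.

    hidden: nested list of shape [B][L][D]. Each vector is split into
    S = h[:D/2] and C = h[D/2:], and the penalty is
    lam / (B * L) * sum over (b, l) of (S . C)^2.
    """
    B, L = len(hidden), len(hidden[0])
    total = 0.0
    for seq in hidden:
        for h in seq:
            d = len(h) // 2
            dot = sum(s * c for s, c in zip(h[:d], h[d:]))
            total += dot * dot
    return lam * total / (B * L)

# One batch, two positions, D = 4:
H = [[[1.0, 0.0, 0.0, 1.0],    # S=(1,0), C=(0,1): orthogonal, S.C = 0
      [1.0, 1.0, 1.0, 1.0]]]   # S=(1,1), C=(1,1): S.C = 2
print(sc_orth_penalty(H))      # 0.1 * (0 + 4) / 2 = 0.2
```

A fully orthogonal batch contributes zero, so the term only pushes on vectors whose S and C halves overlap.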

---

| Setting | Value |
|---------|-------|
| Base model | `HuggingFaceTB/SmolLM3-3B` |
| λ (orth penalty) | 0.1 |
| Max sequence length | 2048 |
| Learning rate | 2e-4 (cosine) |
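The `2e-4 (cosine)` row presumably means a cosine decay from the peak learning rate toward zero over training; a minimal sketch of that schedule shape (our assumption — warmup and floor, if any, omitted):

```python
import math

def cosine_lr(step, total_steps, peak_lr=2e-4):
    """Cosine decay: peak_lr at step 0, ~0 at total_steps."""
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * step / total_steps))

print(cosine_lr(0, 1000))     # 2e-4 (peak)
print(cosine_lr(500, 1000))   # 1e-4 (halfway)
print(cosine_lr(1000, 1000))  # 0.0 (end)
```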

---

For step-by-step reasoning, the model may emit a `<think>…</think>` block before the final answer.

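If you post-process raw completions, a minimal way to separate the optional reasoning block from the answer (assuming the `<think>…</think>` format above; this helper is ours, not part of the model's tooling):

```python
def split_think(text):
    """Return (thought, answer); thought is None when no think block exists."""
    start, end = "<think>", "</think>"
    if start in text and end in text:
        pre, rest = text.split(start, 1)
        thought, answer = rest.split(end, 1)
        return thought.strip(), (pre + answer).strip()
    return None, text.strip()

thought, answer = split_think("<think>2+2 is 4</think> The answer is 4.")
print(thought)  # 2+2 is 4
print(answer)   # The answer is 4.
```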
```bibtex
@misc{noesislab2025arcade,
  title = {ARCADE: State-Constraint Orthogonal Training},
  author = {NoesisLab},
  year = {2025},
  howpublished = {\url{https://huggingface.co/NoesisLab/Arcade-3B}},
}
```