OzTianlu committed
Commit b110a54 (verified) · Parent: 33536d9

Upload README.md

Files changed (1): README.md (+11 −9)
README.md CHANGED
@@ -6,7 +6,6 @@ base_model: HuggingFaceTB/SmolLM3-3B
 tags:
 - smollm
 - smolreasoner
-- lora
 - reasoning
 - instruction-tuned
 - arcade
@@ -16,12 +15,18 @@ pipeline_tag: text-generation
 
 # Arcade-3B — SmolReasoner
 
+[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
+[![Base Model](https://img.shields.io/badge/Base-SmolLM3--3B-orange)](https://huggingface.co/HuggingFaceTB/SmolLM3-3B)
+[![NoesisLab](https://img.shields.io/badge/Lab-NoesisLab-purple)](https://huggingface.co/NoesisLab)
+[![GSM8K](https://img.shields.io/badge/GSM8K-62.9%25-brightgreen)](https://huggingface.co/NoesisLab/Arcade-3B)
+[![ARC-Easy](https://img.shields.io/badge/ARC--Easy-74.4%25-brightgreen)](https://huggingface.co/NoesisLab/Arcade-3B)
+
 **Arcade-3B** is a 3B instruction-following and reasoning model built on [SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B).
-It is the first public release from the **ARCADE** project at [NoesisLab](https://huggingface.co/NoesisLab), which investigates zero-extra-parameter fine-tuning via the *State–Constraint Orthogonality Hypothesis*.
+It is the first public release from the **ARCADE** project at [NoesisLab](https://huggingface.co/NoesisLab), which investigates the *State–Constraint Orthogonality Hypothesis*: standard Transformer hidden states conflate factual content and reasoning structure in the same subspace, and explicitly decoupling them improves generalization.
 
 ---
 
-## Method: SC-Orthogonal LoRA
+## Method: SC-Orthogonal Training
 
 Standard Transformer hidden states conflate two distinct functions:
 
@@ -30,11 +35,11 @@ Standard Transformer hidden states conflate two distinct functions:
 | `H[..., :D/2]` | **S** (State) | *What* the model knows — factual content |
 | `H[..., D/2:]` | **C** (Constraint) | *How* to retrieve it — reasoning structure |
 
-ARCADE's **SCOrthoTrainer** injects an orthogonality penalty on the final hidden layer during LoRA fine-tuning, encouraging S and C to decouple in representation space without modifying any attention operators:
+ARCADE's **SCOrthoTrainer** injects an orthogonality penalty on the final hidden layer, encouraging S and C to decouple in representation space without modifying any attention operators:
 
 $$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{CE}} + \frac{\lambda}{B \cdot L} \sum_{b,l} \left( \mathbf{S}_{b,l} \cdot \mathbf{C}_{b,l} \right)^2$$
 
-with **λ = 0.1**. This "soft logic gate" reduces divergence errors at inference time at zero architectural cost.
+with **λ = 0.1**. This soft regularization reduces divergence errors at inference time at zero architectural cost.
 
 ---
 
@@ -43,9 +48,6 @@ with **λ = 0.1**. This "soft logic gate" reduces divergence errors at inferenc
 | Setting | Value |
 |---------|-------|
 | Base model | `HuggingFaceTB/SmolLM3-3B` |
-| LoRA rank / alpha | 64 / 128 |
-| LoRA target | all-linear |
-| Dropout | 0.05 |
 | λ (orth penalty) | 0.1 |
 | Max sequence length | 2048 |
 | Learning rate | 2e-4 (cosine) |
@@ -109,7 +111,7 @@ For step-by-step reasoning, the model may emit a `<think>…</think>` block befo
 
 ```bibtex
 @misc{noesislab2025arcade,
-  title = {ARCADE: State-Constraint Orthogonal LoRA Fine-Tuning},
+  title = {ARCADE: State-Constraint Orthogonal Training},
   author = {NoesisLab},
   year = {2025},
   howpublished = {\url{https://huggingface.co/NoesisLab/Arcade-3B}},
```
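
The penalty term in the loss above can be sketched in plain Python. This is an illustrative sketch only, not the repository's actual `SCOrthoTrainer` code; the helper name `sc_orth_penalty` and the nested-list input shape are assumptions for the example:

```python
def sc_orth_penalty(hidden, lam=0.1):
    """Sketch of the SC-orthogonality penalty (hypothetical helper).

    hidden: nested lists of shape (B, L, D) -- final-layer hidden states.
    Each D-dim vector is split into S (first half) and C (second half);
    the result is lam / (B * L) * sum over (b, l) of (S_{b,l} . C_{b,l})**2,
    matching the lambda / (B*L) * sum (S . C)^2 term in the loss.
    """
    total, count = 0.0, 0
    for seq in hidden:                                   # batch index b
        for vec in seq:                                  # sequence index l
            d = len(vec)
            s, c = vec[: d // 2], vec[d // 2:]           # S / C split
            dot = sum(si * ci for si, ci in zip(s, c))   # S_{b,l} . C_{b,l}
            total += dot ** 2
            count += 1
    return lam * total / count
```

A token whose S and C halves are orthogonal contributes nothing, so the penalty only pushes on positions where factual content and retrieval structure overlap; λ = 0.1 matches the training-setup table.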