---
language:
- en
license: apache-2.0
base_model: HuggingFaceTB/SmolLM3-3B
tags:
- smollm
- smolreasoner
- lora
- reasoning
- instruction-tuned
- arcade
- sc-orthogonal
pipeline_tag: text-generation
---

# Arcade-3B — SmolReasoner

**Arcade-3B** is a 3B instruction-following and reasoning model built on [SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B).
It is the first public release from the **ARCADE** project at [NoesisLab](https://huggingface.co/NoesisLab), which investigates zero-extra-parameter fine-tuning via the *State–Constraint Orthogonality Hypothesis*.

---

## Method: SC-Orthogonal LoRA

Standard Transformer hidden states conflate two distinct functions:

| Half | Symbol | Role |
|------|--------|------|
| `H[..., :D/2]` | **S** (State) | *What* the model knows — factual content |
| `H[..., D/2:]` | **C** (Constraint) | *How* to retrieve it — reasoning structure |

ARCADE's **SCOrthoTrainer** injects an orthogonality penalty on the final hidden layer during LoRA fine-tuning, encouraging S and C to decouple in representation space without modifying any attention operators:

$$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{CE}} + \frac{\lambda}{B \cdot L} \sum_{b,l} \left( \mathbf{S}_{b,l} \cdot \mathbf{C}_{b,l} \right)^2$$

with **λ = 0.1**. This "soft logic gate" reduces divergence errors at inference time at zero architectural cost.
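
The penalty itself reduces to a few lines of PyTorch. Below is a minimal sketch under the assumptions stated above (even hidden size `D`, split at `D/2`, penalty applied to the final hidden layer); the function name and integration point are illustrative, not the published `SCOrthoTrainer` internals:

```python
import torch

def sc_orth_penalty(hidden: torch.Tensor, lam: float = 0.1) -> torch.Tensor:
    """Squared S·C dot product per token, averaged over batch B and length L.

    hidden: final-layer hidden states of shape [B, L, D], with D even.
    """
    D = hidden.shape[-1]
    s = hidden[..., : D // 2]       # S: state half
    c = hidden[..., D // 2 :]       # C: constraint half
    dot = (s * c).sum(dim=-1)       # [B, L] per-token dot products
    return lam * dot.pow(2).mean()  # mean = (1 / (B*L)) * sum, matching the formula

# Hypothetical use inside a training step:
# loss = ce_loss + sc_orth_penalty(outputs.hidden_states[-1], lam=0.1)
```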

---

## Training Details

| Setting | Value |
|---------|-------|
| Base model | `HuggingFaceTB/SmolLM3-3B` |
| LoRA rank / alpha | 64 / 128 |
| LoRA target | all-linear |
| Dropout | 0.05 |
| λ (orth penalty) | 0.1 |
| Max sequence length | 2048 |
| Learning rate | 2e-4 (cosine) |
| Steps | 10,000 |
| Effective batch | 16 sequences/step |
| Hardware | 1 × A100-80 GB |
| Precision | bfloat16 |
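
For reference, the LoRA settings above map onto a `peft` configuration along these lines (a sketch assuming standard `peft` LoRA; the actual training script is not reproduced here):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,                         # LoRA rank
    lora_alpha=128,               # scaling alpha
    lora_dropout=0.05,
    target_modules="all-linear",  # adapt every linear projection
    task_type="CAUSAL_LM",
)
```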

### Training Data

| Dataset | Split | Sampling weight |
|---------|-------|-----------------|
| [nohurry/Opus-4.6-Reasoning-3000x-filtered](https://huggingface.co/datasets/nohurry/Opus-4.6-Reasoning-3000x-filtered) | train (2.3 K) | 10 % |
| [HuggingFaceTB/smol-smoltalk](https://huggingface.co/datasets/HuggingFaceTB/smol-smoltalk) | train (460 K) | 45 % |
| [OpenDataArena/ODA-Mixture-500k](https://huggingface.co/datasets/OpenDataArena/ODA-Mixture-500k) | train (500 K) | 45 % |

Reasoning samples are wrapped in `<think>…</think>` tags and upsampled 10× to compensate for the small size of the reasoning dataset.
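
As an illustration only (the preprocessing code and field names below are hypothetical), wrapping a reasoning sample might look like this:

```python
def wrap_reasoning_sample(question: str, reasoning: str, answer: str) -> dict:
    """Embed the chain of thought in <think>…</think> before the final answer."""
    assistant = f"<think>\n{reasoning}\n</think>\n\n{answer}"
    return {
        "messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": assistant},
        ]
    }
```

The 10× upsampling then amounts to repeating each wrapped sample ten times in the training mixture.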

---

## Evaluation

Results from [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness):

| Benchmark | Few-shot | Metric | Score | ± |
|-----------|----------|--------|-------|---|
| GSM8K | 5 | exact_match (flexible-extract) | **0.6293** | 0.0133 |
| HumanEval | 0 | pass@1 | **0.4146** | 0.0386 |
| ARC-Challenge | 25 | acc_norm | **0.5256** | 0.0146 |
| ARC-Easy | 0 | acc | **0.7437** | 0.0090 |
| MMLU | 0 | acc | **0.5293** | 0.0040 |
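
Scores of this kind can be reproduced with the harness's Python API, for example (a sketch; task names and metrics vary across harness releases):

```python
import lm_eval

# 5-shot GSM8K, as in the first row of the table above
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=NoesisLab/Arcade-3B,dtype=bfloat16",
    tasks=["gsm8k"],
    num_fewshot=5,
    batch_size=8,
)
print(results["results"]["gsm8k"])
```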

---

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "NoesisLab/Arcade-3B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Solve step by step: If a train travels 120 km in 1.5 hours, what is its average speed?"}]
input_ids = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

output = model.generate(input_ids, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tok.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For step-by-step reasoning, the model may emit a `<think>…</think>` block before the final answer.
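
If only the final answer is needed, the block can be stripped with a small helper (a convenience function, not part of the model API):

```python
import re

def strip_think(text: str) -> str:
    """Drop a leading <think>…</think> block, if present, and return the rest."""
    return re.sub(r"<think>.*?</think>", "", text, count=1, flags=re.DOTALL).strip()
```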

---

## Citation

```bibtex
@misc{noesislab2025arcade,
  title        = {ARCADE: State-Constraint Orthogonal LoRA Fine-Tuning},
  author       = {NoesisLab},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/NoesisLab/Arcade-3B}},
}
```

---

## License

Apache 2.0 — inherited from SmolLM3-3B.