OzTianlu 
posted an update 26 days ago
Arcade-3B — SmolReasoner
NoesisLab/Arcade-3B
Arcade-3B is a 3B instruction-following and reasoning model built on SmolLM3-3B. It is the public release from the ARCADE project at NoesisLab, which investigates the State–Constraint Orthogonality Hypothesis: standard Transformer hidden states conflate factual content and reasoning structure in the same subspace, and explicitly decoupling them improves generalization.

While exploring State–Constraint Orthogonality is an interesting direction, forcing hidden states into decoupled orthogonal subspaces is a very classic representation-learning heuristic. The core idea reads as well-trodden ground rather than a novel paradigm.
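To make the "classic heuristic" point concrete, the usual version of this idea adds a regularizer that penalizes overlap between the two projection subspaces. The sketch below is a generic, minimal illustration with hypothetical names (`W_state`, `W_constraint`); it is not taken from the ARCADE implementation, whose details are not public in this post.

```python
import numpy as np

def orthogonality_penalty(W_state, W_constraint):
    """Frobenius norm squared of the cross-Gram matrix W_s @ W_c.T.
    The penalty is zero exactly when every row of W_state is orthogonal
    to every row of W_constraint, i.e., the two projections read out
    orthogonal subspaces of the hidden state."""
    cross = W_state @ W_constraint.T
    return float(np.sum(cross ** 2))

d_model, d_sub = 16, 4

# Orthogonal by construction: the two projections read disjoint
# coordinate blocks of the hidden state, so the penalty is zero.
W_state = np.zeros((d_sub, d_model))
W_state[:, :d_sub] = np.eye(d_sub)
W_constraint = np.zeros((d_sub, d_model))
W_constraint[:, d_sub:2 * d_sub] = np.eye(d_sub)
assert orthogonality_penalty(W_state, W_constraint) == 0.0

# Random projections generally overlap, so the penalty is positive;
# in training it would be added to the task loss and minimized.
rng = np.random.default_rng(0)
W_a = rng.standard_normal((d_sub, d_model))
W_b = rng.standard_normal((d_sub, d_model))
assert orthogonality_penalty(W_a, W_b) > 0.0
```

This is essentially the soft-orthogonality constraint familiar from disentangled-representation work, which is why an ablation isolating its contribution matters so much here.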

Furthermore, the current experimental setup and baseline selection are questionable and unpersuasive. The models compared in the chart (e.g., Llama-2-7B, Qwen1.5-1.8B) are mostly older generations and serve as very weak baselines. More importantly, since Arcade-3B is built on top of SmolLM3-3B, the most crucial baseline is missing: the default Instruct version of SmolLM3-3B itself. Without rigorously controlled comparisons, it is difficult to attribute the performance gains to the proposed "orthogonal decoupling" rather than to the fine-tuning data alone.
Are there any plans to benchmark against newer, stronger models in a similar parameter class (e.g., Qwen3.5-2B-Instruct), and to provide the ablation studies needed to truly validate the effectiveness of this approach?


What an academic tone! New baselines, here.
[image: benchmark_comparison]