Initial eval-only card: RYS sweep results for SmolLM2-1.7B-Instruct (negative result; v2 corpus)
README.md
ADDED
|
@@ -0,0 +1,73 @@
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
base_model: HuggingFaceTB/SmolLM2-1.7B-Instruct
|
| 4 |
+
tags:
|
| 5 |
+
- rys
|
| 6 |
+
- layer-duplication
|
| 7 |
+
- reasoning-circuits
|
| 8 |
+
- evaluation-only
|
| 9 |
+
- cross-architecture
|
| 10 |
+
---
|
| 11 |
+
|
| 12 |
+
# SmolLM2-1.7B-Instruct — RYS evaluation
|
| 13 |
+
|
| 14 |
+
**Evaluation-only repository — contains no model weights.** This card documents the layer-duplication (RYS) sweep results for [`HuggingFaceTB/SmolLM2-1.7B-Instruct`](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct) as part of the cross-architecture v2 corpus.
|
| 15 |
+
|
| 16 |
+
**Sweep data + cross-architecture analysis:** [`john-broadway/rys-sovereign-collection-v2`](https://huggingface.co/datasets/john-broadway/rys-sovereign-collection-v2)
|
| 17 |
+
|
| 18 |
+
## Headline
|
| 19 |
+
|
| 20 |
+
First published RYS negative result: zero of 30 configurations boost reasoning >5%. Notable because sibling SmolLM2-135M and SmolLM2-360M both respond normally — the 1.7B size-point is uniquely anomalous within this family.
|
| 21 |
+
|
| 22 |
+
## Sweep results — best configuration
|
| 23 |
+
|
| 24 |
+
| Metric | Baseline | Δ at best config |
|
| 25 |
+
|---|---:|---:|
|
| 26 |
+
| Math | 0.477 | −6.19 |
|
| 27 |
+
| EQ | 77.66 | +1.09 |
|
| 28 |
+
| Reasoning | 58.82% | +0.00 |
|
| 29 |
+
|
| 30 |
+
**Best configuration:** (15,18) block-3 (best combined Δ; still negative overall)
|
| 31 |
+
**Peak reasoning Δ:** +0.00% (zero configurations boost reasoning >5%)
|
| 32 |
+
**Reasoning-boosting configurations:** 0 of 30 configs
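
The "0 of 30" count above can be reproduced from sweep records with a short tally. The following is a minimal sketch, not the toolkit's own code: the field name `reasoning_delta` is a hypothetical placeholder (the actual sweep JSONL schema may differ), and deltas are assumed to be expressed in percentage points.

```python
import json


def count_reasoning_boosts(jsonl_text, threshold=5.0, key="reasoning_delta"):
    """Return (boosting, total) for a sweep given as JSONL text.

    `key` is a hypothetical field name; adjust it to the real schema.
    A configuration counts as boosting only if its delta is strictly
    greater than `threshold`.
    """
    boosting = 0
    total = 0
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        total += 1
        if record[key] > threshold:
            boosting += 1
    return boosting, total
```

Applied to the full 30-record sweep, a result of `(0, 30)` would correspond to the negative result reported here.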
|
| 33 |
+
|
| 34 |
+
## Mechanism: falsification boundary (no recoverable suppression)
|
| 35 |
+
|
| 36 |
+
Hypothesis: heavy-synthetic training at the 1.7B scale produces uniformly capable layers, with no specialized circuits for duplication to amplify. The sibling SmolLM2-135M and SmolLM2-360M variants both respond normally (+17.65% and +23.53% reasoning lift, respectively), so the 1.7B size-point is uniquely anomalous within this family. Caveat: the 1.7B sweep used a narrower block-size search ([3,4]) than the other sweeps. A re-sweep over the full block-size range would show whether the narrower search caused the result or whether it is a genuine size-specific anomaly.
|
| 37 |
+
|
| 38 |
+
## Position in the v2 curve
|
| 39 |
+
|
| 40 |
+
Across the **21 model variants** spanning 10 architecture families in the v2 corpus, Pearson r(baseline reasoning, peak reasoning Δ) = **−0.726**. Weak baselines lift more, in their weakest dimension. SmolLM2-1.7B-Instruct's placement on the curve: baseline reasoning 58.82%, peak Δ +0.00% (zero configurations boost reasoning >5%).
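
For readers who want to recompute the correlation from the dataset, the statistic is the ordinary sample Pearson coefficient over (baseline reasoning, peak reasoning Δ) pairs, one pair per model variant. The sketch below is a plain-Python version written for illustration; it is an assumption that the published value was computed this way, and the function name is not taken from the toolkit.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and squares; the 1/(n-1) factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A negative value, as reported above, means that variants with higher baseline reasoning tend to show smaller peak lifts.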
|
| 41 |
+
|
| 42 |
+
## RYS method
|
| 43 |
+
|
| 44 |
+
RYS ("Repeat Your Self") duplicates a contiguous block of transformer layers so hidden states pass through the same circuit twice. **No training, no weight changes, no merging.**
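
The duplication described above can be illustrated as a layer execution order. This is a minimal sketch under stated assumptions: a configuration `(start, end)` is read as a half-open range, so `(15, 18)` duplicates the three layers 15, 16, 17 (consistent with the "block-3" label used in this card), and the layer count of 24 in the usage line is an illustrative value. The function name is hypothetical and is not part of the linked toolkit.

```python
def rys_layer_order(num_layers, start, end):
    """Return the layer indices visited when block [start, end) is repeated.

    The forward pass runs layers 0..end-1, then re-enters at `start` and
    runs through the last layer, so the block start..end-1 executes twice.
    No weights are copied or changed; only the visiting order differs.
    """
    if not (0 <= start < end <= num_layers):
        raise ValueError("require 0 <= start < end <= num_layers")
    return list(range(0, end)) + list(range(start, num_layers))


# Illustrative usage: a 24-layer stack (assumed) with block (15, 18).
order = rys_layer_order(24, 15, 18)
```

Under these assumptions `order` has 27 entries, three more than the original stack, with indices 15, 16, 17 appearing twice in a row.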
|
| 45 |
+
|
| 46 |
+
- Original RYS method: [David Ng](https://dnhkng.github.io/posts/rys/)
|
| 47 |
+
- Sweep + probe toolkit: [`alainnothere/llm-circuit-finder`](https://github.com/alainnothere/llm-circuit-finder)
|
| 48 |
+
- Hardware used for these sweeps: NVIDIA DGX Spark (GB10)
|
| 49 |
+
|
| 50 |
+
## No weights here — why
|
| 51 |
+
|
| 52 |
+
Negative-result publication is the point here — SmolLM2-1.7B is a falsification data point, not a deployment candidate. The card documents the **evaluation evidence** (sweep JSONL, cross-architecture position, the sibling-checks against 135M and 360M that respond normally). No RYS-modified weights would change the conclusion: the 1.7B size-point in this family does not respond to RYS.
|
| 53 |
+
|
| 54 |
+
For context on what IS deployable in the collection, see the v1 cross-scale (Qwen2.5 + Qwen3-32B) and v2 Qwen3-family cohort linked in the [v2 dataset card](https://huggingface.co/datasets/john-broadway/rys-sovereign-collection-v2).
|
| 55 |
+
|
| 56 |
+
## Citation
|
| 57 |
+
|
| 58 |
+
```bibtex
|
| 59 |
+
@misc{broadway2026rys_v2,
|
| 60 |
+
author = {Broadway, John and {Claude (Opus 4.6, 4.7)}},
|
| 61 |
+
title = {RYS Sovereign Collection v2: Cross-architecture Sweep Corpus},
|
| 62 |
+
year = {2026},
|
| 63 |
+
url = {https://huggingface.co/datasets/john-broadway/rys-sovereign-collection-v2}
|
| 64 |
+
}
|
| 65 |
+
```
|
| 66 |
+
|
| 67 |
+
## Attribution
|
| 68 |
+
|
| 69 |
+
John Broadway, with collaboration from Claude (Opus 4.6 in April 2026 sweep generation; Opus 4.7 in May 2026 cross-architecture analysis and publication). Original RYS method by [David Ng](https://dnhkng.github.io/posts/rys/) on Qwen2-72B; sweep + probe toolkit by [alainnothere](https://github.com/alainnothere/llm-circuit-finder).
|
| 70 |
+
|
| 71 |
+
## License
|
| 72 |
+
|
| 73 |
+
This evaluation card content: MIT. The base model `HuggingFaceTB/SmolLM2-1.7B-Instruct` retains its original license.
|