---
license: apache-2.0
base_model: HuggingFaceTB/SmolLM2-1.7B-Instruct
tags:
  - rys
  - layer-duplication
  - reasoning-circuits
  - evaluation-only
  - cross-architecture
---

SmolLM2-1.7B-Instruct – RYS evaluation

Evaluation-only repository: it contains no model weights. This card documents the layer-duplication (RYS) sweep results for HuggingFaceTB/SmolLM2-1.7B-Instruct as part of the cross-architecture v2 corpus.

Sweep data + cross-architecture analysis: john-broadway/rys-sovereign-collection-v2
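
For readers who want the raw sweep records, a minimal sketch of pulling the per-model file from that dataset repo. The filename `sweeps/smollm2-1.7b-instruct.jsonl` is an assumption about the repo layout, not a documented path:

```python
# Hedged sketch: download the sweep JSONL from the v2 dataset repo.
# The filename is an assumed layout, not confirmed by this card.
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="john-broadway/rys-sovereign-collection-v2",
    repo_type="dataset",
    filename="sweeps/smollm2-1.7b-instruct.jsonl",  # assumed path
)
with open(path) as f:
    sweep = [json.loads(line) for line in f]
print(len(sweep), "configurations")  # expected: 30
```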

Headline

First published RYS negative result: zero of 30 configurations boost reasoning >5%. Notable because the siblings SmolLM2-135M and SmolLM2-360M both respond normally; the 1.7B size-point is uniquely anomalous within this family.

Sweep results – best configuration

| Metric    | Baseline | Δ at best config |
|-----------|----------|------------------|
| Math      | 0.477    | −6.19            |
| EQ        | 77.66    | +1.09            |
| Reasoning | 58.82%   | +0.00            |

Best configuration: (15, 18), block size 3 (best combined Δ; still negative overall)
Peak reasoning Δ: +0.00% (zero configurations boost reasoning >5%)
Reasoning-boosting configurations: 0 of 30

Mechanism: falsification boundary (no recoverable suppression)

Hypothesis: SmolLM2's heavy synthetic-data training at the 1.7B scale produces uniformly capable layers, with no specialized circuits for duplication to amplify. The sibling SmolLM2-135M and SmolLM2-360M variants both respond normally (+17.65% and +23.53% reasoning lift), leaving the 1.7B size-point uniquely anomalous within this family. Caveat: the 1.7B sweep used a narrower block-size search ([3, 4]) than the other sweeps; a re-sweep with the full block range would resolve whether the narrower search caused the result or whether it is a genuine size-specific anomaly.
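
To make the caveat concrete, a hedged sketch of what a (start, end) configuration grid looks like. The 24-layer count comes from the base model's published config; how the actual sweep pruned candidates down to its 30 evaluated configurations is not documented here, so this enumeration is illustrative only:

```python
# Illustrative enumeration of (start, end) duplication candidates; NOT the
# sweep toolkit's code. SmolLM2-1.7B has 24 decoder layers; the pruning
# that yields the sweep's 30 evaluated configs is not documented here.
def block_configs(num_layers: int, block_sizes: list[int]):
    """Yield (start, end) pairs meaning: duplicate layers [start, end)."""
    for size in block_sizes:
        for start in range(num_layers - size + 1):
            yield (start, start + size)

narrow = list(block_configs(24, [3, 4]))  # the 1.7B sweep's narrower search
assert (15, 18) in narrow                 # the best configuration above
```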

Position in the v2 curve

Across the 21 model variants spanning 10 architecture families in the v2 corpus, Pearson r(baseline reasoning, peak reasoning Δ) = −0.726: weaker baselines lift more, and they lift most in their weakest dimension. SmolLM2-1.7B-Instruct's placement on the curve: baseline reasoning 58.82%, peak Δ +0.00% (zero configurations boost reasoning >5%).
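
A hedged sketch of recomputing that curve statistic from the corpus. The summary filename and the field names (`baseline_reasoning`, `peak_reasoning_delta`) are assumptions about the JSONL schema, not documented column names:

```python
# Hedged sketch: recompute the v2 curve correlation from sweep summaries.
# Filename and field names are assumed, not confirmed by this card.
import json
import numpy as np

with open("v2_sweep_summary.jsonl") as f:            # assumed filename
    records = [json.loads(line) for line in f]       # 21 model variants

x = np.array([r["baseline_reasoning"] for r in records])
y = np.array([r["peak_reasoning_delta"] for r in records])

r = np.corrcoef(x, y)[0, 1]   # Pearson r; the corpus reports -0.726
print(f"r(baseline reasoning, peak reasoning delta) = {r:+.3f}")
```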

RYS method

RYS ("Repeat Your Self") duplicates a contiguous block of transformer layers so hidden states pass through the same circuit twice. No training, no weight changes, no merging.

No weights here – why

Negative-result publication is the point: SmolLM2-1.7B is a falsification data point, not a deployment candidate. The card documents the evaluation evidence (sweep JSONL, cross-architecture position, and the sibling checks against 135M and 360M, which respond normally). Publishing RYS-modified weights would not change the conclusion: the 1.7B size-point in this family does not respond to RYS.

Sibling repos (both respond normally to RYS, anchoring the 1.7B as a uniquely anomalous size-point): john-broadway/SmolLM2-135M-RYS-18-22-GGUF (+17.65% reasoning, +13.05 EQ) and john-broadway/SmolLM2-360M-RYS-12-15-GGUF (+23.53% reasoning, full block search).

For context on what is deployable in the collection, see the v1 cross-scale results (Qwen2.5 + Qwen3-32B) and the v2 Qwen3-family cohort linked in the v2 dataset card.

Citation

@misc{broadway2026rys_v2,
  author = {Broadway, John and {Claude (Opus 4.6, 4.7)}},
  title  = {RYS Sovereign Collection v2: Cross-architecture Sweep Corpus},
  year   = {2026},
  url    = {https://huggingface.co/datasets/john-broadway/rys-sovereign-collection-v2}
}

Attribution

John Broadway, with collaboration from Claude (Opus 4.6 in April 2026 sweep generation; Opus 4.7 in May 2026 cross-architecture analysis and publication). Original RYS method by David Ng on Qwen2-72B; sweep + probe toolkit by alainnothere.

License

This evaluation card content: MIT. The base model HuggingFaceTB/SmolLM2-1.7B-Instruct retains its original license.