Upload folder using huggingface_hub
- app.py +35 -12
- static_figures/fig_data_efficiency.png +0 -0
- static_figures/fig_undersized.png +0 -0
app.py
CHANGED
|
@@ -327,21 +327,44 @@ a small auxiliary vocabulary (e.g. 30 tokens) inserted at regular intervals (eve
|
|
| 327 |
detail_btn = gr.Button("Show splits")
|
| 328 |
detail_table = gr.Dataframe(headers=["Split", "Accuracy", "N"], interactive=False)
|
| 329 |
|
| 330 |
-
# -- Tab 2:
|
| 331 |
-
with gr.TabItem("
|
| 332 |
-
gr.Markdown("""##
|
| 333 |
|
| 334 |
-
|
| 335 |
-
|
| 336 |
|
| 337 |
-
|
| 338 |
-
|
| 339 |
-
|
| 340 |
-
**0** (no carry), **1** (definite carry), or **U**
|
| 341 |
-
|
| 342 |
|
| 343 |
-
**SoRL makes these circuits visible as explicit tokens.** We analyze
|
| 344 |
-
|
| 345 |
""")
|
| 346 |
|
| 347 |
gr.Markdown("""### 1. Token specialization by difficulty
|
| 327 |
detail_btn = gr.Button("Show splits")
|
| 328 |
detail_table = gr.Dataframe(headers=["Split", "Accuracy", "N"], interactive=False)
|
| 329 |
|
| 330 |
+
# -- Tab 2: Results --
|
| 331 |
+
with gr.TabItem("Results"):
|
| 332 |
+
gr.Markdown("""## SoRL K=1 abs30: never loses to baseline
|
| 333 |
+
|
| 334 |
+
Our best config, **K=1 (abstraction at every position), vocab size 30**, matches or beats
|
| 335 |
+
the SFT baseline on every data size and every architecture tested. No exceptions.
|
| 336 |
+
""")
|
| 337 |
+
gr.Image("static_figures/fig_data_efficiency.png")
|
| 338 |
+
|
| 339 |
+
gr.Markdown("""**At 10K training examples**, SoRL K=1 abs30 reaches **96.1%** while the baseline
|
| 340 |
+
reaches only 72.4%, a **+24 percentage point** improvement. At 25K, SoRL hits 100% while the
|
| 341 |
+
baseline is at 91.6%. By 50K both reach 100%.
|
| 342 |
+
|
| 343 |
+
K=4 (abstraction every 4th position) fails at 10K data: it doesn't have enough examples to learn
|
| 344 |
+
useful abstractions through search. K=1 is more data-efficient because every position gets a
|
| 345 |
+
scratchpad token.
|
| 346 |
+
""")
|
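The K=1 versus K=4 distinction described in the Markdown above can be illustrated with a minimal sketch. It assumes one abstraction slot is placed after every K-th input token; the actual placement in the SoRL code may differ, and `ABS`, `interleave_abstractions`, and `count_slots` are hypothetical names used only for illustration.

```python
from typing import List

ABS = "<abs>"  # hypothetical placeholder for one abstraction-token slot


def interleave_abstractions(tokens: List[str], k: int) -> List[str]:
    """Insert one abstraction slot after every k-th input token (assumed layout)."""
    if k < 1:
        raise ValueError("k must be >= 1")
    out: List[str] = []
    for i, tok in enumerate(tokens, start=1):
        out.append(tok)
        if i % k == 0:
            out.append(ABS)
    return out


def count_slots(n_tokens: int, k: int) -> int:
    """Number of abstraction slots a sequence of n_tokens receives."""
    return n_tokens // k


if __name__ == "__main__":
    digits = list("12345678")
    print(interleave_abstractions(digits, 1))  # a slot after every digit
    print(interleave_abstractions(digits, 4))  # a slot after every 4th digit
```

Under this assumed layout, an 8-token sequence gets 8 slots at K=1 and 2 at K=4, which is one way to read the claim that K=1 gives every position a scratchpad token.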
| 347 |
+
|
| 348 |
+
gr.Markdown("### SoRL helps undersized models the most")
|
| 349 |
+
gr.Image("static_figures/fig_undersized.png")
|
| 350 |
|
| 351 |
+
gr.Markdown("""The biggest gains are on **capacity-limited architectures**. A 2L/1H/128d model
|
| 352 |
+
goes from 50% (baseline) to **85%** (SoRL K=1 abs30), a +35pp improvement. The abstraction tokens
|
| 353 |
+
effectively give the model external memory that compensates for its limited hidden dimensions.
|
| 354 |
+
""")
|
| 355 |
+
|
| 356 |
+
# -- Tab 3: Interpretability --
|
| 357 |
+
with gr.TabItem("Interpretability"):
|
| 358 |
+
gr.Markdown("""## Do abstraction tokens encode carry/borrow circuits?
|
| 359 |
|
| 360 |
+
When humans add multi-digit numbers, they track **carries**: if 7+8=15, they write 5 and carry 1.
|
| 361 |
+
[Quirke et al. (2024)](https://arxiv.org/abs/2402.02619) showed transformers learn carry/borrow
|
| 362 |
+
circuits internally, discoverable only through activation-level analysis. They define a **tri-state
|
| 363 |
+
carry classifier** (eq. 2): each position is **0** (no carry), **1** (definite carry), or **U**
|
| 364 |
+
(uncertain: digit sum = 9, carry depends on cascade from right).
|
| 365 |
|
| 366 |
+
**SoRL makes these circuits visible as explicit tokens.** We analyze our K=4 abs30 model (which
|
| 367 |
+
has fewer abstraction positions, forcing sharper specialization).
|
| 368 |
""")
|
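The tri-state labelling described in the Markdown above can be sketched in a few lines. This follows only the description given here (digit sum of 10 or more is a definite carry, exactly 9 is uncertain, anything lower is no carry) and has not been checked against the exact formulation in Quirke et al.; the function names `tri_state` and `resolve` are hypothetical.

```python
from typing import List


def tri_state(a: int, b: int, n_digits: int) -> List[str]:
    """Per-position carry state for a + b, least-significant digit first."""
    states: List[str] = []
    for _ in range(n_digits):
        s = a % 10 + b % 10
        # >= 10: carry regardless of input carry; == 9: carry only if one arrives
        states.append("1" if s >= 10 else "U" if s == 9 else "0")
        a //= 10
        b //= 10
    return states


def resolve(states: List[str]) -> List[int]:
    """Resolve each U by propagating the carry from the digit to its right."""
    carry_in = 0
    resolved: List[int] = []
    for st in states:
        carry_out = carry_in if st == "U" else int(st)
        resolved.append(carry_out)
        carry_in = carry_out
    return resolved


if __name__ == "__main__":
    states = tri_state(157, 348, 3)
    print(states, resolve(states))
```

For 157 + 348, the units digit (7+8) is a definite carry, the tens digit (5+4) is uncertain and resolves to a carry because of the cascade from the right, and the hundreds digit (1+3) produces none.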
| 369 |
|
| 370 |
gr.Markdown("""### 1. Token specialization by difficulty
|
static_figures/fig_data_efficiency.png
ADDED
|
static_figures/fig_undersized.png
ADDED
|