amirali1985 committed on
Commit
ffa73b4
·
1 Parent(s): 4a57a46

LaTeX tab: collapse to 2 blocks; expand ablations with per-split table and commentary

Files changed (1)
  1. app.py +86 -45
app.py CHANGED
@@ -329,9 +329,22 @@ LATEX_TABLE_UNDERSIZED = r"""% tab:undersized-wins --- SoRL vs SFT on undersized
329
  \end{table}
330
  """
331
 
332
- LATEX_APPENDIX = r"""% ── Appendix: Arithmetic interpretability --- Findings #2–5 ──────────────────
333
- % Requires: tcolorbox, booktabs, xcolor, multirow
334
- % Put \providecommand{\sorl}{\textsc{DLR}} and \providecommand{\sft}{\textsc{SFT}} in preamble.
335
 
336
  % ── Finding box macro (put in preamble) ──────────────────────────────────────
337
  % \usepackage{tcolorbox}
@@ -357,53 +370,90 @@ Experimental code: \texttt{arithmetic/experiments/} (all results reproducible).
357
  \label{app:causal}
358
 
359
  To confirm that \sorl{} tokens are causally necessary (not merely correlated
360
- with correct outputs), we run three intervention conditions:
 
361
 
362
  \begin{itemize}[nosep]
363
- \item \textbf{Knockout}: replace every abstraction token with a single
364
- fixed \texttt{[UNK]} embedding.
365
  \item \textbf{Shuffle}: randomly permute all abstraction tokens within
366
- each sequence (preserving the token identities but breaking positional
367
- assignment).
368
- \item \textbf{Random}: replace each abstraction token with a token drawn
369
- uniformly at random from the codebook.
 
370
  \end{itemize}
371
 
 
 
372
  \begin{table}[h]
373
  \centering\small
374
- \begin{tabular}{lrr}
375
  \toprule
376
- Condition & Accuracy & Drop \\
377
  \midrule
378
- Baseline (correct tokens) & 95.5\% & --- \\
379
- Shuffle & 26.6\% & $-68.9$\,pp \\
380
- Random & 12.3\% & $-83.2$\,pp \\
381
- Knockout & 0.1\% & $-95.4$\,pp \\
382
  \bottomrule
383
  \end{tabular}
384
- \caption{Causal ablation on \texttt{2L/1H/128d} (2{,}600 examples).
385
- Even shuffling --- which preserves the token distribution but breaks
386
- positional assignment --- collapses accuracy by 69\,pp.}
387
- \label{tab:causal-ablation}
 
388
  \end{table}
389
 
390
- The 95\,pp knockout drop confirms that the model has offloaded the
391
- arithmetic computation onto its abstraction tokens: without them,
392
- it cannot solve the task. Shuffle being worse than random may seem
393
- counterintuitive; it reflects that wrong tokens at specific positions
394
- (e.g.\ a cascade position receiving a simple-add token) cause systematic
395
- one-off errors, while random tokens sometimes happen to match.
396
 
397
  \begin{tcolorbox}[colback=gray!6, colframe=gray!40,
398
  fonttitle=\bfseries\small, title={Finding \#2},
399
  left=5pt, right=5pt, top=4pt, bottom=4pt]
400
  \small
401
- \sorl{} abstraction tokens are causally necessary for correct six-digit
402
- arithmetic: replacing all tokens with a fixed mask reduces accuracy from
403
- 95.5\% to 0.1\%, and shuffling tokens within a sequence (preserving
404
- token identity but destroying positional assignment) drops accuracy
405
- by 69\,pp. The model has externalised the carry/borrow computation
406
- into its routing tokens.
 
407
  \end{tcolorbox}
408
 
409
  % ─────────────────────────────────────────────────────────────────────────────
@@ -1205,23 +1255,14 @@ hidden activations --- but here it is readable directly from the token sequence.
1205
  # ── Tab 3: LaTeX Scratchpad ──
1206
  with gr.TabItem("LaTeX"):
1207
  gr.Markdown("### LaTeX Scratchpad - copy sections directly into Overleaf")
1208
- gr.Markdown("Each block is a self-contained chunk. Click the copy icon top-right of each box.")
1209
 
1210
- gr.Markdown("#### § Arithmetic setup + Quirke subtask definitions")
1211
  gr.Code(value=LATEX_ARITHMETIC_SETUP, label="arithmetic_setup.tex",
1212
  language=None, interactive=False)
1213
 
1214
- gr.Markdown("#### § Results table - SoRL vs baseline on undersized architectures")
1215
- gr.Markdown("Requires: `\\usepackage{booktabs}`, `\\usepackage{xcolor}`.")
1216
- gr.Code(value=LATEX_TABLE_UNDERSIZED, label="tab_undersized_wins.tex",
1217
- language=None, interactive=False)
1218
-
1219
- gr.Markdown("#### § Carry-cascade example figure (TikZ)")
1220
- gr.Markdown("Requires: `\\usepackage{tikz}`, `\\usetikzlibrary{matrix}`, `\\usepackage{xcolor}`, and `\\providecommand{\\sorl}{\\textsc{DLR}}`.")
1221
- gr.Code(value=LATEX_FIGURE_EXAMPLE, label="fig_arithmetic_example.tex",
1222
- language=None, interactive=False)
1223
-
1224
- gr.Markdown("#### § Appendix - full experimental details")
1225
  gr.Code(value=LATEX_APPENDIX, label="appendix_arithmetic.tex",
1226
  language=None, interactive=False)
1227
 
 
329
  \end{table}
330
  """
331
 
332
+ LATEX_APPENDIX = r"""% ═══════════════════════════════════════════════════════════════════════════
333
+ % APPENDIX --- Arithmetic case study (Findings #2–5 + training details)
334
+ % ═══════════════════════════════════════════════════════════════════════════
335
+ % Paste this entire block into your appendix .tex file.
336
+ %
337
+ % Required packages:
338
+ % \usepackage{tcolorbox}
339
+ % \usepackage{booktabs}
340
+ % \usepackage{xcolor}
341
+ % \usepackage{multirow}
342
+ % \usepackage{enumitem} % for [nosep]
343
+ %
344
+ % Required macros (put in preamble if not already defined):
345
+ % \providecommand{\sorl}{\textsc{DLR}}
346
+ % \providecommand{\sft}{\textsc{SFT}}
347
+ % ═══════════════════════════════════════════════════════════════════════════
348
 
349
  % ── Finding box macro (put in preamble) ──────────────────────────────────────
350
  % \usepackage{tcolorbox}
 
370
  \label{app:causal}
371
 
372
  To confirm that \sorl{} tokens are causally necessary (not merely correlated
373
+ with correct outputs), we run three intervention conditions on model
374
+ \texttt{2L/1H/128d} (100K training examples), evaluated on 2{,}600 held-out problems:
375
 
376
  \begin{itemize}[nosep]
 
 
377
  \item \textbf{Shuffle}: randomly permute all abstraction tokens within
378
+ each sequence (token identities preserved; positional assignment destroyed).
379
+ \item \textbf{Random}: replace each token with a draw uniform over the
380
+ 30-token codebook (identity and position both destroyed).
381
+ \item \textbf{Knockout}: replace every token with a fixed \texttt{[UNK]}
382
+ embedding (strongest intervention; removes all information).
383
  \end{itemize}
384
 
385
+ Table~\ref{tab:ablation-splits} shows per-split accuracy under each condition.
386
+
387
  \begin{table}[h]
388
  \centering\small
389
+ \begin{tabular}{llrrrr}
390
  \toprule
391
+ Family & Split & Baseline & Shuffle & Random & Knockout \\
392
+ \midrule
393
+ \multirow{4}{*}{\textit{Addition (easy)}} & S0 (no carry) & 100\% & 24\% & 28\% & 0\% \\
394
+ & S1 & 100\% & 17\% & 9\% & 0\% \\
395
+ & S2 & 100\% & 22\% & 10\% & 0\% \\
396
+ & random & 100\% & 26\% & 8\% & 0\% \\
397
+ \midrule
398
+ \multirow{4}{*}{\textit{Addition cascade (hard)}} & C3 (3-deep) & 96\% & 28\% & 14\% & 0\% \\
399
+ & C4 (4-deep) & 99\% & 25\% & 13\% & 0\% \\
400
+ & C5 (5-deep) & 99\% & 23\% & 19\% & 0\% \\
401
+ & C6 (6-deep) & 97\% & 27\% & 15\% & 0\% \\
402
+ \midrule
403
+ \multirow{1}{*}{\textit{Subtraction (easy)}} & random & 100\% & 46\% & 12\% & 0\% \\
404
+ \midrule
405
+ \multirow{3}{*}{\textit{Subtraction cascade (hard)}} & M3 (3-deep borrow) & 100\% & 22\% & 1\% & 0\% \\
406
+ & M4 (4-deep borrow) & 85\% & 6\% & 0\% & 0\% \\
407
+ & M5 (5-deep borrow) & 57\% & 3\% & 0\% & 2\% \\
408
  \midrule
409
+ \multicolumn{2}{l}{\textbf{Overall}} & \textbf{95.5\%} & 26.6\% & 12.3\% & 0.1\% \\
 
 
 
410
  \bottomrule
411
  \end{tabular}
412
+ \caption{Per-split causal ablation (\texttt{2L/1H/128d}, 100K training examples).
413
+ \textbf{Shuffle} preserves token identity but destroys positional assignment;
414
+ \textbf{Random} destroys both; \textbf{Knockout} removes all token information.
415
+ Generated by \texttt{paper/results/result\_ablation\_splits/run.py}.}
416
+ \label{tab:ablation-splits}
417
  \end{table}
418
 
419
+ \paragraph{Commentary.}
420
+ Knockout reduces accuracy to $\leq$2\% on every split, confirming that
421
+ the model has offloaded computation into the routing tokens.
422
+ Three patterns are notable:
423
+
424
+ \begin{itemize}[nosep]
425
+ \item \textbf{Shuffle $\approx$ Random on the no-carry split.}
426
+ On addition S0 (no carry), shuffle yields 24\% vs.\ random's 28\%;
427
+ the gap is small and reflects that no single position is critical:
428
+ any token in any position is roughly as bad as another.
429
+ \item \textbf{Shuffle $>$ Random on cascade splits (C3--C6).}
430
+ On 3--6-deep carry cascades, shuffle (23--28\%) consistently
431
+ outperforms random (13--19\%).
432
+ When tokens are shuffled, a cascade position receives a token
433
+ from another cascade position --- a wrong token but from the
434
+ ``right family'', producing a systematic one-off carry error.
435
+ Random tokens provide no structural signal at all, making
436
+ cascade resolution impossible.
437
+ \item \textbf{Borrow cascades are uniquely sensitive.}
438
+ Sub-M4 (4-deep borrow) drops from 85\% baseline to 6\% under
439
+ shuffle and 0\% under random --- a 79\,pp collapse from shuffle
440
+ alone. Sub-M5 (5-deep borrow) is hardest even at baseline (57\%),
441
+ and all ablations reduce it to $\leq$3\%, showing that deep
442
+ borrow cascades are the single hardest regime and that \sorl{}
443
+ tokens are essential for solving them.
444
+ \end{itemize}
445
 
446
  \begin{tcolorbox}[colback=gray!6, colframe=gray!40,
447
  fonttitle=\bfseries\small, title={Finding \#2},
448
  left=5pt, right=5pt, top=4pt, bottom=4pt]
449
  \small
450
+ \sorl{} abstraction tokens are causally necessary: knockout collapses
451
+ accuracy from 95.5\% to 0.1\% overall, and to $\leq$2\% on the hardest
452
+ borrow-cascade splits (M4--M5).
453
+ Shuffle (identity-preserving, position-destroying) is less harmful than
454
+ random on cascade splits: wrong-position tokens from the same structural
455
+ family cause systematic carry errors, while random tokens cause broader
456
+ incoherence.
457
  \end{tcolorbox}
458
 
459
  % ─────────────────────────────────────────────────────────────────────────────
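The three intervention conditions described in the appendix text can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in `arithmetic/experiments/`: abstraction tokens are modelled as plain integer ids, the codebook size of 30 is taken from the appendix text, and `UNK_ID`, the function names, and the example sequence are hypothetical.

```python
import random
from collections import Counter

CODEBOOK_SIZE = 30  # codebook size stated in the appendix text
UNK_ID = -1         # hypothetical id standing in for the fixed [UNK] embedding


def shuffle_tokens(tokens, rng):
    """Shuffle: permute tokens within one sequence (identities kept, positions lost)."""
    permuted = list(tokens)
    rng.shuffle(permuted)
    return permuted


def random_tokens(tokens, rng):
    """Random: redraw every token uniformly from the codebook."""
    return [rng.randrange(CODEBOOK_SIZE) for _ in tokens]


def knockout_tokens(tokens):
    """Knockout: replace every token with the same placeholder id."""
    return [UNK_ID] * len(tokens)


rng = random.Random(0)
original = [3, 17, 17, 8, 29, 0]  # made-up abstraction-token ids for one sequence
shuffled = shuffle_tokens(original, rng)
randomized = random_tokens(original, rng)
knocked = knockout_tokens(original)
```

Shuffle keeps the multiset of token ids and changes only their order, which is why it isolates the effect of positional assignment; the other two conditions also discard identity.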
 
1255
  # ── Tab 3: LaTeX Scratchpad ──
1256
  with gr.TabItem("LaTeX"):
1257
  gr.Markdown("### LaTeX Scratchpad - copy sections directly into Overleaf")
1258
+ gr.Markdown("Two blocks: **main body** (setup + Finding #1 + forward ref) and **appendix** (Findings #2–5, all tables, training details). Copy each into the appropriate place in your paper.")
1259
 
1260
+ gr.Markdown("#### § Main body - §Arithmetic setup (paste into paper body)")
1261
  gr.Code(value=LATEX_ARITHMETIC_SETUP, label="arithmetic_setup.tex",
1262
  language=None, interactive=False)
1263
 
1264
+ gr.Markdown("#### § Appendix - full analysis (paste into appendix)")
1265
+ gr.Markdown("Includes: per-split ablation table, token profiles, guided computation, Quirke analogy, training details. Requires `tcolorbox`, `booktabs`, `xcolor`, `multirow`, `enumitem`.")
1266
  gr.Code(value=LATEX_APPENDIX, label="appendix_arithmetic.tex",
1267
  language=None, interactive=False)
1268
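As a sanity check on the commentary in the appendix block, the per-split numbers from the ablation table can be compared programmatically. The sketch below copies the accuracies from the table as integers; the short split names and variable names are shorthand introduced here, not identifiers from `run.py`.

```python
# (split, baseline, shuffle, random, knockout) accuracy in percent, copied from the table
ROWS = [
    ("add S0", 100, 24, 28, 0),
    ("add S1", 100, 17, 9, 0),
    ("add S2", 100, 22, 10, 0),
    ("add random", 100, 26, 8, 0),
    ("add C3", 96, 28, 14, 0),
    ("add C4", 99, 25, 13, 0),
    ("add C5", 99, 23, 19, 0),
    ("add C6", 97, 27, 15, 0),
    ("sub random", 100, 46, 12, 0),
    ("sub M3", 100, 22, 1, 0),
    ("sub M4", 85, 6, 0, 0),
    ("sub M5", 57, 3, 0, 2),
]

# Percentage-point drop from baseline under each intervention, per split
drops = {
    name: {"shuffle": base - shuf, "random": base - rand, "knockout": base - ko}
    for name, base, shuf, rand, ko in ROWS
}

# Splits where shuffle does NOT beat random (shuffle accuracy <= random accuracy)
exceptions = [name for name, _, shuf, rand, _ in ROWS if shuf <= rand]

# Highest accuracy that survives knockout on any split
max_knockout = max(ko for *_, ko in ROWS)

# Accuracy ranges on the carry-cascade splits C3..C6
cascade = [row for row in ROWS if row[0].startswith("add C")]
shuffle_range = (min(r[2] for r in cascade), max(r[2] for r in cascade))
random_range = (min(r[3] for r in cascade), max(r[3] for r in cascade))
```

On these numbers, shuffle retains more accuracy than random on every split except addition S0, knockout never leaves more than 2\% accuracy, and the shuffle drop on Sub-M4 is 79 pp.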