<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Second Loop · Project hub</title>
<link rel="preconnect" href="https://fonts.googleapis.com" />
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
<link href="https://fonts.googleapis.com/css2?family=Playfair+Display:ital,wght@1,700&family=Inter:wght@400;500;700;800&family=JetBrains+Mono:wght@500;700&display=swap" rel="stylesheet" />
<style>
:root{
--bg:#000000; --bg-card:#0A0A0A;
--border:#1F1F1F; --border-strong:#2A2A2A;
--text:#FFFFFF; --text-mute:#A8A8A8; --text-dim:#6B6B6B;
--gold:#D4AF37; --gold-hi:#E8C84A;
--indigo:#6366F1; --purple:#A855F7; --green:#1FD160;
--mono:'JetBrains Mono',ui-monospace,monospace;
--serif:'Playfair Display',serif;
--sans:'Inter',system-ui,sans-serif;
}
*{box-sizing:border-box;}
html,body{margin:0;background:var(--bg);color:var(--text);font-family:var(--sans);}
.wrap{max-width:1080px;margin:0 auto;padding:22px 24px 64px 24px;}
a{color:var(--gold);text-decoration:none;}
/* header */
.head{border:1px solid var(--border-strong);border-radius:14px;padding:28px 32px 24px 32px;
background:linear-gradient(135deg,#0A0A0A 0%,#0B0820 60%,#120A22 100%);}
.head-top{display:flex;align-items:flex-start;justify-content:space-between;gap:24px;
padding-bottom:18px;border-bottom:1px solid var(--border);margin-bottom:16px;}
.head-brand{display:flex;align-items:center;gap:16px;}
.head-icon{width:54px;height:54px;border:1px solid var(--border-strong);border-radius:13px;
display:flex;align-items:center;justify-content:center;font-size:28px;
background:radial-gradient(60% 60% at 50% 40%,#1a1530 0%,#050505 100%);}
.head-title{font-family:var(--serif);font-style:italic;font-weight:700;font-size:38px;line-height:1;margin:2px 0 7px 0;}
.head-subtitle{font-family:var(--mono);font-size:11.5px;letter-spacing:.16em;color:var(--text-mute);text-transform:uppercase;}
.head-right{text-align:right;white-space:nowrap;}
.submitted-label{font-family:var(--mono);font-size:10px;letter-spacing:.22em;color:var(--text-dim);text-transform:uppercase;display:block;margin-bottom:4px;}
.submitted-name{font-family:var(--serif);font-style:italic;font-weight:700;font-size:21px;}
.status-pill{display:inline-flex;align-items:center;gap:6px;margin-top:10px;padding:5px 12px;border-radius:999px;
background:rgba(31,209,96,.08);border:1px solid rgba(31,209,96,.5);
font-family:var(--mono);font-size:10px;letter-spacing:.18em;color:var(--green);text-transform:uppercase;}
.status-dot{width:7px;height:7px;border-radius:50%;background:var(--green);}
.head-tag{text-align:center;margin:14px 0 16px 0;font-family:var(--mono);font-size:12px;letter-spacing:.24em;
color:var(--purple);text-transform:uppercase;}
.head-meta{display:grid;grid-template-columns:repeat(3,1fr);gap:12px 24px;}
.head-meta .k{font-family:var(--mono);font-size:9.5px;letter-spacing:.2em;color:var(--text-dim);text-transform:uppercase;}
.head-meta .v{font-family:var(--sans);font-size:13.5px;font-weight:700;}
/* arc */
.arc{margin:18px 0;border:1px solid var(--border-strong);border-radius:14px;background:var(--bg-card);padding:22px 26px;}
.arc .t{font-family:var(--mono);font-size:10px;letter-spacing:.2em;color:var(--text-dim);text-transform:uppercase;margin-bottom:10px;}
.arc .body{font-family:var(--sans);font-size:14.5px;line-height:1.6;color:var(--text-mute);}
.arc .body b{color:var(--text);}
/* cards */
.cards{display:grid;grid-template-columns:1fr;gap:14px;}
@media(min-width:780px){.cards{grid-template-columns:1fr 1fr;}}
.card{border:1px solid var(--border-strong);border-radius:14px;background:var(--bg-card);
padding:22px 24px;display:flex;flex-direction:column;transition:border-color 140ms ease,transform 140ms ease;
background-image:linear-gradient(135deg,rgba(99,102,241,.05),rgba(168,85,247,.05));}
.card:hover{border-color:var(--gold);transform:translateY(-2px);}
.card .num{font-family:var(--mono);font-size:11px;letter-spacing:.2em;color:var(--purple);font-weight:700;margin-bottom:6px;}
.card .name{font-family:var(--serif);font-style:italic;font-weight:700;font-size:23px;margin:0 0 8px 0;}
.card .hook{font-family:var(--sans);font-size:14px;line-height:1.5;color:var(--text-mute);flex:1;}
.card .key{align-self:flex-start;margin:14px 0 16px 0;padding:5px 12px;border-radius:999px;
font-family:var(--mono);font-size:10px;letter-spacing:.1em;text-transform:uppercase;font-weight:700;
color:var(--green);background:rgba(31,209,96,.08);border:1px solid rgba(31,209,96,.4);}
.card .links{display:flex;gap:10px;}
.card .links a{flex:1;text-align:center;padding:10px 12px;border-radius:999px;
font-family:var(--mono);font-size:10.5px;letter-spacing:.12em;text-transform:uppercase;font-weight:700;
border:1px solid var(--border-strong);transition:border-color 120ms ease,background 120ms ease;}
.card .links a.space{color:var(--text);background:rgba(168,85,247,.12);border-color:rgba(168,85,247,.5);}
.card .links a.space:hover{border-color:var(--purple);background:rgba(168,85,247,.22);}
.card .links a.gh{color:var(--text-mute);}
.card .links a.gh:hover{border-color:var(--gold);color:var(--text);}
/* card 3 spans full width on wide screens, centered look */
.card.solo{grid-column:1 / -1;}
/* about */
.foot{margin-top:20px;padding:22px 24px;border:1px solid var(--border);border-radius:12px;background:var(--bg-card);}
.foot .ftitle{font-family:var(--mono);font-size:10px;letter-spacing:.2em;color:var(--text-dim);text-transform:uppercase;margin-bottom:10px;}
.foot .b{font-family:var(--sans);font-size:13px;color:var(--text-mute);line-height:1.62;}
.foot .b a{border-bottom:1px dotted var(--gold);}
.attrib{margin-top:14px;padding-top:14px;border-top:1px solid var(--border);
font-family:var(--mono);font-size:11px;letter-spacing:.04em;color:var(--text-dim);line-height:1.7;}
@media(max-width:760px){.head-top{flex-direction:column;}.head-right{text-align:left;}.head-meta{grid-template-columns:1fr;}}
</style>
</head>
<body>
<main class="wrap">
<header class="head">
<div class="head-top">
<div class="head-brand">
<div class="head-icon">🔁</div>
<div>
<div class="head-title">Second Loop</div>
<div class="head-subtitle">Honesty &amp; self-correction in language models</div>
</div>
</div>
<div class="head-right">
<span class="submitted-label">Submitted by</span>
<div class="submitted-name">Serghei Brinza</div>
<div class="status-pill"><span class="status-dot"></span>Static · project hub</div>
</div>
</div>
<div class="head-tag">★ Three experiments · one arc ★</div>
<div class="head-meta">
<div><div class="k">Subject model</div><div class="v">Qwen2.5-3B-Instruct (frozen)</div></div>
<div><div class="k">Independent judge</div><div class="v">Qwen2.5-7B-Instruct</div></div>
<div><div class="k">License</div><div class="v">MIT</div></div>
</div>
</header>
<section class="arc">
<div class="t">The arc</div>
<div class="body">
Three demo Spaces, one through-line. <b>(1)</b> Can a confidently memorized error in a frozen
LLM be <b>durably corrected</b>? <b>(2)</b> Can that correction <b>survive a noisy notebook</b>
whose external entries are partly unreliable? <b>(3)</b> How <b>rarely can external truth
arrive</b> before calibration collapses back to the raw model? Every linked Space is fully
static: no model is loaded, and every number is a verbatim output of the original experimental run.
</div>
</section>
<section class="cards">
<div class="card">
<div class="num">Part 1</div>
<div class="name">Scar-Survival</div>
<div class="hook">A memorized LLM error, corrected — how durable is the fix? Turn the mechanism
on, reload the frozen model, then stress it with a counterfeit fact.</div>
<div class="key">0/12 → 12/12 · holds 10/10 reloads · 6/12 survive</div>
<div class="links">
<a class="space" href="https://huggingface.co/spaces/Laborator/scar-survival" target="_blank" rel="noopener">Open Space ↗</a>
<a class="gh" href="https://github.com/SergheiBrinza/scar-survival" target="_blank" rel="noopener">GitHub ↗</a>
</div>
</div>
<div class="card">
<div class="num">Part 2</div>
<div class="name">External Grounding</div>
<div class="hook">Lifting self-correction from 50% to 100% under a noisy notebook. Drag the
guardian through six versions and watch which traps get fixed — and which regress.</div>
<div class="key">50% → 100% · 66.7% plateau · +fixed / −broken</div>
<div class="links">
<a class="space" href="https://huggingface.co/spaces/Laborator/external-grounding" target="_blank" rel="noopener">Open Space ↗</a>
<a class="gh" href="https://github.com/SergheiBrinza/external-grounding" target="_blank" rel="noopener">GitHub ↗</a>
</div>
</div>
<div class="card solo">
<div class="num">Part 3</div>
<div class="name">Thin Channel</div>
<div class="hook">How rarely external truth can arrive before calibration collapses. Move the
lever from “every day” to “never” and watch the curve hold — until the cliff.</div>
<div class="key">finite schedule holds · zero contact collapses to raw 3B</div>
<div class="links">
<a class="space" href="https://huggingface.co/spaces/Laborator/thin-channel" target="_blank" rel="noopener">Open Space ↗</a>
<a class="gh" href="https://github.com/SergheiBrinza/thin-channel" target="_blank" rel="noopener">GitHub ↗</a>
</div>
</div>
</section>
<footer class="foot">
<div class="ftitle">About</div>
<div class="b">
Second Loop is independent research on whether a language model can be made to correct its own
confident mistakes and stay honest under pressure. The three Spaces above are interactive,
static visualizations of the experiments — no live model runs in any of them, and every value
shown is a verbatim output of the original run, bundled as data alongside each page. Source
code, raw per-run results, and methodology live in the GitHub repository linked on each card.
</div>
<div class="attrib">
Subject model Qwen2.5-3B-Instruct · independent judge Qwen2.5-7B-Instruct (both Apache-2.0,
Alibaba Cloud). Run on a single RTX 3090. No model weights are redistributed here. Demo code: MIT.
</div>
</footer>
</main>
</body>
</html>