--- title: 2 Β· External Grounding emoji: πŸ” colorFrom: indigo colorTo: purple sdk: static app_file: index.html pinned: true license: mit short_description: Lifting LLM self-correction 50%β†’100% under a noisy notebook --- # External Grounding β€” interactive demo Interactive visualization of Experiment 2–3 (the *guardian*) of the [Second Loop](https://github.com/SergheiBrinza/external-grounding) project. This Space loads **no model**. Everything is a static page driven by `data.json` β€” the verbatim output of the original experimental run. ## The exhibit A frozen Qwen2.5-3B-Instruct has a confidently memorized **wrong** answer to twelve questions, and its correction notebook is fed from a **noisy** source (some verified facts, some unreliable look-alikes). Drag the lever through six guardian versions and watch the share of correct answers climb: | stage | guardian | corrected | |---|---|---| | sick | no defense | 50.0% Β· 6/12 | | 1.0 | same-family clone arbiter | 66.7% Β· 8/12 | | 2.0 | live Wikipedia retrieval | 66.7% Β· 8/12 | | 2.1 | more retrieval | 66.7% Β· 8/12 | | 2.2 | three targeted fixes | 91.7% Β· 11/12 | | 2.3 | final calibration | 100% Β· 12/12 | ## What the numbers say (the honest middle) - **The 66.7% plateau is real.** Three different guardians (1.0, 2.0, 2.1) all stop at the same ceiling. Guardian 1.0's clone arbiter shares the subject's blind spots. - **The plateau is not stagnation β€” it's churn.** Each step fixes some traps while breaking others (the readout shows `+fixed / βˆ’broken`); net change is zero across the plateau. - **Several traps regress before they settle.** Venus (#46) goes `correct β†’ wrong β†’ correct β†’ wrong β†’ wrong β†’ correct` across the six stages β€” the path to 100% is not monotonic, and that is shown openly, not smoothed over. Only Guardian 2.2 (verbatim-quote check, namesake relevance gate, soft threshold) breaks the ceiling at 91.7%, and Guardian 2.3 (calibration) closes it at 100%. An independent Qwen2.5-7B reader/judge with Wikipedia adjudicated the v2 stages. ## Data and attribution Subject model **Qwen2.5-3B-Instruct**; arbiters **Qwen2.5-7B-Instruct** (same-family clone) and **Wikipedia retrieval + 7B reader/judge** (both Apache-2.0, Alibaba Cloud). Wikipedia content Β© its authors (CC BY-SA). Run on a single RTX 3090. No model weights are redistributed here β€” only aggregate verdicts and counts. Demo code and data: MIT. Source code, raw per-stage JSON results, and methodology document: