Session runbook for mechanism/baselines jobs
SESSION_RUNBOOK.md (added, +87 -0)
# OCC Collapse Mechanism — Runbook
## Session: 2026-05-11

### JOBS RUNNING (5 total, all session e95fd6cc)

| Job ID | Hardware | Script | Timeout | Status |
|--------|----------|--------|---------|--------|
| `6a0236d6317220dbbd1a7c07` | H200 | `occ_debate_collapse_mechanism_v3.py` | 24h | RUNNING |
| `6a0236d6aff1cd33e8f33ee6` | a10g-large | `occ_cheap_baselines.py` | 6h | RUNNING |
| `6a0236d6317220dbbd1a7c09` | a10g-large | `occ_strong_baselines.py` | 6h | RUNNING |
| `6a022292aff1cd33e8f33ded` | a10g-large | `occ_strong_baselines.py` (older) | 6h | RUNNING |
| `6a022033317220dbbd1a7b8c` | a10g-large | `occ_cheap_baselines.py` (older) | 6h | RUNNING |

### DO NOT SUBMIT NEW JOBS until these complete. Session ID e95fd6cc is shared; submitting a new job WILL cancel all running jobs.

### Data locations (in the `narcolepticchicken/occ-stack` repo on the Hub)

| File | Produced by |
|------|-------------|
| `reports/debate_collapse_mechanism_results.json` | Mechanism v3 (pushes incrementally after each condition) |
| `reports/cheap_baselines_results.json` | Cheap baselines |
| `reports/strong_baselines_results.json` | Strong baselines |
| `reports/debate_extended_baselines_2seed.json` | Pre-existing v2 data (88.3% → 56.7% collapse) |

### When mechanism data arrives, run:

```bash
# 1. Download results
python -c "
import shutil
from huggingface_hub import hf_hub_download

# hf_hub_download returns the path of the cached file; copy it into the working directory
p = hf_hub_download('narcolepticchicken/occ-stack', 'reports/debate_collapse_mechanism_results.json')
shutil.copy(p, './debate_collapse_mechanism_results.json')
"

# 2. Run the analysis harness (v2.1; handles both v2 and v3 result formats)
python jobs/analyze_collapse.py debate_collapse_mechanism_results.json

# 3. Outputs are written to reports/analysis/:
#    - condition_summary.csv
#    - per_topic_outcomes.csv
#    - round_flip_matrix.csv
#    - hypothesis_verdicts.json
#    - fig_accuracy_by_condition.png
#    - fig_honest_retention.png
#    - fig_flip_rate.png
#    - fig_adversary_skill.png
```

### Then fill in the v13 memo:

```bash
# Fill the {VALUE} placeholders in reports/v13_mechanism_memo.md
# Values come from reports/analysis/condition_summary.csv and reports/analysis/hypothesis_verdicts.json
```
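
The fill step could be scripted along the following lines. This is only a sketch: it assumes the memo uses named tokens in braces (for example `{BASELINE_ACC}`), and the placeholder names below are hypothetical, not taken from `reports/v13_mechanism_memo.md`. Placeholders with no supplied value are left in place and reported, so nothing is filled in silently.

```python
import re

# Assumption: placeholders look like {UPPER_CASE_NAME}; adjust the pattern to the real memo.
PLACEHOLDER = re.compile(r"\{([A-Z0-9_]+)\}")


def fill_placeholders(template, values):
    """Replace {NAME} tokens that have a value; return the text and the unfilled names."""
    missing = []

    def substitute(match):
        name = match.group(1)
        if name in values:
            return str(values[name])
        if name not in missing:
            missing.append(name)
        return match.group(0)  # leave unknown placeholders visible in the memo

    return PLACEHOLDER.sub(substitute, template), missing


# Hypothetical placeholder names; the two numbers are the pilot figures from this runbook.
template = "1-round: {BASELINE_ACC}%, 3-round: {EQUAL_3ROUND_ACC}%, H1: {H1_VERDICT}"
filled, missing = fill_placeholders(
    template, {"BASELINE_ACC": "88.3", "EQUAL_3ROUND_ACC": "56.7"}
)
print(filled)   # 1-round: 88.3%, 3-round: 56.7%, H1: {H1_VERDICT}
print(missing)  # ['H1_VERDICT']
```

In practice the `values` dict would be built from `condition_summary.csv` and `hypothesis_verdicts.json`; their column and key names are not documented here, so that mapping is left out.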

### Infrastructure

- Analysis harness: `jobs/analyze_collapse.py` (v2.1; handles both the `per_seed` and `seeds` keys)
- v13 memo template: `reports/v13_mechanism_memo.md`
- All scripts: `narcolepticchicken/occ-stack` on the Hub

### Pre-registered hypotheses

The analysis harness evaluates these automatically, using the rules in its `HYPOTHESIS_RULES` dict:

| Hypothesis | Test (conditions compared or metric) |
|-----------|-----------|
| H1: Volume amplification | `equal_token_unequal_turn` vs `baseline_1round_traced` |
| H2: Turn-order effect | `randomized_order_3round` vs `equal_3round_traced` |
| H3: Voting vulnerability | `judge_vote` and `confidence_weighted` vs `equal_3round_traced` |
| H4: Contamination | Honest retention rate at round 3 |
| H5: Confidence distortion | `confidence_weighted` vs `equal_3round_traced` |
| H6: Skill dependency | Weak vs normal vs strong vs oracle adversary |
| H7: Topic vulnerability | Per-topic variance in collapse |
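
Because the mechanism job pushes results incrementally, it can help to check which hypotheses already have all the conditions they need before running the harness. The sketch below is not the harness's `HYPOTHESIS_RULES`; `REQUIRED_CONDITIONS` and `readiness` are hypothetical names, and only the condition identifiers are copied from the table above. How the available conditions are read from the results JSON is left out, since that layout is not described here.

```python
# Condition names copied from the hypothesis table. H4, H6 and H7 are omitted
# because the table does not give specific condition identifiers for them.
REQUIRED_CONDITIONS = {
    "H1": {"equal_token_unequal_turn", "baseline_1round_traced"},
    "H2": {"randomized_order_3round", "equal_3round_traced"},
    "H3": {"judge_vote", "confidence_weighted", "equal_3round_traced"},
    "H5": {"confidence_weighted", "equal_3round_traced"},
}


def readiness(available):
    """Map each hypothesis to the sorted list of conditions that are still missing."""
    present = set(available)
    return {h: sorted(req - present) for h, req in REQUIRED_CONDITIONS.items()}


# Example: three conditions have been pushed so far.
available = ["baseline_1round_traced", "equal_token_unequal_turn", "equal_3round_traced"]
for hypothesis, missing in readiness(available).items():
    status = "ready" if not missing else "waiting on " + ", ".join(missing)
    print(f"{hypothesis}: {status}")
```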

### Expected results (from 2-seed pilot)

- 1-round baseline: 88.3% accuracy
- 3-round equal: 56.7% accuracy (a 31.6 pp drop, roughly 32 pp)
- Random 25% drop: 85.0% accuracy with 26.5% token savings
- OCC credit: prevents catastrophe but doesn't beat random gating at moderate budgets
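
The gaps between these pilot figures can be checked with a few lines of arithmetic. This sketch assumes that the random-drop accuracy is meant to be compared against the 1-round baseline; the variable names are illustrative only.

```python
# Pilot accuracies in percent, as listed above.
baseline_1round = 88.3
equal_3round = 56.7
random_drop_25 = 85.0

# Absolute gaps in percentage points, rounded to one decimal place.
collapse_pp = round(baseline_1round - equal_3round, 1)
random_drop_cost_pp = round(baseline_1round - random_drop_25, 1)

print(collapse_pp)          # 31.6
print(random_drop_cost_pp)  # 3.3
```

On these numbers, the random 25% drop gives up about 3.3 pp of accuracy in exchange for the 26.5% token savings, while the 3-round equal condition loses 31.6 pp.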

### Key fix: judge_vote_3round

v2 returned 0/30 because `extract_position()` only checked first-line prefixes.
v3 uses `extract_judge_answer()`, which matches the regex `\b(yes|no)\b` and uses the last occurrence as the tiebreaker.
The judge prompt now asks "Based on the debate, the correct answer is: ", and the judge generates 32 tokens at temperature 0.1.
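
A minimal sketch of the difference between the two extraction strategies, under assumptions: the real `extract_position()` and `extract_judge_answer()` live in the job scripts and are not reproduced here, `first_line_prefix` is a hypothetical stand-in for the v2 behaviour, and "last-occurrence tiebreaker" is read as "when both words appear, the last one wins".

```python
import re

# Whole-word, case-insensitive match, so "not" and "know" are ignored.
YES_NO = re.compile(r"\b(yes|no)\b", re.IGNORECASE)


def extract_judge_answer(text):
    """Return 'yes' or 'no' from the last whole-word match, or None if neither appears."""
    matches = YES_NO.findall(text)
    return matches[-1].lower() if matches else None


def first_line_prefix(text):
    """Hypothetical v2-style check: only accept an answer that starts the first line."""
    stripped = text.strip()
    first_line = stripped.splitlines()[0].lower() if stripped else ""
    for label in ("yes", "no"):
        if first_line.startswith(label):
            return label
    return None


judge_output = "Having weighed both sides, yes."
print(first_line_prefix(judge_output))     # None
print(extract_judge_answer(judge_output))  # yes
print(extract_judge_answer("No at first glance, but on balance yes."))  # yes
```

A prefix check fails whenever the judge opens with reasoning rather than the answer itself, which is consistent with the 0/30 result; the regex approach tolerates that at the cost of trusting the final yes/no in the output.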