# OCC Collapse Mechanism — Runbook
## Session: 2026-05-11

### JOBS RUNNING (5 total, all session e95fd6cc)

| Job ID | Hardware | Script | Timeout | Status |
|--------|----------|--------|---------|--------|
| `6a0236d6317220dbbd1a7c07` | H200 | occ_debate_collapse_mechanism_v3.py | 24h | RUNNING |
| `6a0236d6aff1cd33e8f33ee6` | a10g-large | occ_cheap_baselines.py | 6h | RUNNING |
| `6a0236d6317220dbbd1a7c09` | a10g-large | occ_strong_baselines.py | 6h | RUNNING |
| `6a022292aff1cd33e8f33ded` | a10g-large | occ_strong_baselines.py (older) | 6h | RUNNING |
| `6a022033317220dbbd1a7b8c` | a10g-large | occ_cheap_baselines.py (older) | 6h | RUNNING |

### DO NOT SUBMIT NEW JOBS until these complete. Session ID e95fd6cc is shared: submitting a new job WILL cancel all running jobs.

### Data locations (on narcolepticchicken/occ-stack Hub)

| File | Produced by |
|------|-------------|
| `reports/debate_collapse_mechanism_results.json` | Mechanism v3 (pushes incrementally after each condition) |
| `reports/cheap_baselines_results.json` | Cheap baselines |
| `reports/strong_baselines_results.json` | Strong baselines |
| `reports/debate_extended_baselines_2seed.json` | Pre-existing v2 data (88.3% → 56.7% collapse) |

### When mechanism data arrives, run:

```bash
# 1. Download results from the Hub and copy them into the working directory
python -c "
import shutil
from huggingface_hub import hf_hub_download
p = hf_hub_download('narcolepticchicken/occ-stack', 'reports/debate_collapse_mechanism_results.json')
shutil.copy(p, './debate_collapse_mechanism_results.json')
"

# 2. Run the analysis harness (v2.1, handles v2+v3 formats)
python jobs/analyze_collapse.py debate_collapse_mechanism_results.json

# 3. Outputs: reports/analysis/
#    - condition_summary.csv
#    - per_topic_outcomes.csv
#    - round_flip_matrix.csv
#    - hypothesis_verdicts.json
#    - fig_accuracy_by_condition.png
#    - fig_honest_retention.png
#    - fig_flip_rate.png
#    - fig_adversary_skill.png
```
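Before moving on to the memo, it can help to confirm that the harness wrote every file listed above. The following is a minimal sketch, assuming the outputs land in `reports/analysis/` relative to the working directory. The helper name `missing_outputs` is hypothetical and is not part of `jobs/analyze_collapse.py`.

```python
from pathlib import Path

# File names copied from the output list in step 3.
EXPECTED_OUTPUTS = [
    "condition_summary.csv",
    "per_topic_outcomes.csv",
    "round_flip_matrix.csv",
    "hypothesis_verdicts.json",
    "fig_accuracy_by_condition.png",
    "fig_honest_retention.png",
    "fig_flip_rate.png",
    "fig_adversary_skill.png",
]


def missing_outputs(analysis_dir):
    """Return the expected files that are absent or empty in analysis_dir."""
    root = Path(analysis_dir)
    missing = []
    for name in EXPECTED_OUTPUTS:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed location; adjust if the harness writes elsewhere.
    gaps = missing_outputs("reports/analysis")
    print("all outputs present" if not gaps else f"missing or empty: {gaps}")
```

An empty file is treated as missing because a zero-byte CSV or PNG usually indicates the harness stopped partway through.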

### Then fill in the v13 memo:

```bash
# Fill {VALUE} placeholders in reports/v13_mechanism_memo.md
# Data comes from: reports/analysis/condition_summary.csv + hypothesis_verdicts.json
```
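The fill step could be scripted rather than done by hand. The sketch below assumes the template uses named, upper-case placeholders in braces, such as `{BASELINE_ACC}`. That name and the helper `fill_placeholders` are hypothetical, and the real placeholder names in `reports/v13_mechanism_memo.md` may differ. Placeholders with no supplied value are left in place and reported, so gaps remain visible in the memo.

```python
import re

# Assumption: placeholders look like {NAME} with upper-case letters, digits, underscores.
PLACEHOLDER = re.compile(r"\{([A-Z][A-Z0-9_]*)\}")


def fill_placeholders(template, values):
    """Substitute {NAME} tokens; return (filled_text, names_left_unfilled)."""
    unfilled = []

    def replace(match):
        name = match.group(1)
        if name in values:
            return str(values[name])
        unfilled.append(name)
        return match.group(0)  # leave unknown placeholders untouched

    return PLACEHOLDER.sub(replace, template), unfilled
```

In use, the values would be read from `reports/analysis/condition_summary.csv` and `hypothesis_verdicts.json`, and the memo would be written back only once the list of unfilled names is empty.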

### Infrastructure

- Analysis harness: `jobs/analyze_collapse.py` (v2.1; handles both `per_seed` and `seeds` keys)
- v13 memo template: `reports/v13_mechanism_memo.md`
- All scripts: `narcolepticchicken/occ-stack` on Hub

### Pre-registered hypotheses

Evaluated automatically by the analysis harness using the rules in its `HYPOTHESIS_RULES` dict:

| Hypothesis | Test (conditions compared or metric) |
|-----------|-----------|
| H1: Volume amplification | equal_token_unequal_turn vs baseline_1round_traced |
| H2: Turn-order effect | randomized_order_3round vs equal_3round_traced |
| H3: Voting vulnerability | judge_vote + confidence_weighted vs equal_3round_traced |
| H4: Contamination | Honest retention rate round 3 |
| H5: Confidence distortion | confidence_weighted vs equal_3round_traced |
| H6: Skill dependency | weak vs normal vs strong vs oracle adversary |
| H7: Topic vulnerability | Per-topic variance in collapse |

### Expected results (from 2-seed pilot)

- 1-round baseline: 88.3% accuracy
- 3-round equal: 56.7% accuracy (about 32 pp collapse)
- Random 25% drop: 85.0% accuracy with 26.5% token savings
- OCC credit: prevents catastrophe but doesn't beat random gating at moderate budgets

### Key fix: judge_vote_3round

v2 returned 0/30 because `extract_position()` checked only first-line prefixes.
v3 uses `extract_judge_answer()`, which matches the regex `\b(yes|no)\b` and breaks ties by taking the last occurrence.
The judge prompt now asks "Based on the debate, the correct answer is: ", and the judge generates 32 tokens at temperature 0.1.
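
The v3 extraction rule can be illustrated with a short sketch. This is not the code from `occ_debate_collapse_mechanism_v3.py`: the function name `extract_last_yes_no` is hypothetical, and case-insensitive matching and returning `None` when nothing matches are both assumptions.

```python
import re

# Same pattern as described above; IGNORECASE is an assumption.
ANSWER = re.compile(r"\b(yes|no)\b", re.IGNORECASE)


def extract_last_yes_no(text):
    """Return the last standalone 'yes' or 'no' in text, lower-cased, or None."""
    matches = ANSWER.findall(text)
    return matches[-1].lower() if matches else None
```

Taking the last match means a response that restates the opposing view before concluding is scored on its conclusion. The word boundaries keep substrings such as the "no" in "nothing" from counting as an answer.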