	# OCC Collapse Mechanism — Runbook
	## Session: 2026-05-11

	### JOBS RUNNING (5 total, all session e95fd6cc)

	| Job ID | Hardware | Script | Timeout | Status |
	|--------|----------|--------|---------|--------|
	| `6a0236d6317220dbbd1a7c07` | H200 | occ_debate_collapse_mechanism_v3.py | 24h | RUNNING |
	| `6a0236d6aff1cd33e8f33ee6` | a10g-large | occ_cheap_baselines.py | 6h | RUNNING |
	| `6a0236d6317220dbbd1a7c09` | a10g-large | occ_strong_baselines.py | 6h | RUNNING |
	| `6a022292aff1cd33e8f33ded` | a10g-large | occ_strong_baselines.py (older) | 6h | RUNNING |
	| `6a022033317220dbbd1a7b8c` | a10g-large | occ_cheap_baselines.py (older) | 6h | RUNNING |

	### DO NOT SUBMIT NEW JOBS until these complete. Session ID e95fd6cc is shared, so submitting a new job WILL cancel all running jobs.

	### Data locations (on narcolepticchicken/occ-stack Hub)

	| File | Produced by |
	|------|-------------|
	| `reports/debate_collapse_mechanism_results.json` | Mechanism v3 (pushes incrementally after each condition) |
	| `reports/cheap_baselines_results.json` | Cheap baselines |
	| `reports/strong_baselines_results.json` | Strong baselines |
	| `reports/debate_extended_baselines_2seed.json` | Pre-existing v2 data (88.3% → 56.7% collapse) |

	### When mechanism data arrives, run:

	```bash
	# 1. Download results
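	python -c "
	import shutil
	from huggingface_hub import hf_hub_download
	# hf_hub_download returns the local path of the cached file
	p = hf_hub_download('narcolepticchicken/occ-stack', 'reports/debate_collapse_mechanism_results.json')
	# Copy it into the working directory for the analysis harness
	shutil.copy(p, './debate_collapse_mechanism_results.json')
	"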

	# 2. Run the analysis harness (v2.1, handles v2+v3 formats)
	python jobs/analyze_collapse.py debate_collapse_mechanism_results.json

	# 3. Outputs: reports/analysis/
	# - condition_summary.csv
	# - per_topic_outcomes.csv
	# - round_flip_matrix.csv
	# - hypothesis_verdicts.json
	# - fig_accuracy_by_condition.png
	# - fig_honest_retention.png
	# - fig_flip_rate.png
	# - fig_adversary_skill.png
	```

	### Then fill in the v13 memo:

	```bash
	# Fill {VALUE} placeholders in reports/v13_mechanism_memo.md
	# Data comes from: reports/analysis/condition_summary.csv + hypothesis_verdicts.json
	```
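
The fill step can be scripted. The following is a minimal sketch, assuming the memo marks each slot with an uppercase name in braces. The runbook only says `{VALUE}`, so the pattern, the helper `fill_placeholders`, and the key names are all hypothetical. Unknown placeholders are left untouched so that gaps stay visible in the memo.

```python
import re

# Hypothetical placeholder syntax: an uppercase name in braces, e.g. {BASELINE_ACC}.
_PLACEHOLDER_RE = re.compile(r"\{([A-Z0-9_]+)\}")


def fill_placeholders(template: str, values: dict) -> str:
    """Replace each {NAME} whose NAME is in `values`; leave the rest as-is."""

    def _substitute(match: re.Match) -> str:
        name = match.group(1)
        if name in values:
            return str(values[name])
        return match.group(0)  # keep unknown placeholders visible

    return _PLACEHOLDER_RE.sub(_substitute, template)


# Illustrative template line and key name (not taken from the real memo).
line = "1-round accuracy: {BASELINE_ACC}%; flip rate: {FLIP_RATE}"
print(fill_placeholders(line, {"BASELINE_ACC": 88.3}))
```

In practice the `values` dict would be built from `reports/analysis/condition_summary.csv` and `hypothesis_verdicts.json`, whose column and key names are not specified here.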

	### Infrastructure

	- Analysis harness: `jobs/analyze_collapse.py` (v2.1; handles both `per_seed` and `seeds` keys)
	- v13 memo template: `reports/v13_mechanism_memo.md`
	- All scripts: `narcolepticchicken/occ-stack` on Hub
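
The harness note mentions two key spellings for per-seed results. Below is a minimal sketch of how such normalization could look, assuming each condition record is a dict that stores its per-seed data under either `per_seed` or `seeds`. The helper name and the toy records are illustrative and are not taken from `analyze_collapse.py`.

```python
from typing import Any


def get_seed_results(condition: dict) -> Any:
    """Return the per-seed data of a condition record, whichever key it uses."""
    for key in ("per_seed", "seeds"):
        if key in condition:
            return condition[key]
    raise KeyError("condition record has neither 'per_seed' nor 'seeds'")


# Toy records, made up for illustration only.
record_a = {"seeds": [{"seed": 0}, {"seed": 1}]}
record_b = {"per_seed": [{"seed": 0}, {"seed": 1}]}
assert get_seed_results(record_a) == get_seed_results(record_b)
```

Raising on a missing key, rather than returning an empty value, makes an unexpected result format fail loudly instead of producing an empty summary.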

	### Pre-registered hypotheses

	Evaluated automatically by the analysis harness, using the rules in its `HYPOTHESIS_RULES` dict:

	| Hypothesis | Test (conditions compared or metric) |
	|-----------|-----------|
	| H1: Volume amplification | equal_token_unequal_turn vs baseline_1round_traced |
	| H2: Turn-order effect | randomized_order_3round vs equal_3round_traced |
	| H3: Voting vulnerability | judge_vote + confidence_weighted vs equal_3round_traced |
	| H4: Contamination | Honest retention rate round 3 |
	| H5: Confidence distortion | confidence_weighted vs equal_3round_traced |
	| H6: Skill dependency | weak vs normal vs strong vs oracle adversary |
	| H7: Topic vulnerability | Per-topic variance in collapse |

	### Expected results (from 2-seed pilot)

	- 1-round baseline: 88.3% accuracy
	- 3-round equal: 56.7% accuracy (31.6 pp collapse)
	- Random 25% drop: 85.0% accuracy with 26.5% token savings
	- OCC credit: prevents catastrophe but doesn't beat random gating at moderate budgets
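
The size of the collapse follows directly from the two pilot accuracies. This small sketch computes it both in percentage points and relative to the 1-round baseline; `collapse_stats` is a hypothetical helper, not part of the harness.

```python
def collapse_stats(before_pct: float, after_pct: float) -> tuple:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    drop_pp = before_pct - after_pct
    relative_pct = drop_pp / before_pct * 100
    return drop_pp, relative_pct


# Pilot figures from the list above: 1-round baseline vs 3-round equal.
drop_pp, relative_pct = collapse_stats(88.3, 56.7)
print(f"{drop_pp:.1f} pp absolute, {relative_pct:.1f}% relative")
# 31.6 pp absolute, 35.8% relative
```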

	### Key fix: judge_vote_3round

	v2 returned 0/30 because `extract_position()` checked only first-line prefixes.
	v3 uses `extract_judge_answer()` with the regex `\b(yes|no)\b` plus a last-occurrence tiebreaker.
	The judge prompt now asks "Based on the debate, the correct answer is: "; the judge then generates 32 tokens at temperature 0.1.
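
A minimal sketch of the extraction described above, not the actual v3 code. It assumes that "last-occurrence tiebreaker" means the final standalone yes/no in the completion wins when both appear, and that matching is case-insensitive. The function name carries a `_sketch` suffix to mark it as a stand-in.

```python
import re
from typing import Optional

# Word-boundary pattern from the runbook; IGNORECASE is an assumption.
_ANSWER_RE = re.compile(r"\b(yes|no)\b", re.IGNORECASE)


def extract_judge_answer_sketch(completion: str) -> Optional[str]:
    """Return 'yes' or 'no' from a judge completion, or None if neither appears."""
    matches = _ANSWER_RE.findall(completion)
    if not matches:
        return None
    # Assumed tiebreaker: when both words occur, the last occurrence wins.
    return matches[-1].lower()


print(extract_judge_answer_sketch("Some argued no, but on balance: Yes."))
print(extract_judge_answer_sketch("I know nothing about this."))
```

The word boundaries keep words such as "know" or "nothing" from counting as "no", so the second call finds no answer at all.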