"""One-shot script to pull Run 8 artifacts from HF Hub.

Run 8 root-cause fix:
  BETA_RANK = 0.25 → 0.0 (disabled unlikeliness shaping).

Why: He et al. 2506.02355 designed unlikeliness reward for BINARY-reward
theorem proving. Our classification-style RLVR with partial-credit rewards
(level_accuracy × calibration ∈ [0,1]) INVERTED the gradient — GRPO paid
more for wrong R1 predictions than correct R2 predictions on db_snapshot
(Run 7 training log shows 0.773 vs 0.751).

Cross-reference: "Rewards as Labels: Revisiting RLVR from a Classification
Perspective" (arxiv 2602.05630) confirms GRPO's Gradient Misassignment in
Positives for classification tasks.

Everything else from Run 7 is preserved:
  * β=0.04 KL (stabilized late drift)
  * μ=2 PPO epochs
  * 78 env-verified warmup traces
  * 4 forced variants
  * R-level balance bonus

Theory predictions:
  * Eval accuracy: 46% (Run 7) → 85-92% (high confidence)
  * R5 recall: preserved at ≥95%
  * No R1 over-prediction in confusion matrix
  * task_force_push_release recovered to R2

GUARDRAIL: if eval R5 recall < 95%, revert to Run 6.1 adapter.
"""
from __future__ import annotations

import os
import shutil
import subprocess
from huggingface_hub import snapshot_download


TARGET_DIR = "training_runs/run_8_disable_unlikeliness"


def main() -> None:
    if os.path.exists(TARGET_DIR):
        shutil.rmtree(TARGET_DIR)
    token = subprocess.check_output(["hf", "auth", "token"], text=True).strip()
    path = snapshot_download(
        repo_id="chane335/permanence-artifacts",
        repo_type="dataset",
        local_dir=TARGET_DIR,
        token=token,
    )
    total = 0
    for root, _dirs, files in os.walk(path):
        for f in files:
            rel = os.path.relpath(os.path.join(root, f), path)
            if ".cache" in rel:
                continue
            size = os.path.getsize(os.path.join(root, f))
            total += size
            print(f"  {size:>12,} bytes  {rel}")
    print(f"TOTAL: {total/1e6:.1f} MB")
    print(f"\nCheck eval first: python -c \"import json; "
          f"print(json.load(open('{TARGET_DIR}/eval/results.json')))\"")


if __name__ == "__main__":
    main()