| # Student Simulation v2 |
|
|
| Quick reference. See `runall.sh` for the full pipeline. |
|
|
| ## Key changes (v2 vs v1) |
|
|
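| 1. **Formula semantics**: `x_new = x - (1 - α) · P · h`. α=1 leaves the activation unchanged; α=0 suppresses the projected component completely. |
| 2. **Sweep range**: α ∈ [0, 1] (the v1 range went out of bounds, which produced collapse artifacts) |
| 3. **Direction versions**: only `v1_raw` and the new `v_pca_subspace` (k=3 subspace) are kept |
| 4. **New feature**: `JointResidualSteerer` prevents cross-dimension compensation |
| 5. **New metric**: `count_real_monitoring()` distinguishes genuine reflection from filler words |
| 6. **New metric**: `is_collapsed()` uses 4-gram repetition plus length ratio; more robust than the v1 check |
| 7. **08b**: attention output diagnostic (informational only) |
| 8. **10_infer**: added to runall as a sanity check |
| 9. **Removed**: LLM rater (11_llm_quality_rating.py) |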
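
The steering formula in item 1 can be illustrated with a minimal pure-Python sketch. It assumes `P` is the orthogonal projector onto the span of an orthonormal basis (for `v_pca_subspace`, k=3 basis vectors) and that `h` is the hidden state being projected; the helper names `project` and `steer` are hypothetical and are not taken from the repository.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(p * q for p, q in zip(a, b))


def project(h: Sequence[float], basis: Sequence[Sequence[float]]) -> Vector:
    """Compute P · h, where P = sum_i v_i v_i^T.

    Assumption: the rows of `basis` are orthonormal.
    """
    out = [0.0] * len(h)
    for v in basis:
        coeff = dot(v, h)
        out = [o + coeff * vi for o, vi in zip(out, v)]
    return out


def steer(x: Sequence[float], h: Sequence[float],
          basis: Sequence[Sequence[float]], alpha: float) -> Vector:
    """x_new = x - (1 - alpha) * P · h, with alpha restricted to [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    ph = project(h, basis)
    return [xi - (1.0 - alpha) * pi for xi, pi in zip(x, ph)]


# Toy example: a 2-dimensional subspace of a 4-dimensional state, with h = x.
x = [1.0, 2.0, 3.0, 4.0]
basis = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
unchanged = steer(x, x, basis, alpha=1.0)   # alpha=1: no change
half = steer(x, x, basis, alpha=0.5)        # subspace component halved
suppressed = steer(x, x, basis, alpha=0.0)  # subspace component removed
```

With `h = x`, α=0 zeroes the component inside the subspace and leaves the orthogonal part untouched, which matches the "full suppression" reading of the formula.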
| |
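
As a rough illustration of the collapse check in item 6, the sketch below combines a 4-gram repetition rate with a length ratio against a baseline generation. The function name `is_collapsed_sketch`, its arguments, the whitespace tokenization, and all thresholds are assumptions made for illustration; the real `is_collapsed()` may differ in every one of these.

```python
from typing import List


def four_gram_repetition(tokens: List[str]) -> float:
    """Fraction of 4-grams that duplicate an earlier 4-gram in the same text."""
    grams = [tuple(tokens[i:i + 4]) for i in range(len(tokens) - 3)]
    if not grams:
        return 0.0
    return 1.0 - len(set(grams)) / len(grams)


def is_collapsed_sketch(steered_text: str, baseline_text: str,
                        max_repetition: float = 0.5,     # hypothetical threshold
                        min_length_ratio: float = 0.2,   # hypothetical threshold
                        max_length_ratio: float = 3.0) -> bool:  # hypothetical
    """Flag a generation as collapsed if it loops or its length is far off baseline."""
    steered = steered_text.split()
    baseline = baseline_text.split()
    ratio = len(steered) / max(len(baseline), 1)
    return (
        four_gram_repetition(steered) > max_repetition
        or ratio < min_length_ratio
        or ratio > max_length_ratio
    )


baseline = "First factor the quadratic then check both roots against the original equation"
healthy = "Take square roots of both sides so x is 7 or minus 7"
looping = "wait let me check " * 10

healthy_flag = is_collapsed_sketch(healthy, baseline)
looping_flag = is_collapsed_sketch(looping, baseline)
```

Using two signals guards against both failure shapes: repetition catches loops, while the length ratio catches outputs that are truncated or run away without repeating exactly.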
| ## Usage |
| |
| ```bash |
| # Full run on a single GPU |
| bash runall.sh |
| |
| # Enable anti-leak joint steering |
| JOINT=1 bash runall.sh |
| |
| # Run only selected stages |
| STAGES=8,8b,9,10 bash runall.sh |
| |
| # Run inference on a single problem |
| python scripts/10_infer.py --dim planning --alphas 1.0 0.5 0.0 \ |
| --problem "Find x such that x^2=49" |
| ``` |
| |
| ## Directory layout |
| |
| ``` |
| data/ |
| models/ # Qwen3-30B-A3B-Thinking-2507 |
| cots/ # raw + labeled CoTs |
| routing/ # router top-k dumps |
| activations/ # decision-point residuals |
| checkpoints/ |
| planning_v1_raw.pt |
| planning_v_pca_subspace.pt # new k=3 subspace |
| monitoring_v1_raw.pt |
| monitoring_v_pca_subspace.pt |
| results/ |
| sweep_log.jsonl # includes steered_text |
| final_report.md |
| attention_diagnostic.{json,png} # new |
| infer_sanity_planning.json # new |
| infer_sanity_monitoring.json # new |
| logs/ |
| ``` |
| |
| ## Key config (`configs/model.py`) |
| |
| ```python |
| ALPHA_SWEEP = [0.0, 0.1, 0.2, 0.3, 0.5, 0.75, 1.0] |
| DIRECTION_VERSIONS = ["v1_raw", "v_pca_subspace"] |
| PCA_SUBSPACE_K = 3 |
| ANTI_LEAK_BETA = 0.3 |
| GEN_CONFIG["max_new_tokens"] = 12000 # previously 4096, which was too small |
| GEN_CONFIG_FAST["max_new_tokens"] = 8192 # previously 1024, which was too small |
| ``` |
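
To get a feel for the size of the sweep these settings imply, the sketch below enumerates one run per (dimension, direction version, α) combination. It assumes the pipeline sweeps the full Cartesian product and that the two dimensions are `planning` and `monitoring`, as the checkpoint filenames suggest; `DIMS` and `sweep_grid` are hypothetical names, not part of `configs/model.py`.

```python
from itertools import product

ALPHA_SWEEP = [0.0, 0.1, 0.2, 0.3, 0.5, 0.75, 1.0]
DIRECTION_VERSIONS = ["v1_raw", "v_pca_subspace"]
DIMS = ["planning", "monitoring"]  # assumption: inferred from checkpoint filenames


def sweep_grid():
    """Return one dict per (dim, version, alpha) combination."""
    return [
        {
            "dim": dim,
            "version": version,
            "alpha": alpha,
            # Filename pattern follows the checkpoint listing: <dim>_<version>.pt
            "checkpoint": f"{dim}_{version}.pt",
        }
        for dim, version, alpha in product(DIMS, DIRECTION_VERSIONS, ALPHA_SWEEP)
    ]


grid = sweep_grid()
n_runs = len(grid)  # 2 dims x 2 versions x 7 alphas
```

Under these assumptions the grid has 28 runs per problem, which is worth keeping in mind given the large `max_new_tokens` budgets.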
| |