# scFATE NeurIPS 2026: Reproduce the Paper

This directory ships every checkpoint behind the paper's Table 1, plus a `reproduce.sh`
script for each run. All 31 paper-headline runs are listed below.

## 0. Prerequisites

```bash
git clone https://huggingface.co/Angione-Lab/scFATE
cd scFATE/code  # source code is in scfate-code submodule
uv venv && uv pip install -e .  # or pip install -r requirements.txt
```

Then download datasets:
```bash
huggingface-cli download Angione-Lab/scFATE-datasets --local-dir datasets/scFATE/processed --repo-type dataset
```

## 1. Dependency graph

```
backbone (rotation autoencoder)         - hf-assets/checkpoints/<dataset>/
    └─→ flow head (s1, s2, s3)          - runs/<dataset>_flow_*/flow_best.pt
            └─→ reflow K=2 (K562)       - runs/*_reflow_K2_*_s1/flow_best.pt
            └─→ teachers ×18 (SciPlex3) - runs/*_priorkrr_V2B_s{1..9}, *_priornone_V2B_s{1..9}
                    └─→ student ×7      - runs/*_reflow_ensemble_mixed18_K16_V2B_s{1..7}
```
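
The graph above implies a training order: each stage can start only once its parent has produced a checkpoint. A minimal sketch of deriving that order with the standard-library `graphlib` module follows. The stage labels (`backbone`, `flow_head`, and so on) are illustrative names chosen for this example, not identifiers from the repository.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on, mirroring the diagram.
# The labels are hypothetical names for illustration only.
DEPENDS_ON = {
    "backbone": set(),
    "flow_head": {"backbone"},
    "reflow_K2_k562": {"flow_head"},
    "teachers_sciplex3": {"flow_head"},
    "student_sciplex3": {"teachers_sciplex3"},
}


def training_order(depends_on):
    """Return the stages in an order where every dependency comes first."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for stage in training_order(DEPENDS_ON):
        print(stage)
```

Stages with no dependency between them (here the K562 reflow and the SciPlex3 teachers) may appear in either order.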

## 2. Per-run reproduction

Each `runs/<run_dir>/` contains:
- `flow_best.pt`: checkpoint with embedded hparams (load via `torch.load`, look at top-level keys or `ckpt['hparams']`)
- `config.json`: extracted hparams + result-JSON pointer + dataset path
- `reproduce.sh`: exact training command, ready to run
- `flow_metrics.jsonl`: training trajectory
- `krr_prior.pkl`: KRR-init prior (if `prior=krr`)
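
To compare the hparams embedded in a checkpoint with those in `config.json`, something like the sketch below may help. It assumes the checkpoint unpickles to a plain `dict` and that hparams stored at the top level are scalar values; neither is confirmed here. The helper names `extract_hparams` and `load_run` are hypothetical.

```python
import json
from pathlib import Path

# Types treated as "hparam-like" when hparams sit at the top level (an assumption).
SCALARS = (int, float, str, bool, type(None))


def extract_hparams(ckpt):
    """Prefer ckpt['hparams']; otherwise fall back to scalar top-level entries."""
    hparams = ckpt.get("hparams")
    if isinstance(hparams, dict):
        return dict(hparams)
    return {k: v for k, v in ckpt.items() if isinstance(v, SCALARS)}


def load_run(run_dir):
    """Return (checkpoint hparams, parsed config.json) for one run directory."""
    import torch  # imported lazily so extract_hparams works without PyTorch

    run_dir = Path(run_dir)
    ckpt = torch.load(run_dir / "flow_best.pt", map_location="cpu")
    config = json.loads((run_dir / "config.json").read_text())
    return extract_hparams(ckpt), config
```

Calling `load_run("runs/<run_dir>")` returns both views, so any mismatch between the checkpoint and `config.json` can be spotted before re-training.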

## 3. Paper-headline runs

| Paper row | Dataset | Run dir | Result JSON | Reproduce |
|---|---|---|---|---|
| Norman seed 1 | CRISPRa Norman | `runs/b200_norman_flow_e115_krrinit_s02_mask_30k` | `experiments/results/fair_comparison/norman_rotation_vs_direct__flow__b200_norman_flow_e115_krrinit_s02_mask_30k_K128.json` | `bash runs/b200_norman_flow_e115_krrinit_s02_mask_30k/reproduce.sh` |
| Norman seed 2 | CRISPRa Norman | `runs/b200_norman_flow_e115_krrinit_s02_mask_30k_seed2` | `experiments/results/fair_comparison/norman_rotation_vs_direct__flow__b200_norman_flow_e115_krrinit_s02_mask_30k_seed2.json` | `bash runs/b200_norman_flow_e115_krrinit_s02_mask_30k_seed2/reproduce.sh` |
| Norman seed 3 | CRISPRa Norman | `runs/b200_norman_flow_e115_krrinit_s02_mask_30k_seed3` | `experiments/results/fair_comparison/norman_rotation_vs_direct__flow__b200_norman_flow_e115_krrinit_s02_mask_30k_seed3.json` | `bash runs/b200_norman_flow_e115_krrinit_s02_mask_30k_seed3/reproduce.sh` |
| RPE1 seed 1 | Replogle RPE1 | `runs/b200_rpe1_flow_block_krrinit_mask_30k_s1` | `experiments/results/fair_comparison/rpe1_rotation_vs_direct__flow__b200_rpe1_flow_block_krrinit_mask_30k_s1_rpe1_block_K128.json` | `bash runs/b200_rpe1_flow_block_krrinit_mask_30k_s1/reproduce.sh` |
| K562 base flow (teacher for reflow) | Replogle K562 | `runs/b200_k562_flow_bs2048_krrinit_mask_30k` | `experiments/results/fair_comparison/replogle_rotation_vs_direct__flow__b200_k562_flow_bs2048_krrinit_mask_30k_K128.json` | `bash runs/b200_k562_flow_bs2048_krrinit_mask_30k/reproduce.sh` |
| K562 reflow K=2 (paper headline 81.2) | Replogle K562 | `runs/b200_k562_flow_bs2048_krrinit_mask_30k_reflow_K2_nomask_bracket1ep0_s1` | `experiments/results/fair_comparison/replogle_rotation_vs_direct__flow__b200_k562_flow_bs2048_krrinit_mask_30k_reflow_K2_nomask_bracket1ep0_s1_reflow_K2_bracket1p0_ens5seed_sigmainf0p15_antithetic_Kper128.json` | `bash runs/b200_k562_flow_bs2048_krrinit_mask_30k_reflow_K2_nomask_bracket1ep0_s1/reproduce.sh` |
| SciPlex3 priornone teacher s1 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s1` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s1/reproduce.sh` |
| SciPlex3 priorkrr teacher s1 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s1` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s1/reproduce.sh` |
| SciPlex3 priornone teacher s2 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s2` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s2/reproduce.sh` |
| SciPlex3 priorkrr teacher s2 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s2` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s2/reproduce.sh` |
| SciPlex3 priornone teacher s3 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s3` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s3/reproduce.sh` |
| SciPlex3 priorkrr teacher s3 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s3` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s3/reproduce.sh` |
| SciPlex3 priornone teacher s4 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s4` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s4/reproduce.sh` |
| SciPlex3 priorkrr teacher s4 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s4` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s4/reproduce.sh` |
| SciPlex3 priornone teacher s5 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s5` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s5/reproduce.sh` |
| SciPlex3 priorkrr teacher s5 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s5` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s5/reproduce.sh` |
| SciPlex3 priornone teacher s6 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s6` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s6/reproduce.sh` |
| SciPlex3 priorkrr teacher s6 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s6` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s6/reproduce.sh` |
| SciPlex3 priornone teacher s7 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s7` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s7/reproduce.sh` |
| SciPlex3 priorkrr teacher s7 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s7` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s7/reproduce.sh` |
| SciPlex3 priornone teacher s8 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s8` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s8/reproduce.sh` |
| SciPlex3 priorkrr teacher s8 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s8` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s8/reproduce.sh` |
| SciPlex3 priornone teacher s9 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s9` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s9/reproduce.sh` |
| SciPlex3 priorkrr teacher s9 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s9` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s9/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s1 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s1` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s1/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s2 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s2` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s2/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s3 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s3` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s3/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s4 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s4` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s4/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s5 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s5` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s5/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s6 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s6` | `experiments/results/sciplex3_iter197_reflow_ensemble_mixed18_K16_N7.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s6/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s7 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s7` | `experiments/results/sciplex3_iter197_reflow_ensemble_mixed18_K16_N7.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s7/reproduce.sh` |
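
The SciPlex3 rows follow a regular naming pattern, so the 18 teacher and 7 student run directories can be generated rather than typed. The sketch below reconstructs them from the names in the table; the helper names `sciplex3_run_dirs` and `reproduce_commands` are hypothetical and not part of the repository.

```python
def sciplex3_run_dirs():
    """Return (teachers, students) run directories as listed in the table."""
    teachers = [
        f"runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_prior{prior}_V2B_s{seed}"
        for seed in range(1, 10)          # teacher seeds s1..s9
        for prior in ("none", "krr")      # two priors per seed, 18 runs total
    ]
    students = [
        f"runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s{seed}"
        for seed in range(1, 8)           # student seeds s1..s7
    ]
    return teachers, students


def reproduce_commands(run_dirs):
    """Build the `bash <run_dir>/reproduce.sh` command for each run directory."""
    return [f"bash {run_dir}/reproduce.sh" for run_dir in run_dirs]


if __name__ == "__main__":
    teachers, students = sciplex3_run_dirs()
    # Teachers come first because the students depend on them.
    for command in reproduce_commands(teachers + students):
        print(command)
```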

## 4. Eval pipeline

```bash
.venv/bin/python scripts/eval_fair_comparison.py \
    --dataset <dataset_name> --flow_ckpt <run_dir>/flow_best.pt \
    --multiview data/gene_embeddings/<dataset>_multiview.pt --K_eval 128
```
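
When evaluating several runs, the command above can be assembled programmatically. The following is a sketch only: it uses exactly the flags shown above, and it assumes by default that the same string fills both `<dataset_name>` and `<dataset>`, which may not hold for every dataset. The helper names `eval_command` and `run_eval` are hypothetical.

```python
import subprocess


def eval_command(dataset_name, run_dir, multiview_dataset=None, k_eval=128,
                 python=".venv/bin/python"):
    """Build the argv list for scripts/eval_fair_comparison.py."""
    # Assumption: the multiview file uses the same dataset string unless overridden.
    multiview_dataset = multiview_dataset or dataset_name
    return [
        python, "scripts/eval_fair_comparison.py",
        "--dataset", dataset_name,
        "--flow_ckpt", f"{run_dir}/flow_best.pt",
        "--multiview", f"data/gene_embeddings/{multiview_dataset}_multiview.pt",
        "--K_eval", str(k_eval),
    ]


def run_eval(dataset_name, run_dir, **kwargs):
    """Run the evaluation and raise CalledProcessError on a non-zero exit."""
    subprocess.run(eval_command(dataset_name, run_dir, **kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems with long run-directory names.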

## 5. Known caveats

- The SciPlex3 cosine (0.491) and PDE (0.483) values in Table 1 do not appear in any saved
  results JSON; only the mean DA of 0.700 reproduces, from `iter214_multi_metric_router_ensemble.json`
  (`metric_da.tanimoto_morgan2048`). A re-evaluation is needed before final submission.
- Norman cosine/PDE values in Table 1 differ by ~0.002 from `*_K128.json` (rounded).
- K562 reflow cosine/PDE values in Table 1 differ by ~0.007 from the saved ensemble JSON.
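
Checks like these can be scripted by reading a metric from a results JSON and comparing it with the Table 1 value under a stated tolerance. The sketch below assumes the results file parses to nested dictionaries and that a key such as `metric_da.tanimoto_morgan2048` is either a literal key or a dotted path; the actual JSON layout is not documented here. The helper names `lookup` and `matches_table` are hypothetical.

```python
import json
from pathlib import Path


def lookup(results, dotted_key):
    """Fetch a value by literal key if present, else by walking a dotted path."""
    if dotted_key in results:
        return results[dotted_key]
    node = results
    for part in dotted_key.split("."):
        node = node[part]  # raises KeyError if the path does not exist
    return node


def matches_table(table_value, json_value, tol):
    """True when the saved value agrees with the table within `tol`."""
    return abs(table_value - json_value) <= tol


def check_file(path, dotted_key, table_value, tol):
    """Load a results JSON and compare one metric against the table."""
    results = json.loads(Path(path).read_text())
    return matches_table(table_value, lookup(results, dotted_key), tol)
```

A tolerance of about half the last reported digit would flag the ~0.007 K562 discrepancy while accepting differences that come from rounding alone.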