# scFATE NeurIPS 2026: Reproduce the Paper

This directory ships every checkpoint behind the paper's Table 1, plus a `reproduce.sh` for each run. All 31 paper-headline runs are listed below.

## 0. Prerequisites

```bash
git clone https://huggingface.co/Angione-Lab/scFATE
cd scFATE/code  # source code is in scfate-code submodule
uv venv && uv pip install -e .  # or pip install -r requirements.txt
```

Then download datasets:

```bash
huggingface-cli download Angione-Lab/scFATE-datasets --local-dir datasets/scFATE/processed --repo-type dataset
```

## 1. Dependency graph

```
backbone (rotation autoencoder) — hf-assets/checkpoints//
└─→ flow head (s1, s2, s3) — runs/_flow_*/flow_best.pt
    └─→ reflow K=2 (K562) — runs/*_reflow_K2_*_s1/flow_best.pt
    └─→ teachers ×18 (SciPlex3) — runs/*_priorkrr_V2B_s{1..9}, *_priornone_V2B_s{1..9}
        └─→ student ×7 — runs/*_reflow_ensemble_mixed18_K16_V2B_s{1..7}
```

## 2. Per-run reproduction

Each `runs//` contains:

- `flow_best.pt`: checkpoint with embedded hparams (load via `torch.load`, look at top-level keys or `ckpt['hparams']`)
- `config.json`: extracted hparams + result-JSON pointer + dataset path
- `reproduce.sh`: exact training command, ready to run
- `flow_metrics.jsonl`: training trajectory
- `krr_prior.pkl`: KRR-init prior (if `prior=krr`)

## 3. Paper-headline runs

| Paper row | Dataset | Run dir | Result JSON | Reproduce |
|---|---|---|---|---|
| Norman seed 1 | CRISPRa Norman | `runs/b200_norman_flow_e115_krrinit_s02_mask_30k` | `experiments/results/fair_comparison/norman_rotation_vs_direct__flow__b200_norman_flow_e115_krrinit_s02_mask_30k_K128.json` | `bash runs/b200_norman_flow_e115_krrinit_s02_mask_30k/reproduce.sh` |
| Norman seed 2 | CRISPRa Norman | `runs/b200_norman_flow_e115_krrinit_s02_mask_30k_seed2` | `experiments/results/fair_comparison/norman_rotation_vs_direct__flow__b200_norman_flow_e115_krrinit_s02_mask_30k_seed2.json` | `bash runs/b200_norman_flow_e115_krrinit_s02_mask_30k_seed2/reproduce.sh` |
| Norman seed 3 | CRISPRa Norman | `runs/b200_norman_flow_e115_krrinit_s02_mask_30k_seed3` | `experiments/results/fair_comparison/norman_rotation_vs_direct__flow__b200_norman_flow_e115_krrinit_s02_mask_30k_seed3.json` | `bash runs/b200_norman_flow_e115_krrinit_s02_mask_30k_seed3/reproduce.sh` |
| RPE1 seed 1 | Replogle RPE1 | `runs/b200_rpe1_flow_block_krrinit_mask_30k_s1` | `experiments/results/fair_comparison/rpe1_rotation_vs_direct__flow__b200_rpe1_flow_block_krrinit_mask_30k_s1_rpe1_block_K128.json` | `bash runs/b200_rpe1_flow_block_krrinit_mask_30k_s1/reproduce.sh` |
| K562 base flow (teacher for reflow) | Replogle K562 | `runs/b200_k562_flow_bs2048_krrinit_mask_30k` | `experiments/results/fair_comparison/replogle_rotation_vs_direct__flow__b200_k562_flow_bs2048_krrinit_mask_30k_K128.json` | `bash runs/b200_k562_flow_bs2048_krrinit_mask_30k/reproduce.sh` |
| K562 reflow K=2 (paper headline 81.2) | Replogle K562 | `runs/b200_k562_flow_bs2048_krrinit_mask_30k_reflow_K2_nomask_bracket1ep0_s1` | `experiments/results/fair_comparison/replogle_rotation_vs_direct__flow__b200_k562_flow_bs2048_krrinit_mask_30k_reflow_K2_nomask_bracket1ep0_s1_reflow_K2_bracket1p0_ens5seed_sigmainf0p15_antithetic_Kper128.json` | `bash runs/b200_k562_flow_bs2048_krrinit_mask_30k_reflow_K2_nomask_bracket1ep0_s1/reproduce.sh` |
| SciPlex3 priornone teacher s1 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s1` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s1/reproduce.sh` |
| SciPlex3 priorkrr teacher s1 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s1` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s1/reproduce.sh` |
| SciPlex3 priornone teacher s2 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s2` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s2/reproduce.sh` |
| SciPlex3 priorkrr teacher s2 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s2` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s2/reproduce.sh` |
| SciPlex3 priornone teacher s3 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s3` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s3/reproduce.sh` |
| SciPlex3 priorkrr teacher s3 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s3` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s3/reproduce.sh` |
| SciPlex3 priornone teacher s4 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s4` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s4/reproduce.sh` |
| SciPlex3 priorkrr teacher s4 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s4` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s4/reproduce.sh` |
| SciPlex3 priornone teacher s5 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s5` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s5/reproduce.sh` |
| SciPlex3 priorkrr teacher s5 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s5` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s5/reproduce.sh` |
| SciPlex3 priornone teacher s6 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s6` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s6/reproduce.sh` |
| SciPlex3 priorkrr teacher s6 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s6` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s6/reproduce.sh` |
| SciPlex3 priornone teacher s7 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s7` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s7/reproduce.sh` |
| SciPlex3 priorkrr teacher s7 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s7` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s7/reproduce.sh` |
| SciPlex3 priornone teacher s8 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s8` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s8/reproduce.sh` |
| SciPlex3 priorkrr teacher s8 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s8` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s8/reproduce.sh` |
| SciPlex3 priornone teacher s9 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s9` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s9/reproduce.sh` |
| SciPlex3 priorkrr teacher s9 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s9` | — | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s9/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s1 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s1` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s1/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s2 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s2` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s2/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s3 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s3` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s3/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s4 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s4` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s4/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s5 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s5` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s5/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s6 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s6` | `experiments/results/sciplex3_iter197_reflow_ensemble_mixed18_K16_N7.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s6/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s7 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s7` | `experiments/results/sciplex3_iter197_reflow_ensemble_mixed18_K16_N7.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s7/reproduce.sh` |

## 4. Eval pipeline

```bash
.venv/bin/python scripts/eval_fair_comparison.py \
  --dataset --flow_ckpt /flow_best.pt \
  --multiview data/gene_embeddings/_multiview.pt --K_eval 128
```

## 5. Known caveats

- SciPlex3 paper-row cosine (0.491) and PDE (0.483) numbers in Table 1 are not in any saved results JSON; only mean DA = 0.700 reproduces from `iter214_multi_metric_router_ensemble.json` (`metric_da.tanimoto_morgan2048`). Re-eval needed before final submission.
- Norman cos/PDE in Table 1 differ by ~0.002 from `*_K128.json` (rounded).
- K562 reflow cos/PDE in Table 1 differ by ~0.007 from the saved ensemble JSON.
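
Before launching a `reproduce.sh`, it can help to confirm that the run directory actually holds the per-run files listed in section 2. The following is a minimal sketch, not part of the repository: `check_run_dir` is a hypothetical helper name, and it only checks the four files that every run is described as containing (`krr_prior.pkl` is skipped because it is only present for `prior=krr` runs).

```bash
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not shipped with scFATE): report which of
# the per-run files from section 2 are missing in a given run directory.
check_run_dir() {
    local run_dir="$1"
    local missing=0
    local f
    for f in flow_best.pt config.json reproduce.sh flow_metrics.jsonl; do
        if [ ! -f "$run_dir/$f" ]; then
            echo "missing: $run_dir/$f" >&2
            missing=1
        fi
    done
    # krr_prior.pkl is optional (only for prior=krr runs), so it is not checked.
    return "$missing"
}

# Example usage, assuming the current directory contains runs/:
#   for d in runs/*/; do check_run_dir "$d" || echo "incomplete: $d"; done
```

The function returns a non-zero status when any file is absent, so it can gate a reproduction loop without parsing its output.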