
scFATE NeurIPS 2026: Reproduce the Paper

This directory ships every checkpoint behind the paper's Table 1, plus a reproduce.sh for each run. All 31 paper-headline runs are listed below.

0. Prerequisites

git clone https://huggingface.co/Angione-Lab/scFATE
cd scFATE/code  # source code is in scfate-code submodule
uv venv && uv pip install -e .  # or pip install -r requirements.txt

Then download datasets:

huggingface-cli download Angione-Lab/scFATE-datasets --local-dir datasets/scFATE/processed --repo-type dataset

1. Dependency graph

backbone (rotation autoencoder)         - hf-assets/checkpoints/<dataset>/
    └─→ flow head (s1, s2, s3)          - runs/<dataset>_flow_*/flow_best.pt
            └─→ reflow K=2 (K562)       - runs/*_reflow_K2_*_s1/flow_best.pt
            └─→ teachers ×18 (SciPlex3) - runs/*_priorkrr_V2B_s{1..9}, *_priornone_V2B_s{1..9}
                    └─→ student ×7      - runs/*_reflow_ensemble_mixed18_K16_V2B_s{1..7}
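The graph above fixes the order in which stages must be trained. As a minimal sketch, it can be encoded as a mapping and sorted with the standard library; the stage names below are labels chosen for this illustration, not identifiers from the repository.

```python
from graphlib import TopologicalSorter

# Stage -> set of stages it depends on, transcribed from the graph above.
# The stage names are illustrative labels, not names used by the scFATE code.
DEPENDS_ON = {
    "backbone": set(),
    "flow_head": {"backbone"},
    "reflow_K2_K562": {"flow_head"},
    "teachers_SciPlex3": {"flow_head"},
    "student_SciPlex3": {"teachers_SciPlex3"},
}


def training_order(graph=DEPENDS_ON):
    """Return the stages in an order where every prerequisite comes first."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for stage in training_order():
        print(stage)
```

Any order this returns starts with the backbone and places the SciPlex3 student after its teachers; the relative order of the two independent branches (K562 reflow and SciPlex3 teachers) is not constrained.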

2. Per-run reproduction

Each runs/<run_dir>/ contains:

  • flow_best.pt: checkpoint with embedded hparams (load via torch.load, then look at the top-level keys or ckpt['hparams'])
  • config.json: extracted hparams, result-JSON pointer, and dataset path
  • reproduce.sh: exact training command, ready to run
  • flow_metrics.jsonl: training trajectory
  • krr_prior.pkl: KRR-init prior (only if prior=krr)
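To check what a run directory contains before launching anything, a small helper along these lines may be useful. It is a sketch under assumptions: the function names are hypothetical, config.json is assumed to hold a single JSON document, and flow_metrics.jsonl is assumed to hold one JSON object per line. The checkpoint loader follows the hint above about top-level keys and ckpt['hparams'], and needs PyTorch installed.

```python
import json
from pathlib import Path


def summarize_run(run_dir):
    """Collect what can be read from a run directory without importing torch."""
    run_dir = Path(run_dir)
    summary = {"run_dir": str(run_dir)}

    config_path = run_dir / "config.json"
    if config_path.exists():
        summary["config"] = json.loads(config_path.read_text())

    metrics_path = run_dir / "flow_metrics.jsonl"
    if metrics_path.exists():
        # Assumed layout: one JSON object per non-empty line.
        with metrics_path.open() as fh:
            records = [json.loads(line) for line in fh if line.strip()]
        summary["n_metric_records"] = len(records)
        summary["last_metric_record"] = records[-1] if records else None

    summary["has_checkpoint"] = (run_dir / "flow_best.pt").exists()
    summary["has_krr_prior"] = (run_dir / "krr_prior.pkl").exists()
    return summary


def load_hparams(ckpt_path):
    """Return (top_level_keys, hparams_or_None) for a checkpoint file."""
    import torch  # lazy import, so summarize_run works without PyTorch

    # weights_only=False unpickles arbitrary Python objects; use it only on
    # checkpoints you trust.
    ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
    if not isinstance(ckpt, dict):
        return [], None
    return list(ckpt.keys()), ckpt.get("hparams")
```

Calling `summarize_run("runs/<run_dir>")` returns a plain dict, so it can be dumped with `json.dumps(..., indent=2)` for a quick look.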

3. Paper-headline runs

| Paper row | Dataset | Run dir | Result JSON | Reproduce |
|---|---|---|---|---|
| Norman seed 1 | CRISPRa Norman | `runs/b200_norman_flow_e115_krrinit_s02_mask_30k` | `experiments/results/fair_comparison/norman_rotation_vs_direct__flow__b200_norman_flow_e115_krrinit_s02_mask_30k_K128.json` | `bash runs/b200_norman_flow_e115_krrinit_s02_mask_30k/reproduce.sh` |
| Norman seed 2 | CRISPRa Norman | `runs/b200_norman_flow_e115_krrinit_s02_mask_30k_seed2` | `experiments/results/fair_comparison/norman_rotation_vs_direct__flow__b200_norman_flow_e115_krrinit_s02_mask_30k_seed2.json` | `bash runs/b200_norman_flow_e115_krrinit_s02_mask_30k_seed2/reproduce.sh` |
| Norman seed 3 | CRISPRa Norman | `runs/b200_norman_flow_e115_krrinit_s02_mask_30k_seed3` | `experiments/results/fair_comparison/norman_rotation_vs_direct__flow__b200_norman_flow_e115_krrinit_s02_mask_30k_seed3.json` | `bash runs/b200_norman_flow_e115_krrinit_s02_mask_30k_seed3/reproduce.sh` |
| RPE1 seed 1 | Replogle RPE1 | `runs/b200_rpe1_flow_block_krrinit_mask_30k_s1` | `experiments/results/fair_comparison/rpe1_rotation_vs_direct__flow__b200_rpe1_flow_block_krrinit_mask_30k_s1_rpe1_block_K128.json` | `bash runs/b200_rpe1_flow_block_krrinit_mask_30k_s1/reproduce.sh` |
| K562 base flow (teacher for reflow) | Replogle K562 | `runs/b200_k562_flow_bs2048_krrinit_mask_30k` | `experiments/results/fair_comparison/replogle_rotation_vs_direct__flow__b200_k562_flow_bs2048_krrinit_mask_30k_K128.json` | `bash runs/b200_k562_flow_bs2048_krrinit_mask_30k/reproduce.sh` |
| K562 reflow K=2 (paper headline 81.2) | Replogle K562 | `runs/b200_k562_flow_bs2048_krrinit_mask_30k_reflow_K2_nomask_bracket1ep0_s1` | `experiments/results/fair_comparison/replogle_rotation_vs_direct__flow__b200_k562_flow_bs2048_krrinit_mask_30k_reflow_K2_nomask_bracket1ep0_s1_reflow_K2_bracket1p0_ens5seed_sigmainf0p15_antithetic_Kper128.json` | `bash runs/b200_k562_flow_bs2048_krrinit_mask_30k_reflow_K2_nomask_bracket1ep0_s1/reproduce.sh` |
| SciPlex3 priornone teacher s1 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s1` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s1/reproduce.sh` |
| SciPlex3 priorkrr teacher s1 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s1` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s1/reproduce.sh` |
| SciPlex3 priornone teacher s2 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s2` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s2/reproduce.sh` |
| SciPlex3 priorkrr teacher s2 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s2` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s2/reproduce.sh` |
| SciPlex3 priornone teacher s3 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s3` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s3/reproduce.sh` |
| SciPlex3 priorkrr teacher s3 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s3` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s3/reproduce.sh` |
| SciPlex3 priornone teacher s4 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s4` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s4/reproduce.sh` |
| SciPlex3 priorkrr teacher s4 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s4` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s4/reproduce.sh` |
| SciPlex3 priornone teacher s5 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s5` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s5/reproduce.sh` |
| SciPlex3 priorkrr teacher s5 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s5` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s5/reproduce.sh` |
| SciPlex3 priornone teacher s6 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s6` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s6/reproduce.sh` |
| SciPlex3 priorkrr teacher s6 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s6` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s6/reproduce.sh` |
| SciPlex3 priornone teacher s7 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s7` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s7/reproduce.sh` |
| SciPlex3 priorkrr teacher s7 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s7` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s7/reproduce.sh` |
| SciPlex3 priornone teacher s8 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s8` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s8/reproduce.sh` |
| SciPlex3 priorkrr teacher s8 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s8` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s8/reproduce.sh` |
| SciPlex3 priornone teacher s9 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s9` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priornone_V2B_s9/reproduce.sh` |
| SciPlex3 priorkrr teacher s9 | SciPlex3 | `runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s9` | n/a | `bash runs/b200_sciplex3_delta_flow_mv_v2_sig0p3_priorkrr_V2B_s9/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s1 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s1` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s1/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s2 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s2` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s2/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s3 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s3` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s3/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s4 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s4` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s4/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s5 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s5` | `experiments/results/sciplex3_iter196_reflow_ensemble_mixed18_K16_N5.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s5/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s6 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s6` | `experiments/results/sciplex3_iter197_reflow_ensemble_mixed18_K16_N7.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s6/reproduce.sh` |
| SciPlex3 mixed-18 K=16 student s7 (paper headline 70.0) | SciPlex3 | `runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s7` | `experiments/results/sciplex3_iter197_reflow_ensemble_mixed18_K16_N7.json` | `bash runs/b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s7/reproduce.sh` |
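The 18 SciPlex3 teacher rows and 7 student rows follow a regular naming pattern, so they can be generated rather than typed. The sketch below rebuilds those directory names from the table and reports which ones lack a reproduce.sh under a given root; the helper names are hypothetical, and the default root `runs` assumes the command is run from the repository checkout.

```python
from pathlib import Path

# Name templates copied from the SciPlex3 rows of the table.
TEACHER_TEMPLATE = "b200_sciplex3_delta_flow_mv_v2_sig0p3_prior{prior}_V2B_s{seed}"
STUDENT_TEMPLATE = "b200_sciplex3_delta_flow_reflow_ensemble_mixed18_K16_V2B_s{seed}"


def sciplex3_teacher_runs():
    """18 teacher run names: seeds 1..9 for each of the two priors."""
    return [
        TEACHER_TEMPLATE.format(prior=prior, seed=seed)
        for seed in range(1, 10)
        for prior in ("none", "krr")
    ]


def sciplex3_student_runs():
    """7 student run names: seeds 1..7."""
    return [STUDENT_TEMPLATE.format(seed=seed) for seed in range(1, 8)]


def missing_scripts(runs_root="runs"):
    """Return SciPlex3 run names whose reproduce.sh is absent under runs_root."""
    root = Path(runs_root)
    names = sciplex3_teacher_runs() + sciplex3_student_runs()
    return [n for n in names if not (root / n / "reproduce.sh").is_file()]


if __name__ == "__main__":
    for name in missing_scripts():
        print("missing reproduce.sh:", name)
```

Teachers are listed before students, matching the dependency order, so the same lists can drive a loop that runs each `reproduce.sh` in turn.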

4. Eval pipeline

.venv/bin/python scripts/eval_fair_comparison.py \
    --dataset <dataset_name> --flow_ckpt <run_dir>/flow_best.pt \
    --multiview data/gene_embeddings/<dataset>_multiview.pt --K_eval 128
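When evaluating many runs, it can help to assemble this command programmatically. The following is a sketch, not part of the repository: `build_eval_command` and `run_eval` are hypothetical helpers, and the only flags used are the four shown above. The dataset name and multiview path are passed separately because the command uses two different placeholders for them.

```python
import shlex
import subprocess
from pathlib import Path


def build_eval_command(dataset_name, run_dir, multiview_path,
                       k_eval=128, python=".venv/bin/python"):
    """Assemble the argv list for scripts/eval_fair_comparison.py."""
    return [
        python,
        "scripts/eval_fair_comparison.py",
        "--dataset", dataset_name,
        "--flow_ckpt", (Path(run_dir) / "flow_best.pt").as_posix(),
        "--multiview", str(multiview_path),
        "--K_eval", str(k_eval),
    ]


def run_eval(dataset_name, run_dir, multiview_path, **kwargs):
    """Print the command, then run it; raises CalledProcessError on failure."""
    cmd = build_eval_command(dataset_name, run_dir, multiview_path, **kwargs)
    print("running:", shlex.join(cmd))
    return subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with the long run-directory names, and `shlex.join` gives a copy-pasteable form of the command for logs.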

5. Known caveats

  • The SciPlex3 paper-row cosine (0.491) and PDE (0.483) numbers in Table 1 are not in any saved results JSON; only mean DA = 0.700 reproduces, from iter214_multi_metric_router_ensemble.json (metric_da.tanimoto_morgan2048). A re-evaluation is needed before final submission.
  • Norman cos/PDE in Table 1 differ by ~0.002 from *_K128.json (rounded).
  • K562 reflow cos/PDE in Table 1 differ by ~0.007 from the saved ensemble JSON.
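Discrepancies like these can be checked mechanically by comparing a value from a results JSON against the number printed in Table 1. The sketch below assumes that a metric is reachable through a dotted key path such as `metric_da.tanimoto_morgan2048` and that it is stored as a single number; both are assumptions about the JSON layout, and the helper names are hypothetical. The default tolerance of 5e-4 corresponds to rounding to three decimals.

```python
import json
from functools import reduce


def get_by_path(record, dotted_path):
    """Follow a dotted key path, e.g. 'a.b' -> record['a']['b']."""
    return reduce(lambda node, key: node[key], dotted_path.split("."), record)


def matches_paper(record, dotted_path, paper_value, tol=5e-4):
    """Return (within_tolerance, saved_value) for one metric."""
    saved = float(get_by_path(record, dotted_path))
    return abs(saved - paper_value) <= tol, saved


def check_file(json_path, dotted_path, paper_value, tol=5e-4):
    """Load a results JSON from disk and compare one metric to the paper."""
    with open(json_path) as fh:
        record = json.load(fh)
    return matches_paper(record, dotted_path, paper_value, tol)
```

A difference of about 0.002 or 0.007, as noted for Norman and K562 reflow, would be flagged at the default tolerance; pass a larger `tol` to accept it explicitly.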