sophtang committed
Commit 5b4b155 · verified · 1 Parent(s): e3236e5
Files changed (1)
  1. scripts/README.md +6 -6
scripts/README.md CHANGED
@@ -1,6 +1,6 @@
- # Running Experiments with BranchSBM 🌳🧬
+ # Running Experiments with BranchSBM 🌳🧫
 
- This directory contains training scripts for all experiments with BranchSBM, including LiDAR navigation 🗻, simulating cell differentiation 🧫, and cell state perturbation modelling 🧬. This codebase contains code from the [Metric Flow Matching repo](https://github.com/kkapusniak/metric-flow-matching) ([Kapusniak et al. 2024](https://arxiv.org/abs/2405.14780)).
+ This directory contains training scripts for all experiments with BranchSBM, including LiDAR navigation 🗻, simulating cell differentiation 🧫, and cell state perturbation modelling 🧫. This codebase contains code from the [Metric Flow Matching repo](https://github.com/kkapusniak/metric-flow-matching) ([Kapusniak et al. 2024](https://arxiv.org/abs/2405.14780)).
 
  ## Environment Installation
  ```
@@ -14,7 +14,7 @@ LiDAR data is taken from the [Generalized Schrödinger Bridge Matching repo](htt
 
  We use perturbation data from the [Tahoe-100M dataset](https://huggingface.co/datasets/tahoebio/Tahoe-100M) containing control DMSO-treated cell data and perturbed cell data.
 
- The raw data contains a total of 60K genes. We select the top 2000 highly variable genes (HVGs) and perform principal component analysis (PCA), to maximally capture the variance in the data via the top principal components (38% in the top-50 PCs). **Our goal is to learn the dynamic trajectories that map control cell clusters to the perturbd cell clusters.**
+ The raw data contains a total of 60K genes. We select the top 2000 highly variable genes (HVGs) and perform principal component analysis (PCA), to maximally capture the variance in the data via the top principal components (38% in the top-50 PCs). **Our goal is to learn the dynamic trajectories that map control cell clusters to the perturbed cell clusters.**
 
  **Specifically, we model the following perturbations**:
 
@@ -73,11 +73,11 @@ nohup ./lidar.sh > lidar.log 2>&1 &
 
  Evaluation will run automatically after the specified number of rollouts `--num_rollouts` is finished. To see metrics, go to `results/<experiment>/metrics/` or the end of `logs/<experiment>.log`.
 
- For Clonidine, `x1_1` indicates the cell cluster that is sampled from for training and `x1_2` is the held-out cell cluster. For Trametinib `x1_1` indicates the cell cluster that is sampled from for training and `x1_2` and `x1_3` are the held-out cell clusters.
+ For Clonidine, `x1_1` indicates the cell cluster that is sampled from for training and `x1_2` is the held-out cell cluster. For Trametinib, `x1_1` indicates the cell cluster that is sampled from for training, and `x1_2` and `x1_3` are the held-out cell clusters.
 
  We report the following metrics for each of the clusters in our paper:
- 1. Maximum Mean Discrepancy (RBF-MMD) of simualted cell cluster with target cell cluster (same cell count).
- 2. 1-Wasserstein and 2-Wasserstein distances against full cell population in the cluster.
+ 1. Maximum Mean Discrepancy (RBF-MMD) of simulated cell cluster with target cell cluster (same cell count).
+ 2. 1-Wasserstein and 2-Wasserstein distances against the full cell population in the cluster.
 
  ## Overview of Outputs
 
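
The README text in this diff lists RBF-MMD between a simulated and a target cell cluster as one of the reported metrics. As a rough illustration of what that metric measures, here is a minimal NumPy sketch of the biased squared-MMD estimator with a single RBF kernel. It is not taken from the BranchSBM code: the function name `rbf_mmd2`, the fixed bandwidth `sigma`, and the synthetic Gaussian "clusters" are all assumptions made for the example, and the repository's evaluation may use different bandwidths or estimators.

```python
import numpy as np


def rbf_mmd2(x, y, sigma=1.0):
    """Biased estimate of squared MMD between two samples with an RBF kernel.

    x, y: arrays of shape (n_cells, n_features), e.g. cells in PC space.
    sigma: kernel bandwidth (hypothetical default, not from the repo).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def kernel(a, b):
        # Pairwise squared Euclidean distances, shape (len(a), len(b)).
        sq = (
            np.sum(a**2, axis=1)[:, None]
            + np.sum(b**2, axis=1)[None, :]
            - 2.0 * a @ b.T
        )
        # Clip tiny negative values caused by floating-point round-off.
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

    # Mean over all pairs, including the diagonal (V-statistic).
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()


# Synthetic stand-ins for cell clusters with the same cell count.
rng = np.random.default_rng(0)
simulated = rng.normal(0.0, 1.0, size=(200, 5))
same_dist = rng.normal(0.0, 1.0, size=(200, 5))
shifted = rng.normal(1.0, 1.0, size=(200, 5))

mmd_same = rbf_mmd2(simulated, same_dist, sigma=2.0)
mmd_shifted = rbf_mmd2(simulated, shifted, sigma=2.0)
print(f"same distribution:    {mmd_same:.4f}")
print(f"shifted distribution: {mmd_shifted:.4f}")
```

A lower value indicates that the simulated cluster is closer to the target cluster under the chosen kernel. The result depends on `sigma`, so values are only comparable when the bandwidth is held fixed.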