Updated description of the project
README.md
CHANGED
|
@@ -6,30 +6,89 @@ tags:
|
|
| 6 |
- tahoe-deepdive
|
| 7 |
---
|
| 8 |
|
| 9 |
-
|
| 10 |
|
| 11 |
-
Team Name
|
| 12 |
-
Frameshift
|
| 13 |
|
| 14 |
-
Members
|
| 15 |
-
Jesus Gonzalez Ferrer, UCSC
|
| 16 |
-
Carlota Pereda, UCSF
|
| 17 |
-
Laura Almonte, UCSF
|
| 18 |
-
Aidan Winters, Arc Institute/UCSF
|
| 19 |
-
Michael Kosicki, LBL
|
| 20 |
|
| 21 |
-
|
| 22 |
|
| 23 |
-
|
| 24 |
|
| 25 |
-
|
| 26 |
-
Personalized (ie. context-specific) treatments lead to better cancer outcomes. We want to develop a framework that is able to measure how
|
| 27 |
-
drugs affect cells differently based on their genetic context and that is also to explain the genetic programs that cells use to respond.
|
| 28 |
|
| 29 |
-
|
| 30 |
|
| 31 |
-
Methods
|
| 32 |
-
CellCap, Augur, MSE, E-distance.
|
| 33 |
-
Results
|
| 34 |
|
| 35 |
-
Discussion
|
| 6 |
- tahoe-deepdive
|
| 7 |
---
|
| 8 |
|
| 9 |
+
# Frameshift Team Submission – Tahoe-DeepDive Hackathon 2025
|
| 10 |
|
| 11 |
+
## Team Name
|
| 12 |
+
**Frameshift**
|
| 13 |
|
| 14 |
+
## Members
|
| 15 |
+
- Jesus Gonzalez Ferrer, UCSC — [@JesusGF1](https://github.com/JesusGF1)
|
| 16 |
+
- Carlota Pereda, UCSF — [@carlotapereda](https://github.com/carlotapereda)
|
| 17 |
+
- Laura Almonte, UCSF — [@almonteloya](https://github.com/almonteloya)
|
| 18 |
+
- Aidan Winters, Arc Institute/UCSF — [@aidanwinters](https://github.com/aidanwinters)
|
| 19 |
+
- Michael Kosicki, LBL — [@lotard](https://github.com/lotard)
|
| 20 |
|
| 21 |
+
---
|
| 22 |
+
|
| 23 |
+
## Project
|
| 24 |
+
|
| 25 |
+
### Title
|
| 26 |
+
**Defining context-specific responses to drug perturbations in the Tahoe-100M dataset**
|
| 27 |
+
|
| 28 |
+
### Overview
|
| 29 |
+
Personalized (i.e., context-specific) treatments lead to better cancer outcomes.
|
| 30 |
+
We want to develop a framework that measures how drugs affect cells differently based on their genetic context, and explains the genetic programs that cells use to respond.
|
| 31 |
+
We define context-specificity as genotype-, cell line-, tissue-of-origin-, and patient-specific effects on gene expression.
|
| 32 |
+
|
| 33 |
+
### Motivation
|
| 34 |
+
Drugs don't work the same way for everyone. Oncotherapies sometimes lack efficacy and tend to be indiscriminate and toxic.
|
| 35 |
+
Broad-acting chemotherapies can be effective, but their use is limited by side effects in patients.
|
| 36 |
+
We need better ways of stratifying patients, selecting adequate treatments, and simulating adverse effects before they happen.
|
| 37 |
+
|
| 38 |
+
---
|
| 39 |
+
|
| 40 |
+
## Methods
|
| 41 |
+
|
| 42 |
+
### Data Selection
|
| 43 |
+
We applied four methods (E-distance, MSE, Augur, and CellCap) to a subset of the Tahoe-100M dataset.
|
| 44 |
+
We focused on cell lines with **KRAS gain-of-function mutations**, especially **G12C**.
|
| 45 |
+
Selected drugs included known KRAS inhibitors, positive controls, and negative controls.
|
| 46 |
+
|
| 47 |
+
### E-distance
|
| 48 |
+
- Used precomputed `scVI` embeddings from Tahoe-100M.
|
| 49 |
+
- Calculated the E-distance between each drug-treated population and its plate-paired `DMSO_TF` control, for each drug and cell line.
|
| 50 |
+
- Visualized results.
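
As a rough illustration of the E-distance step above, the sketch below computes an energy distance between two small sets of embedding vectors using only the Python standard library. It assumes plain Euclidean distance and includes self-pairs in the within-group terms; some implementations use squared Euclidean distance or exclude self-pairs instead, so treat this as one possible formulation rather than the exact computation used here. The function names and the toy coordinates are hypothetical.

```python
import math
from itertools import product


def mean_pairwise_distance(xs, ys):
    """Mean Euclidean distance over all pairs (x, y) with x in xs and y in ys."""
    total = sum(math.dist(x, y) for x, y in product(xs, ys))
    return total / (len(xs) * len(ys))


def e_distance(treated, control):
    """Energy distance: 2 * between-group mean minus both within-group means."""
    between = mean_pairwise_distance(treated, control)
    within_treated = mean_pairwise_distance(treated, treated)
    within_control = mean_pairwise_distance(control, control)
    return 2 * between - within_treated - within_control


# Toy 2-D "embeddings" (hypothetical values, not Tahoe-100M data).
control = [(0.0, 0.0), (0.0, 1.0)]
shifted = [(3.0, 0.0), (3.0, 1.0)]

print(e_distance(control, control))  # identical populations give 0.0
print(e_distance(shifted, control))  # a shifted population gives a positive value
```

In practice the inputs would be the embedding rows for one drug and cell line and for the matching control cells on the same plate.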
|
| 51 |
+
|
| 52 |
+
### MSE
|
| 53 |
+
- Followed the same steps as for E-distance, using mean squared error as the distance measure.
|
| 54 |
+
- Started from **pseudobulk samples** provided in the dataset.
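
The MSE comparison can be sketched as follows, assuming each pseudobulk sample is an expression vector keyed by plate, cell line, and drug, and that treated samples are compared with the `DMSO_TF` sample from the same plate and cell line. The dictionary layout, the function names, and the toy values are assumptions for illustration, not the dataset's actual schema.

```python
def mse(profile_a, profile_b):
    """Mean squared error between two equal-length expression vectors."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genes")
    return sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)) / len(profile_a)


def mse_vs_plate_control(pseudobulk, control_label="DMSO_TF"):
    """Score each treated sample against the control from the same plate and cell line.

    pseudobulk maps (plate, cell_line, drug) -> expression vector (hypothetical layout).
    """
    scores = {}
    for (plate, cell_line, drug), profile in pseudobulk.items():
        if drug == control_label:
            continue
        control = pseudobulk.get((plate, cell_line, control_label))
        if control is None:
            continue  # no plate-matched control available, so skip this sample
        scores[(plate, cell_line, drug)] = mse(profile, control)
    return scores


# Toy data with made-up plate, cell line, and drug labels.
toy = {
    ("plate1", "lineA", "DMSO_TF"): [1.0, 2.0, 3.0],
    ("plate1", "lineA", "drugX"): [1.0, 2.0, 5.0],
    ("plate2", "lineA", "drugX"): [9.0, 9.0, 9.0],  # plate2 has no control
}
scores = mse_vs_plate_control(toy)
print(scores)
```

Only the plate-matched pair is scored; the sample on the plate without a control is dropped rather than compared against a control from a different plate.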
|
| 55 |
+
|
| 56 |
+
### Augur
|
| 57 |
+
- A **classifier-based scRNA-seq method** that quantifies separability between control and perturbed groups.
|
| 58 |
+
- Scores are AUCs: 1 indicates perfect separability, and 0.5 indicates chance-level separability.
|
| 59 |
+
- Applied across all cell lines and drug perturbations.
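
To make the separability score concrete, the sketch below computes an AUC from classifier scores for perturbed and control cells. This is not Augur itself, which trains and cross-validates a classifier per group; it only illustrates how an AUC behaves once such scores exist. The function name and the toy scores are hypothetical.

```python
def separability_auc(perturbed_scores, control_scores):
    """AUC as the fraction of (perturbed, control) pairs ranked correctly.

    Ties count as half a correct pair, which matches the rank-based
    (Mann-Whitney) definition of the AUC.
    """
    correct = 0.0
    for p in perturbed_scores:
        for c in control_scores:
            if p > c:
                correct += 1.0
            elif p == c:
                correct += 0.5
    return correct / (len(perturbed_scores) * len(control_scores))


# Toy classifier outputs (hypothetical): higher means "looks perturbed".
print(separability_auc([0.9, 0.8], [0.1, 0.2]))  # 1.0, fully separable
print(separability_auc([0.5, 0.5], [0.5, 0.5]))  # 0.5, indistinguishable
```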
|
| 60 |
+
|
| 61 |
+
### CellCap
|
| 62 |
+
- A **generative model** for perturbation data.
|
| 63 |
+
- Models correspondence between basal state and measured perturbation.
|
| 64 |
+
- Learns interpretable response programs as weighted gene sets.
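
One way to read a response program expressed as a weighted gene set is sketched below: rank genes by absolute weight, and score an observed expression change as a weighted sum. This does not use CellCap's API or reproduce its model; the helper names, gene labels, and weights are all hypothetical placeholders.

```python
def top_genes(program_weights, n=3):
    """Return the n genes with the largest absolute weight in one program."""
    ranked = sorted(program_weights.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return [gene for gene, _ in ranked[:n]]


def program_score(expression_change, program_weights):
    """Weighted sum of per-gene expression changes; unmeasured genes count as 0."""
    return sum(
        weight * expression_change.get(gene, 0.0)
        for gene, weight in program_weights.items()
    )


# Placeholder program and expression change (made-up labels and values).
program = {"GENE_A": 0.8, "GENE_B": -0.5, "GENE_C": 0.1, "GENE_D": 0.0}
change = {"GENE_A": 1.0, "GENE_B": -2.0, "GENE_C": 0.5}

print(top_genes(program, n=2))
print(program_score(change, program))
```

Ranking by absolute weight keeps strongly down-weighted genes visible alongside up-weighted ones, which matters when a program describes repression as well as activation.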
|
| 65 |
+
|
| 66 |
+
---
|
| 67 |
+
|
| 68 |
+
## Results
|
| 69 |
+
|
| 70 |
+
- **E-distance** and **MSE** failed to detect context-specific drug effects across selected KRAS cell lines.
|
| 71 |
+
- **Augur** and **CellCap**:
|
| 72 |
+
  - Detected strong responses in **KRAS-G12C** lines.
|
| 73 |
+
  - Captured cell-specific gene expression programs linked to KRAS mutations.
|
| 74 |
+
|
| 75 |
+
---
|
| 76 |
+
|
| 77 |
+
## Discussion
|
| 78 |
+
|
| 79 |
+
The discovery of novel cancer therapies is limited by the lack of generalizable experimental and computational workflows. In a proof-of-concept analysis, we tested whether four computational methods could identify context-specific responses to KRAS inhibitors in the Tahoe-100M dataset.
|
| 80 |
+
|
| 81 |
+
- **Augur** and **CellCap** succeeded in detecting KRAS-inhibitor effects in KRAS-G12C cell lines.
|
| 82 |
+
- **E-distance** and **MSE** failed to differentiate responses across the selected KRAS cell lines.
|
| 83 |
|
| 84 |
+
We hypothesize that Augur and CellCap succeed because they can use **local, pathway-level expression** changes rather than relying on global transcriptomic changes.
|
| 85 |
|
| 86 |
+
Preliminary results highlight genes associated with the **Ras-Raf pathway**, suggesting a targeted effect by the drugs.
|
| 87 |
|
| 88 |
+
### Future Directions
|
| 89 |
+
We aim to:
|
| 90 |
+
- Scale our approach to all cell lines and drugs in Tahoe-100M.
|
| 91 |
+
- Identify potential **cell-type-specific drugs**.
|
| 92 |
+
- Propose **candidates for clinical development**.
|
| 93 |
|
| 94 |