giovp committed on
Commit
f53de42
·
1 Parent(s): 327881b
Files changed (1)
  1. README.md +17 -1
README.md CHANGED
@@ -23,7 +23,7 @@ SigSpace
23
  - Ishita Mangla - Lila Sciences - imangla@stanford.edu
24
  - Giovanni Palla - Chan Zuckerberg Initiative - gpalla@chanzuckerberg.com
25
  - Rohit Khurana - Stanford - rkhurana@stanford.edu
26
- - Siddhant Sanghi - UC Davis - ssanghi@ucdavis.edu
27
  - Kuan Pang - Stanford - kuanpang@stanford.edu
28
  - Yanay Rosen - Stanford - yanay@stanford.edu
29
  - Yasha Ektefaie - Harvard - yasha_ektefaie@g.harvard.edu
@@ -47,6 +47,22 @@ Specifically:
47
  - JUMP: We use the JUMP dataset, which captures morphological profiles of cells in response to chemical and genetic perturbations. High-content imaging and automated feature extraction are used to quantify cellular changes, enabling large-scale profiling of perturbation effects across diverse biological contexts.
48
  - UCE-CXG-EMBEDDING: natural perturbation search using AI virtual cell.
49
 
50
  ## Results
51
 
52
  We have developed a Gradio application that accesses these databases and performs complex queries, enhancing and grounding the reasoning in real biological measurements.
 
23
  - Ishita Mangla - Lila Sciences - imangla@stanford.edu
24
  - Giovanni Palla - Chan Zuckerberg Initiative - gpalla@chanzuckerberg.com
25
  - Rohit Khurana - Stanford - rkhurana@stanford.edu
26
+ - Sid Sanghi - UC Davis - ssanghi@ucdavis.edu
27
  - Kuan Pang - Stanford - kuanpang@stanford.edu
28
  - Yanay Rosen - Stanford - yanay@stanford.edu
29
  - Yasha Ektefaie - Harvard - yasha_ektefaie@g.harvard.edu
 
47
  - JUMP: We use the JUMP dataset, which captures morphological profiles of cells in response to chemical and genetic perturbations. High-content imaging and automated feature extraction are used to quantify cellular changes, enabling large-scale profiling of perturbation effects across diverse biological contexts.
48
  - UCE-CXG-EMBEDDING: natural perturbation search using AI virtual cell.
49
 
50
+ ## Data
51
+ The following datasets are used in our project:
52
+
53
+ - **drug_metadata_inchikey.csv**: Drug metadata from Tahoe-100M including InChIKey identifiers for chemical structure representation.
54
+ - **compound_genetic_perturbation_cosine_similarity_inchikey.csv**: Cosine similarity scores between compound and genetic perturbations in the JUMP dataset.
55
+ - **Tahoe_PRISM_cell_by_drug_ic50_matrix_named.csv**: IC50 values showing drug sensitivity across cell lines.
56
+ - **filtered_results.csv**: Filtered NCI60 LC50 data for drug response analysis.
57
+ - **cell_line_metadata.csv**: Comprehensive metadata for cell lines in the Tahoe dataset.
58
+ - **drug_metadata.csv**: Detailed information about drugs in the Tahoe dataset.
59
+ - **tahoe_vision_scores.h5ad**: Vision scores in AnnData format capturing cellular morphological changes.
60
+ - **Tahoe_PRISM_matched_cell_metadata_final.csv**: Cell metadata for PRISM-Tahoe matched cell lines.
61
+ - **Tahoe_PRISM_matched_drug_metadata_final.csv**: Drug metadata for PRISM-Tahoe matched compounds.
62
+ - **in_tahoe_search_result_df.csv**: Search results for perturbations within the Tahoe dataset embedded with UCE.
63
+ - **cxg_search_result_df.csv**: Cross-dataset search results using CXG embeddings with UCE.
64
+
65
+
66
  ## Results
67
 
68
  We have developed a Gradio application that accesses these databases and performs complex queries, enhancing and grounding the reasoning in real biological measurements.