Upload scelmo-celllines-gpt-3.5 - scELMo cell line embeddings generated using GPT-3.5-turbo
- README.md +49 -0
- config.json +10 -0
- gene_embeddings.pkl +3 -0
README.md
ADDED
@@ -0,0 +1,49 @@
# scELMo Cell Line Embeddings

This directory contains cell line embeddings generated using the scELMo methodology.

## Source

These embeddings are converted from the official scELMo repository:
**https://github.com/HelloWorldLTY/scELMo**

## Model Information

- **Model**: gpt-3.5-turbo
- **Embedding Dimension**: 1536
- **Type**: cellline
- **Aggregation Mode**: wa
- **API Model**: text-embedding-ada-002

## Files

- `gene_embeddings.pkl`: Gene embeddings dictionary in PerturbLab format
  - Format: `{'embeddings': {gene_name: embedding_array}, 'gene_list': [gene_names]}`
- `config.json`: Model configuration
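
The layout described for these two files can be checked before handing them to any model class. The following is a minimal sketch using only the standard library; `load_embedding_files` and `check_layout` are hypothetical helper names, not part of PerturbLab, and the only keys assumed are the ones listed above plus `embedding_dim` from `config.json`.

```python
import json
import pickle
from pathlib import Path


def load_embedding_files(model_dir):
    """Read gene_embeddings.pkl and config.json from a local directory."""
    model_dir = Path(model_dir)
    # Only unpickle files that come from a source you trust.
    with open(model_dir / "gene_embeddings.pkl", "rb") as fh:
        payload = pickle.load(fh)
    with open(model_dir / "config.json", encoding="utf-8") as fh:
        config = json.load(fh)
    return payload, config


def check_layout(payload, config):
    """Verify the documented dictionary layout; return (n_entries, dim)."""
    embeddings = payload["embeddings"]
    gene_list = payload["gene_list"]

    # Every listed name should have an embedding vector.
    missing = [name for name in gene_list if name not in embeddings]
    if missing:
        raise KeyError(f"{len(missing)} entries of gene_list have no embedding")

    # Every vector should have the dimension recorded in config.json.
    expected_dim = config["embedding_dim"]
    for name in gene_list:
        if len(embeddings[name]) != expected_dim:
            raise ValueError(f"{name!r} has dimension {len(embeddings[name])}, "
                             f"expected {expected_dim}")
    return len(gene_list), expected_dim
```

For this upload, `check_layout` would be expected to report a dimension of 1536 if the pickle agrees with `config.json`.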

## Usage

```python
from perturblab.model.scelmo import scELMoModel

# Load model
model = scELMoModel.from_pretrained('scelmo-celllines-gpt-3.5')

# Use embeddings
embeddings = model.predict_embeddings(adata, aggregation_mode='wa')
```
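
The README does not explain `aggregation_mode='wa'`. In scELMo, `wa` appears to stand for a weighted average, in which a cell's embedding is the expression-weighted mean of the embedding vectors of its expressed features. The sketch below illustrates that idea in plain Python under that assumption. `weighted_average_embedding` is a hypothetical helper, not the PerturbLab implementation, and the real `predict_embeddings` may filter or normalize differently.

```python
def weighted_average_embedding(expression, embeddings):
    """Expression-weighted mean of embedding vectors (illustrative only).

    expression: dict mapping a name to a non-negative expression value
    embeddings: dict mapping a name to a sequence of floats
    """
    # Keep only names that are expressed and have an embedding.
    shared = [name for name, value in expression.items()
              if value > 0 and name in embeddings]
    if not shared:
        raise ValueError("no expressed name has an embedding")

    total = sum(expression[name] for name in shared)
    dim = len(embeddings[shared[0]])
    result = [0.0] * dim
    for name in shared:
        weight = expression[name] / total  # weights sum to 1
        for i, value in enumerate(embeddings[name]):
            result[i] += weight * value
    return result
```

A real pipeline would normally express the same computation as a matrix product over the whole expression matrix rather than a per-cell loop.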

## Citation

If you use these embeddings, please cite the original scELMo paper:

```bibtex
@article{liu2023scelmo,
  title={scELMo: Embeddings from Language Models are Good Learners for Single-cell Data Analysis},
  author={Liu, Tianyu and Chen, Tianqi and Zheng, Wangjie and Luo, Xiao and Zhao, Hongyu},
  journal={Cell Patterns (in press)},
  pages={2023--12},
  year={2025},
  publisher={Cell Press}
}
```
config.json
ADDED
@@ -0,0 +1,10 @@
{
  "model_series": "scelmo",
  "model_name": "gpt-3.5-turbo",
  "model_type": "embedding_extractor",
  "embedding_dim": 1536,
  "aggregation_mode": "wa",
  "api_model": "text-embedding-ada-002",
  "source_type": "cellline",
  "description": "Cell line embeddings generated using GPT-3.5-turbo"
}
gene_embeddings.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cfc4e5c2b41d45ce920ae0f72e5412aa86a17bd0a70a2f43db637272ccdb844d
size 24599857
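
The three lines above are a Git LFS pointer rather than the pickle itself: they record the SHA-256 digest and byte size of the real `gene_embeddings.pkl`. A downloaded copy can be checked against the pointer with a short sketch like the one below. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names introduced for illustration.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    hasher = hashlib.sha256()
    size = 0
    # Read in chunks so large files are not loaded into memory at once.
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]
```

If the check fails on a file of only a few hundred bytes, the most likely cause is that the pointer file was downloaded instead of the LFS object.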