---
library_name: perturblab
tags:
- biology
- genomics
- scfoundation
- foundation-model
license: apache-2.0
base_model: biomap-research/scFoundation
---
# scfoundation-cell

## Model Description

This is the **cell embedding** model from scFoundation. It generates cell-level embeddings from single-cell RNA-seq data.

The model weights come from the [biomap-research/scFoundation](https://github.com/biomap-research/scFoundation) repository and are re-uploaded here for ease of use with the `perturblab` library.

## Model Details

- **Model Type**: Cell embedding model
- **Architecture**: xTrimoGene, a masked autoencoder (MAE) design built from Transformer and Performer modules
- **Parameters**: 100M
- **Training Data**: Over 50 million human single-cell transcriptomic profiles
- **Input**: Single-cell or bulk RNA-seq expression data, aligned to a fixed list of 19,264 genes
- **Output**: Cell-level embeddings
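
The fixed 19,264-gene input means an expression matrix has to be brought into that gene order before it reaches the model. The sketch below shows one way to do this with NumPy. It is only an illustration: `align_to_gene_list` is a hypothetical helper, the four-gene target list stands in for the real gene list, zero-filling of missing genes is an assumption, and `perturblab` may already perform this step internally.

```python
import numpy as np


def align_to_gene_list(expr, genes, target_genes):
    """Reorder the columns of `expr` to follow `target_genes`.

    Genes missing from the input are filled with zeros (an assumption),
    and input genes that are not in `target_genes` are dropped.
    """
    column_of = {gene: j for j, gene in enumerate(genes)}
    aligned = np.zeros((expr.shape[0], len(target_genes)), dtype=expr.dtype)
    for k, gene in enumerate(target_genes):
        j = column_of.get(gene)
        if j is not None:
            aligned[:, k] = expr[:, j]
    return aligned


# Toy stand-in for the fixed gene list (the real list has 19,264 entries).
target_genes = ["A1BG", "A1CF", "A2M", "A2ML1"]

# Two cells measured on three genes, one of which is not in the target list.
genes = ["A2M", "TP53", "A1BG"]
expr = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

aligned = align_to_gene_list(expr, genes, target_genes)
print(aligned.shape)  # one column per target gene
```

The result always has one column per target gene, so inputs with different gene panels end up in the same layout.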

## Source

- **Original Repository**: [biomap-research/scFoundation](https://github.com/biomap-research/scFoundation)
- **Paper**: [Large-scale foundation model on single-cell transcriptomics](https://www.nature.com/articles/s41592-024-02305-7) (_Nature Methods_, 2024)

## Usage

```python
import anndata as ad
from perturblab.model.scfoundation import scFoundationModel

# Load the expression data (placeholder path; substitute your own AnnData file)
adata = ad.read_h5ad('your_data.h5ad')

# Load the pretrained cell embedding model
model = scFoundationModel.from_pretrained('scfoundation-cell', device='cuda')

# Generate cell-level embeddings
cell_embeddings = model.predict_embedding(
    adata,
    output_type='cell',
    pool_type='all',
)
```
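
Once cell embeddings are available, a common next step is to compare cells by cosine similarity. The sketch below assumes the embeddings form a 2-D array of shape `(n_cells, embedding_dim)`, which this card does not confirm as the return type of `predict_embedding`. A small random array stands in for real model output, and `cosine_similarity_matrix` is a hypothetical helper.

```python
import numpy as np


def cosine_similarity_matrix(emb):
    """Return the pairwise cosine similarity between the rows of `emb`."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)  # guard against zero-length rows
    return unit @ unit.T


# Stand-in for model output: 5 cells with 16-dimensional embeddings (assumed shape).
rng = np.random.default_rng(0)
cell_embeddings = rng.normal(size=(5, 16))

sim = cosine_similarity_matrix(cell_embeddings)

# Find each cell's most similar other cell by masking out self-similarity.
masked = sim.copy()
np.fill_diagonal(masked, -np.inf)
nearest = masked.argmax(axis=1)
print(nearest)
```

With real embeddings, the same similarity matrix can feed clustering or nearest-neighbour annotation.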

## Note

Intended for internal use with the PerturbLab framework.