Model Card for databio/r2v-buenrostro2018-hg19
Model Details
This is a single-cell Region2Vec (r2v) model designed to be used with scEmbed and Region2Vec. It was trained on the Buenrostro2018 dataset. Use this model to generate embeddings of single cells from scATAC-seq experiments. It produces a 100-dimensional embedding for each cell.
Model Sources
- Repository: https://github.com/databio/geniml
- Paper: https://www.biorxiv.org/content/10.1101/2023.08.01.551452v1
Uses
This model is intended for producing low-dimensional embeddings of single cells. These embeddings can be used for downstream clustering or classification tasks.
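As a rough illustration of the downstream clustering step, the sketch below runs a small k-means over 100-dimensional vectors using only the Python standard library. It makes several assumptions: the vectors are synthetic stand-ins rather than real model output, the helper names `farthest_first_init` and `kmeans` are hypothetical and not part of geniml, and farthest-first initialization is used only to keep the example deterministic. In practice a library such as scikit-learn or scanpy would likely be a better fit.

```python
import math
import random


def farthest_first_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the
    point farthest from every centroid chosen so far (deterministic)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, n_iter=10):
    """Plain k-means over a list of equal-length float vectors."""
    centroids = farthest_first_init(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Synthetic stand-ins for cell embeddings (assumption: NOT real model output).
# Two well-separated groups of 20 "cells", each a 100-dimensional vector.
rng = random.Random(42)
dim = 100
group_a = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(20)]
group_b = [[rng.gauss(5.0, 1.0) for _ in range(dim)] for _ in range(20)]
embeddings = group_a + group_b

labels, centroids = kmeans(embeddings, k=2)
print(labels)
```

With real data, the `embeddings` list would be replaced by the per-cell vectors returned by the model, and the number of clusters `k` would need to be chosen for the dataset at hand.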
Bias, Risks, and Limitations
The Buenrostro2018 dataset comprises 2034 human hematopoietic stem cells, with data aligned to hg19. The model should therefore only be used with other data aligned to hg19.
How to Get Started with the Model
You can use the geniml Python library to download this model and start encoding your single-cell data:
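import scanpy as sc
from geniml.scembed import ScEmbed
# Load a scATAC-seq AnnData object (data should be aligned to hg19)
adata = sc.read_h5ad("path/to/adata.h5ad")
# Download and load the pretrained model
model = ScEmbed("databio/r2v-buenrostro2018-hg19")
# Encode each cell as a 100-dimensional embedding
embeddings = model.encode(adata)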
Training Details
Training Data
The data for this model comes from Buenrostro2018: https://www.sciencedirect.com/science/article/pii/S009286741830446X