canergen

Update README.md

ada01ff verified about 1 year ago

	---
	title: README
	emoji: 🐨
	colorFrom: purple
	colorTo: blue
	sdk: static
	pinned: true
	license: bsd-3-clause
	short_description: Ensemble of experts for cell-type annotation
	thumbnail: >-
	https://cdn-uploads.huggingface.co/production/uploads/63d7697f2e397d9f8e30e677/tvABibiml6K2sccfXLybG.png
	---
	# popV

	Welcome to the popV framework. popV provides state-of-the-art performance in cell-type label transfer using an ensemble-of-experts approach. We provide pre-trained
	models here for transferring cell-type labels to your own query dataset. Defining cell types is a tedious process, and reference data can accelerate it significantly.
	By combining several label-transfer tools, popV provides a well-calibrated certainty score that makes it possible to detect cell types for which automatic annotation is
	highly uncertain. We recommend manually checking transferred cell-type labels by plotting marker or differentially expressed genes before trusting them.
	This is an open science initiative. Please contribute your own models so that the single-cell community can leverage your reference datasets: ask in our [GitHub
	repository](https://github.com/YosefLab/popV) to have your dataset added.

	---

	## Model Overview
	popV trains up to 9 different algorithms for automatic label transfer and computes a consensus score across them. We also provide an automatic report. To learn how to apply popV to your
	own dataset, please refer to our [tutorial]().

	### Algorithms

	Currently implemented algorithms are:

	- K-nearest neighbor classification after dataset integration with [BBKNN](https://github.com/Teichlab/bbknn)
	- K-nearest neighbor classification after dataset integration with [Scanorama](https://github.com/brianhie/scanorama)
	- K-nearest neighbor classification after dataset integration with [scVI](https://github.com/scverse/scvi-tools)
	- K-nearest neighbor classification after dataset integration with [Harmony](https://github.com/lilab-bcb/harmony-pytorch)
	- Random forest classification
	- Support vector machine classification
	- [OnClass](https://github.com/wangshenguiuc/OnClass) cell type classification
	- [scANVI](https://github.com/scverse/scvi-tools) label transfer
	- [CellTypist](https://www.celltypist.org) cell type classification
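
To illustrate how per-method predictions might be combined into a consensus label and score, here is a minimal sketch. It assumes a plain majority vote in which the score is the number of methods agreeing with the winning label; popV's actual voting scheme may differ. The function name, method keys, and input layout are hypothetical and are not part of the popV API.

```python
from collections import Counter


def consensus_vote(predictions):
    """Return (label, score) for a single cell.

    predictions: dict mapping a method name to its predicted label
    (hypothetical layout, chosen for illustration only).
    score: number of methods that agree with the majority label.
    """
    counts = Counter(predictions.values())
    # most_common(1) returns the label with the highest count;
    # ties resolve to the label encountered first.
    label, score = counts.most_common(1)[0]
    return label, score


# Example: five methods voting on one cell (made-up labels).
cell = {
    "knn_bbknn": "T cell",
    "knn_scvi": "T cell",
    "random_forest": "T cell",
    "svm": "NK cell",
    "celltypist": "T cell",
}
label, score = consensus_vote(cell)
print(f"{label}: {score} of {len(cell)} methods agree")
```

Cells with a low score relative to the number of methods are natural candidates for manual inspection with marker genes.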

	---

	## Key Applications
	The purpose of these models is to perform cell-type label transfer.
	We provide models [with cuML support](collection) for large-scale reference mapping and [without cuML support](collection) for use when no GPU is available. Without a GPU, popV scales
	well to 100k cells. popV has three prediction modes of differing complexity:

	- `retrain` trains all classifiers from scratch. For 50k cells, this takes up to an hour of computing time on a GPU.
	- `inference` uses pretrained classifiers to annotate both query and reference cells and constructs a joint embedding using all of the integration methods listed above. For 50k cells, this takes up to half an hour of computing time on a GPU in our experience.
	- `fast` uses only the methods with pretrained classifiers and annotates only query cells. For 50k cells, this takes 5 minutes without a GPU (excluding the UMAP embedding).
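
The trade-off between the three modes can be summarized as a small decision helper. This is only a sketch of the reasoning above: the function and its parameters are hypothetical and do not correspond to the popV API; only the three mode names come from this README.

```python
def choose_prediction_mode(retrain_needed, need_joint_embedding):
    """Pick one of the three mode names described above.

    retrain_needed: True if the classifiers must be trained from scratch.
    need_joint_embedding: True if reference cells should also be annotated
        and a joint query/reference embedding is wanted.
    Both parameters are illustrative assumptions.
    """
    if retrain_needed:
        return "retrain"    # slowest: trains every classifier from scratch
    if need_joint_embedding:
        return "inference"  # pretrained classifiers plus joint embedding
    return "fast"           # pretrained classifiers, query cells only


print(choose_prediction_mode(retrain_needed=False, need_joint_embedding=False))
```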

	---

	## Publications
	- [Original popV paper](https://www.nature.com/articles/s41588-024-01993-3): Published in Nature Genetics, this paper introduces popV and benchmarks it.

	## Contact
	- GitHub: [https://github.com/YosefLab/popV](https://github.com/YosefLab/popV)
	- User questions: [Discourse](https://discourse.scverse.org)

