	---
	library_name: pytorch
	framework: pytorch
	tags:
	- pytorch
	- pytorch-lightning
	- bioinformatics
	- rna-binding-proteins
	- explainability
	- alternative-splicing
	- deep-learning
	license: mit
	---

	# DeepRBP Predictor (pretrained)

	This repository provides the pretrained predictor of DeepRBP, a deep learning framework designed to infer RNA-binding protein (RBP)–transcript and RBP–gene regulatory relationships from expression data.

	DeepRBP was introduced in the following preprint:

	> DeepRBP: A deep neural network for inferring splicing regulation
	> https://doi.org/10.1101/2024.04.11.589004

	The model is intended to be used directly for inference and explainability, without retraining.

	---

	## Model overview

	DeepRBP is composed of two conceptual stages:

	1. Prediction stage
	   A neural network predicts transcript abundances from:
	   - RBP expression
	   - Gene expression

	2. Explainability stage
	   Feature attribution methods (e.g. DeepLIFT) are applied to the trained predictor to compute:
	   - Transcript × RBP (TxRBP) scores
	   - Gene × RBP (GxRBP) scores
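To give a feel for what a Transcript × RBP score matrix is, the toy sketch below uses a purely linear predictor, for which a DeepLIFT-style attribution reduces to weight times the difference between the input and a reference input. This is not the DeepRBP network or its pipeline: the weights, inputs, reference, and the function names `predict` and `linear_attributions` are all made up for illustration, and the toy uses RBP inputs only.

```python
# Toy illustration only: a linear "predictor" with 3 transcripts and 2 RBP inputs.
# All numbers are invented; the real DeepRBP model is a deep network.
weights = [
    [0.5, -1.0],   # transcript 0
    [2.0, 0.0],    # transcript 1
    [-0.5, 1.5],   # transcript 2
]


def predict(weights, x):
    """Linear prediction: one output per transcript."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def linear_attributions(weights, x, reference):
    """Return a transcripts x RBPs matrix with entries w[t][r] * (x[r] - reference[r])."""
    return [
        [w * (xi - ri) for w, xi, ri in zip(row, x, reference)]
        for row in weights
    ]


x = [3.0, 1.0]          # hypothetical RBP expression for one sample
reference = [1.0, 1.0]  # hypothetical reference (baseline) expression

scores = linear_attributions(weights, x, reference)  # TxRBP-shaped matrix
delta = [p - q for p, q in zip(predict(weights, x), predict(weights, reference))]

# Each row of attributions sums to the change in that transcript's prediction.
for row, d in zip(scores, delta):
    assert abs(sum(row) - d) < 1e-9
```

Each row corresponds to one transcript and each column to one RBP, which is the shape the TxRBP scores take. How DeepRBP derives GxRBP scores from them is described in the main repository and the preprint, not here.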

	This repository contains only the pretrained predictor and its required preprocessing artifacts.

	---

	## Files in this repository

	⚠️ All files are required for correct inference and explainability.

	| File | Description |
	|------|-------------|
	| `model.ckpt` | PyTorch Lightning checkpoint of the pretrained DeepRBP predictor |
	| `scaler.joblib` | Fitted input scaler used during model training |
	| `sigma.npy` | Scaling parameter required to reconstruct transcript abundance values |
	The scaler and sigma are part of the trained model state and must be used together with the checkpoint.
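Because all three files must be used together, it can help to check that a local download is complete before handing it to the DeepRBP code. The following is a minimal sketch using only the Python standard library; the helper name `missing_artifacts` and the example directory are hypothetical and not part of DeepRBP.

```python
from pathlib import Path

# File names as listed in the table above.
REQUIRED_FILES = ("model.ckpt", "scaler.joblib", "sigma.npy")


def missing_artifacts(directory):
    """Return the required file names that are absent from `directory`."""
    root = Path(directory)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

For example, `missing_artifacts("./deeprbp_pretrained")` returns an empty list when the checkpoint, scaler, and sigma are all present, and otherwise names the files that still need to be downloaded.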

	---

	## Intended use

	This pretrained model is intended for:

	- Computing transcript abundance predictions
	- Running explainability analyses (e.g. DeepLIFT-based attribution)
	- Identifying candidate RBP–transcript and RBP–gene regulatory relationships
	- Downstream biological interpretation and hypothesis generation

	Typical applications include:
	- Cancer transcriptomics (e.g. TCGA)
	- Perturbation studies (e.g. RBP knockdowns)
	- Comparative regulatory analyses across conditions

	---

	## Usage

	This repository does not provide a standalone inference script.

	Please refer to the main DeepRBP code repository for:
	- Data preprocessing
	- Model loading
	- Running prediction and explainability pipelines

	👉 Main repository:
	https://github.com/ML4BM-Lab/DeepRBP

	The main repository contains:
	- End-to-end examples
	- Command-line interfaces
	- Explainability workflows
	- Validation pipelines

	---

	## Reproducibility notes

	- The model was trained on public datasets (TCGA, GTEx and related resources).
	- The provided scaler and sigma ensure:
	  - Consistent input normalization
	  - Comparable predictions and explainability scores across users
	- Using a different scaler or recomputing normalization will break comparability.
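The effect of refitting the normalization can be seen in a small sketch. It assumes, purely for illustration, a z-score scaler; the type of scaler stored in `scaler.joblib` is not stated here, and the expression values and function names below are made up.

```python
from statistics import mean, pstdev


def fit_standard_scaler(values):
    """Return (mean, std) for a z-score scaler fitted on `values`."""
    return mean(values), pstdev(values)


def transform(value, params):
    """Apply a fitted z-score scaler to a single value."""
    mu, sd = params
    return (value - mu) / sd


training_expression = [2.0, 4.0, 6.0, 8.0]   # stand-in for the training cohort
user_expression = [6.0, 8.0, 10.0, 12.0]     # stand-in for a user's cohort

shipped = fit_standard_scaler(training_expression)  # plays the role of the provided scaler
refit = fit_standard_scaler(user_expression)        # normalization recomputed by the user

same_sample = 8.0
z_shipped = transform(same_sample, shipped)  # positive: above the training mean
z_refit = transform(same_sample, refit)      # negative: below the user-cohort mean
```

The same raw value reaches the model as two different inputs depending on which scaler is applied, so predictions and attribution scores computed with a refitted scaler are no longer on the scale the model was trained with.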

	---

	## Limitations

	- The model was trained on bulk RNA-seq data and may not generalize to:
	  - Single-cell RNA-seq
	  - Extremely low-coverage datasets
	- Predictions represent associations, not direct causal regulation.
	- Experimental validation is required before drawing biological conclusions.

	---

	## License

	This model is released under the MIT License.

	You are free to use, modify and redistribute it, provided that the license and copyright notice are preserved.

	---

	## Citation

	If you use DeepRBP in your work, please cite:

	DeepRBP: A deep neural network for inferring splicing regulation
	bioRxiv (2024)
	https://doi.org/10.1101/2024.04.11.589004