metadata

library_name: pytorch
framework: pytorch
tags:
  - pytorch
  - pytorch-lightning
  - bioinformatics
  - rna-binding-proteins
  - explainability
  - alternative-splicing
  - deep-learning
license: mit

DeepRBP Predictor (pretrained)

This repository provides the pretrained predictor model of DeepRBP, a deep learning framework designed to infer RNA-binding protein (RBP)–transcript and RBP–gene regulatory relationships from expression data.

DeepRBP was introduced in the following preprint:

DeepRBP: A deep neural network for inferring splicing regulation
https://doi.org/10.1101/2024.04.11.589004

The model is intended to be used directly for inference and explainability, without retraining.

Model overview

DeepRBP is composed of two conceptual stages:

1. Prediction stage
   A neural network predicts transcript abundances from:
   - RBP expression
   - Gene expression
2. Explainability stage
   Feature attribution methods (e.g. DeepLIFT) are applied to the trained predictor to compute:
   - Transcript × RBP (TxRBP) scores
   - Gene × RBP (GxRBP) scores
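
To make the shape of the two score tables concrete, here is a minimal sketch in plain Python. This README does not say how GxRBP scores are derived from TxRBP scores, so the per-gene sum used below is only an assumption for illustration. The transcript, gene and RBP names, the score values and the helper `aggregate_to_genes` are all hypothetical. Refer to the main repository for the actual procedure.

```python
from collections import defaultdict

# Hypothetical TxRBP scores: {transcript: {RBP: score}}. Values are made up.
txrbp = {
    "TX1": {"RBP_A": 0.25, "RBP_B": -0.5},
    "TX2": {"RBP_A": 0.5, "RBP_B": 0.25},
    "TX3": {"RBP_A": -1.0, "RBP_B": 0.0},
}

# Hypothetical transcript-to-gene annotation.
transcript_to_gene = {"TX1": "GENE1", "TX2": "GENE1", "TX3": "GENE2"}


def aggregate_to_genes(tx_scores, tx_to_gene):
    """Sum transcript-level scores per gene.

    ASSUMPTION: summation is an illustrative aggregation rule, not
    necessarily the one used by DeepRBP.
    """
    gene_scores = defaultdict(lambda: defaultdict(float))
    for transcript, rbp_scores in tx_scores.items():
        gene = tx_to_gene[transcript]
        for rbp, score in rbp_scores.items():
            gene_scores[gene][rbp] += score
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {gene: dict(scores) for gene, scores in gene_scores.items()}


gxrbp = aggregate_to_genes(txrbp, transcript_to_gene)
print(gxrbp)
```

Both tables share the RBP axis. Only the row axis changes, from transcripts to genes.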

This repository contains only the pretrained predictor and its required preprocessing artifacts.

Files in this repository

⚠️ All files are required for correct inference and explainability.

| File | Description |
| --- | --- |
| `model.ckpt` | PyTorch Lightning checkpoint of the pretrained DeepRBP predictor |
| `scaler.joblib` | Fitted input scaler used during model training |
| `sigma.npy` | Scaling parameter required to reconstruct transcript abundance values |

The scaler and sigma are part of the trained model state and must be used together with the checkpoint.
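
Because all three files must be used together, a quick presence check before running any pipeline can catch an incomplete download. The sketch below uses only the Python standard library. The helper `missing_artifacts` and the folder name `DeepRBP_pretrained` are hypothetical and not part of the DeepRBP code base. Loading the files themselves is handled by the main repository.

```python
from pathlib import Path

# File names as listed in the table above.
REQUIRED_FILES = ("model.ckpt", "scaler.joblib", "sigma.npy")


def missing_artifacts(model_dir):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Example: check a hypothetical local download folder.
# An empty list means every required file was found.
print(missing_artifacts("DeepRBP_pretrained"))
```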

Intended use

This pretrained model is intended for:

- Computing transcript abundance predictions
- Running explainability analyses (e.g. DeepLIFT-based attribution)
- Identifying candidate RBP–transcript and RBP–gene regulatory relationships
- Downstream biological interpretation and hypothesis generation

Typical applications include:

- Cancer transcriptomics (e.g. TCGA)
- Perturbation studies (e.g. RBP knockdowns)
- Comparative regulatory analyses across conditions

Usage

This repository does not provide a standalone inference script.

Please refer to the main DeepRBP code repository for:

- Data preprocessing
- Model loading
- Running prediction and explainability pipelines

👉 Main repository:
https://github.com/ML4BM-Lab/DeepRBP

The main repository contains:

- End-to-end examples
- Command-line interfaces
- Explainability workflows
- Validation pipelines

Reproducibility notes

- The model was trained on public datasets (TCGA, GTEx and related resources).
- The provided scaler and sigma ensure:
  - Consistent input normalization
  - Comparable predictions and explainability scores across users
- Using a different scaler or recomputing normalization will break comparability.
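
The comparability point can be illustrated with a toy example. The sketch below uses a plain z-score as a stand-in for the scaler. The actual transform stored in `scaler.joblib` is not described in this README, and the cohort values and helper names are made up for illustration.

```python
from statistics import mean, pstdev


def fit_zscore(values):
    """Return (mean, std) of values; a stand-in for fitting a scaler."""
    return mean(values), pstdev(values)


def apply_zscore(value, params):
    """Scale one value with previously fitted (mean, std) parameters."""
    mu, sd = params
    return (value - mu) / sd


# Made-up expression values, for illustration only.
training_cohort = [2.0, 4.0, 6.0, 8.0]
user_cohort = [6.0, 8.0, 10.0, 12.0]

# Plays the role of the provided, already fitted scaler.
shared_params = fit_zscore(training_cohort)
# A scaler recomputed on the user's own data.
refit_params = fit_zscore(user_cohort)

# The same raw value maps to different model inputs depending on the scaler.
raw_value = 8.0
with_shared = apply_zscore(raw_value, shared_params)
with_refit = apply_zscore(raw_value, refit_params)
print(with_shared, with_refit)
```

With the shared parameters the value sits above the reference mean, while with the refit parameters it falls below it. The model would therefore receive a different input for the same measurement.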

Limitations

- The model was trained on bulk RNA-seq data and may not generalize to:
  - Single-cell RNA-seq
  - Extremely low-coverage datasets
- Predictions represent associations, not direct causal regulation.
- Experimental validation is required before drawing biological conclusions.

License

This model is released under the MIT License.

You are free to use, modify and redistribute it, provided that the license and copyright notice are preserved.

Citation

If you use DeepRBP in your work, please cite:

DeepRBP: A deep neural network for inferring splicing regulation. bioRxiv (2024). https://doi.org/10.1101/2024.04.11.589004