+---
+library_name: pytorch
+framework: pytorch
+tags:
+  - pytorch
+  - pytorch-lightning
+  - bioinformatics
+  - rna-binding-proteins
+  - explainability
+  - alternative-splicing
+  - deep-learning
+license: mit
+---
+# DeepRBP Predictor (pretrained)
+This repository provides the **pretrained predictor model** of DeepRBP, a deep learning framework designed to infer **RNA-binding protein (RBP)–transcript and RBP–gene regulatory relationships** from expression data.
+DeepRBP was introduced in the following preprint:
+> **DeepRBP: A deep neural network for inferring splicing regulation**
+> https://doi.org/10.1101/2024.04.11.589004
+The model is intended to be used **directly for inference and explainability**, without retraining.
+---
+## Model overview
+DeepRBP is composed of two conceptual stages:
+1. **Prediction stage**
+   A neural network predicts transcript abundances from:
+   - RBP expression
+   - Gene expression
+2. **Explainability stage**
+   Feature attribution methods (e.g. DeepLIFT) are applied to the trained predictor to compute:
+   - Transcript × RBP (TxRBP) scores
+   - Gene × RBP (GxRBP) scores
+This repository contains **only the pretrained predictor and its required preprocessing artifacts**.
+---
+## Files in this repository
+⚠️ **All files are required for correct inference and explainability.**
+| File | Description |
+|-----|-------------|
+| `model.ckpt` | PyTorch Lightning checkpoint of the pretrained DeepRBP predictor |
+| `scaler.joblib` | Fitted input scaler used during model training |
+| `sigma.npy` | Scaling parameter required to reconstruct transcript abundance values |
+The scaler and sigma are **part of the trained model state** and must be used together with the checkpoint.
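Because all three files must travel together, it can help to check that a download is complete before running anything. The following is a minimal sketch, not part of the DeepRBP codebase: the helper names `missing_artifacts` and `load_preprocessing_artifacts` are hypothetical, and it assumes `scaler.joblib` can be read with `joblib.load` and `sigma.npy` with `numpy.load`. The checkpoint is deliberately not loaded here, since the model class lives in the main repository.

```python
from pathlib import Path

# File names as listed in the table above.
REQUIRED_FILES = ("model.ckpt", "scaler.joblib", "sigma.npy")


def missing_artifacts(model_dir):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def load_preprocessing_artifacts(model_dir):
    """Load the scaler and sigma after confirming all three files exist.

    Hypothetical helper: the checkpoint itself is not loaded here because
    the model class is defined in the main DeepRBP repository.
    """
    missing = missing_artifacts(model_dir)
    if missing:
        raise FileNotFoundError(f"Missing required files: {', '.join(missing)}")

    # Imported lazily so the presence check works without these packages.
    import joblib
    import numpy as np

    root = Path(model_dir)
    scaler = joblib.load(root / "scaler.joblib")  # assumed joblib-serialized scaler
    sigma = np.load(root / "sigma.npy")           # assumed standard .npy array
    return scaler, sigma
```

Calling `missing_artifacts("path/to/download")` returns an empty list when the directory is complete, so it can serve as a quick guard at the start of a script.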
+---
+## Intended use
+This pretrained model is intended for:
+- Computing transcript abundance predictions
+- Running explainability analyses (e.g. DeepLIFT-based attribution)
+- Identifying candidate RBP–transcript and RBP–gene regulatory relationships
+- Downstream biological interpretation and hypothesis generation
+Typical applications include:
+- Cancer transcriptomics (e.g. TCGA)
+- Perturbation studies (e.g. RBP knockdowns)
+- Comparative regulatory analyses across conditions
+---
+## Usage
+This repository **does not provide a standalone inference script**.
+Please refer to the **main DeepRBP code repository** for:
+- Data preprocessing
+- Model loading
+- Running prediction and explainability pipelines
+👉 **Main repository:**
+https://github.com/ML4BM-Lab/DeepRBP
+The main repository contains:
+- End-to-end examples
+- Command-line interfaces
+- Explainability workflows
+- Validation pipelines
+---
+## Reproducibility notes
+- The model was trained on public datasets (TCGA, GTEx and related resources).
+- The provided scaler and sigma ensure:
+  - Consistent input normalization
+  - Comparable predictions and explainability scores across users
+- Using a different scaler or recomputing normalization **will break comparability**.
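The comparability point can be illustrated with a toy example. This sketch assumes, purely for illustration, a z-score style scaler and made-up expression values; the scaler actually stored in `scaler.joblib` may use a different transformation. It shows that refitting normalization on a new cohort erases a shift that the training-time scaler would preserve.

```python
from statistics import mean, pstdev


def fit_standardizer(values):
    """Return (mean, population std) for a list of values."""
    return mean(values), pstdev(values)


def apply_standardizer(values, params):
    """Standardize values with previously fitted (mean, std) parameters."""
    mu, sd = params
    return [(v - mu) / sd for v in values]


# Made-up expression values for one feature (illustrative only).
training_expr = [2.0, 4.0, 6.0, 8.0]
new_cohort = [6.0, 8.0, 10.0, 12.0]  # same spread, shifted upward

# Reusing the training-time parameters keeps the shift visible to the model.
train_params = fit_standardizer(training_expr)
with_training_scaler = apply_standardizer(new_cohort, train_params)

# Refitting on the new cohort re-centres it at zero and hides the shift.
refit = apply_standardizer(new_cohort, fit_standardizer(new_cohort))

print("training scaler:", with_training_scaler)
print("refit scaler:   ", refit)
```

The two outputs differ for identical input values, which is why predictions and attribution scores computed after refitting would no longer be on the same footing as those obtained with the provided scaler.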
+---
+## Limitations
+- The model was trained on bulk RNA-seq data and may not generalize to:
+  - Single-cell RNA-seq
+  - Extremely low-coverage datasets
+- Predictions represent **associations**, not direct causal regulation.
+- Experimental validation is required before drawing biological conclusions.
+---
+## License
+This model is released under the **MIT License**.
+You are free to use, modify and redistribute it, provided that the license and copyright notice are preserved.
+---
+## Citation
+If you use DeepRBP in your work, please cite:
+DeepRBP: A deep neural network for inferring splicing regulation
+bioRxiv (2024)
+https://doi.org/10.1101/2024.04.11.589004