---
library_name: pytorch
framework: pytorch
tags:
- pytorch
- pytorch-lightning
- bioinformatics
- rna-binding-proteins
- explainability
- alternative-splicing
- deep-learning
license: mit
---

# DeepRBP Predictor (pretrained)

This repository provides a **pretrained DeepRBP predictor model**, a deep learning framework designed to infer **RNA-binding protein (RBP)–transcript and RBP–gene regulatory relationships** from expression data.

DeepRBP was introduced in the following preprint:

> **DeepRBP: A deep neural network for inferring splicing regulation**
>
> https://doi.org/10.1101/2024.04.11.589004

The model is intended to be used **directly for inference and explainability**, without retraining.

---

## Model overview

DeepRBP is composed of two conceptual stages:

1. **Prediction stage**

   A neural network predicts transcript abundances from:
   - RBP expression
   - Gene expression

2. **Explainability stage**

   Feature attribution methods (e.g. DeepLIFT) are applied on the trained predictor to compute:
   - Transcript × RBP (TxRBP) scores
   - Gene × RBP (GxRBP) scores

This repository contains **only the pretrained predictor and its required preprocessing artifacts**.

---

## Files in this repository

⚠️ **All files are required for correct inference and explainability.**

| File | Description |
|-----|-------------|
| `model.ckpt` | PyTorch Lightning checkpoint of the pretrained DeepRBP predictor |
| `scaler.joblib` | Fitted input scaler used during model training |
| `sigma.npy` | Scaling parameter required to reconstruct transcript abundance values |

The scaler and sigma are **part of the trained model state** and must be used together with the checkpoint.

---

## Intended use

This pretrained model is intended for:

- Computing transcript abundance predictions
- Running explainability analyses (e.g. DeepLIFT-based attribution)
- Identifying candidate RBP–transcript and RBP–gene regulatory relationships
- Downstream biological interpretation and hypothesis generation

Typical applications include:

- Cancer transcriptomics (e.g. TCGA)
- Perturbation studies (e.g. RBP knockdowns)
- Comparative regulatory analyses across conditions

---

## Usage

This repository **does not provide a standalone inference script**.

Please refer to the **main DeepRBP code repository** for:

- Data preprocessing
- Model loading
- Running prediction and explainability pipelines

👉 **Main repository:** https://github.com/ML4BM-Lab/DeepRBP

The main repository contains:

- End-to-end examples
- Command-line interfaces
- Explainability workflows
- Validation pipelines

---

## Reproducibility notes

- The model was trained on public datasets (TCGA, GTEx and related resources).
- The provided scaler and sigma ensure:
  - Consistent input normalization
  - Comparable predictions and explainability scores across users
- Using a different scaler or recomputing normalization **will break comparability**.

---

## Limitations

- The model was trained on bulk RNA-seq data and may not generalize to:
  - Single-cell RNA-seq
  - Extremely low-coverage datasets
- Predictions represent **associations**, not direct causal regulation.
- Experimental validation is required before biological conclusions.

---

## License

This model is released under the **MIT License**.

You are free to use, modify and redistribute it, provided that the license and copyright notice are preserved.

---

## Citation

If you use DeepRBP in your work, please cite:

DeepRBP: A deep neural network for inferring splicing regulation.
bioRxiv (2024).
https://doi.org/10.1101/2024.04.11.589004
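
---

## Appendix: checking the downloaded files (sketch)

Because all three files listed under "Files in this repository" must be used together, it can help to confirm they are present before passing a directory to the loading code in the main repository. The following is a minimal sketch and not part of DeepRBP: the helper names `missing_artifacts` and `check_artifacts` are hypothetical, and the code only checks that the files exist, not that they load correctly or belong to the same training run.

```python
from pathlib import Path

# File names taken from the "Files in this repository" table.
REQUIRED_FILES = ("model.ckpt", "scaler.joblib", "sigma.npy")


def missing_artifacts(model_dir):
    """Return the required file names that are absent from model_dir.

    Hypothetical helper, not part of the DeepRBP code base.
    """
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def check_artifacts(model_dir):
    """Raise FileNotFoundError if any required artifact is missing."""
    missing = missing_artifacts(model_dir)
    if missing:
        raise FileNotFoundError(
            f"Missing DeepRBP artifacts in {model_dir}: {', '.join(missing)}"
        )
```

Calling `check_artifacts("path/to/download")` (a placeholder path) before running any pipeline fails early with a clear message, rather than partway through preprocessing or prediction.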