---
license: mit
library_name: pytorch-lightning
pipeline_tag: tabular-regression
tags:
- biology
- genomics
datasets:
- Genentech/decima-data
---
# Decima
## Model Description
Decima is a multi-task regression model that predicts gene expression from genomic DNA sequence. It was developed by fine-tuning the **Borzoi** model and maps a DNA sequence to quantitative expression levels across diverse cell types and conditions.
For more details, please refer to the original paper: https://www.biorxiv.org/content/10.1101/2024.10.09.617507v3.
- **Architecture:** Fine-tuned Borzoi
- **Task:** Multi-task Regression
- **Input:** Genomic sequences (hg38)
- **Output:** Predicted expression values (log(CPM + 1)) for 8,856 pseudobulks.
## Repository Content
This repository contains four model replicates (`rep0` through `rep3`). Each replicate is provided in two formats:
1. **`.ckpt`**: PyTorch Lightning checkpoints containing model weights, optimizer states, and hyperparameters.
2. **`.safetensors`**: A lightweight, secure format for weights only.
**Files:**
* `rep0.ckpt`, `rep1.ckpt`, `rep2.ckpt`, `rep3.ckpt`
* `rep0.safetensors`, `rep1.safetensors`, `rep2.safetensors`, `rep3.safetensors`
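The file names follow a regular `rep{i}.{format}` pattern, so all four replicates can be fetched in a loop. The following is a minimal sketch, not part of the `decima` package: the helper names `replicate_filenames` and `download_replicates` are hypothetical, and the repository id is the one used in the loading example in this card. `hf_hub_download` is the standard `huggingface_hub` download function.

```python
REPO_ID = "Genentech/decima-model"  # repo id as used in this card's loading example
REPLICATES = range(4)  # rep0 through rep3


def replicate_filenames(fmt="safetensors"):
    """Return the file names of all four replicates for the given format.

    Hypothetical helper for illustration; `fmt` is "ckpt" or "safetensors".
    """
    if fmt not in ("ckpt", "safetensors"):
        raise ValueError(f"unknown format: {fmt!r}")
    return [f"rep{i}.{fmt}" for i in REPLICATES]


def download_replicates(fmt="safetensors"):
    """Download every replicate and return the local paths (needs network access)."""
    # Imported lazily so replicate_filenames() works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=REPO_ID, filename=name)
        for name in replicate_filenames(fmt)
    ]
```

Calling `download_replicates("ckpt")` would return four local paths, one per checkpoint. The `.safetensors` default is chosen here only because those files hold weights alone and are therefore smaller to download.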
## How to Use
You can load any of the model replicates for inference or further fine-tuning using the `decima` package (https://github.com/Genentech/decima).
### Loading via PyTorch Lightning Checkpoint
```python
from decima.model.lightning import LightningModel
from huggingface_hub import hf_hub_download

# Download a specific replicate (e.g., rep0)
ckpt_path = hf_hub_download(
    repo_id="Genentech/decima-model",
    filename="rep0.ckpt",
)

# Load the model and put it in evaluation mode
model = LightningModel.load_from_checkpoint(ckpt_path)
model.eval()

# For a safetensor file, use LightningModel.load_safetensor(path)
```
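Because four replicates are provided, one possible way to use them together is to average their predictions. The sketch below is an illustration under stated assumptions, not the documented `decima` workflow: `average_predictions` is a hypothetical helper, and it assumes each loaded model can be called directly on an input batch and returns an array-like of predictions with the same shape. Check the `decima` package for the actual inference interface.

```python
def average_predictions(models, batch):
    """Average the outputs of several replicate models on the same batch.

    Hypothetical helper: assumes `model(batch)` returns a tensor, array, or
    number, and that all replicates return outputs of the same shape.
    """
    if not models:
        raise ValueError("at least one model is required")
    preds = [model(batch) for model in models]
    # Start the sum from the first prediction so no scalar 0 is mixed in;
    # this works for torch tensors, NumPy arrays, and plain floats alike.
    total = sum(preds[1:], preds[0])
    return total / len(preds)
```

With PyTorch models, the call would typically be wrapped in `torch.no_grad()` after `model.eval()` has been applied to each replicate, so that no gradients are tracked during inference.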