metadata
license: mit
library_name: pytorch-lightning
pipeline_tag: tabular-regression
tags:
- biology
- genomics
datasets:
- Genentech/decima-data
Decima
Model Description
Decima is a multi-task regression model that predicts gene expression from genomic DNA sequence. It was developed by fine-tuning the Borzoi model, and it maps an input DNA sequence to quantitative expression levels across diverse cell types and conditions.
For more details, please refer to the original paper: https://www.biorxiv.org/content/10.1101/2024.10.09.617507v3.
- Architecture: Fine-tuned Borzoi
- Task: Multi-task Regression
- Input: Genomic sequences (hg38)
- Output: Predicted expression values (log(CPM + 1)) for 8,856 pseudobulks.
Repository Content
This repository contains four model replicates (rep0 through rep3). Each replicate is provided in two formats:
- .ckpt: PyTorch Lightning checkpoints containing model weights, optimizer states, and hyperparameters.
- .safetensors: A lightweight, secure format for weights only.
Files:
- rep0.ckpt, rep1.ckpt, rep2.ckpt, rep3.ckpt
- rep0.safetensors, rep1.safetensors, rep2.safetensors, rep3.safetensors
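Because a .safetensors file stores weights only, its contents can be listed without going through PyTorch Lightning. The sketch below is illustrative and not part of the decima package: `list_tensor_names` and `count_by_prefix` are hypothetical helper names, and it assumes the `safetensors` and `torch` packages are installed. It makes no assumption about what the tensors in these files are actually called.

```python
from collections import Counter


def count_by_prefix(tensor_names):
    """Count tensor names by their first dot-separated component."""
    return Counter(name.split(".", 1)[0] for name in tensor_names)


def list_tensor_names(path):
    """Return the tensor names stored in a .safetensors file."""
    # Imported lazily so count_by_prefix works without safetensors installed.
    from safetensors import safe_open

    # framework="pt" means tensors would be returned as PyTorch tensors;
    # here only the names are read, so no tensor is materialized.
    with safe_open(path, framework="pt") as f:
        return list(f.keys())
```

Calling `count_by_prefix(list_tensor_names(path))` on a downloaded replicate gives a quick overview of how the weights are grouped.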
How to Use
You can load any of the model replicates for inference or further fine-tuning using the decima package (https://github.com/Genentech/decima).
Loading via PyTorch Lightning Checkpoint
from decima.model.lightning import LightningModel
from huggingface_hub import hf_hub_download
# Download a specific replicate (e.g., rep0)
ckpt_path = hf_hub_download(
    repo_id="Genentech/decima-model",
    filename="rep0.ckpt",
)
# Load the model
model = LightningModel.load_from_checkpoint(ckpt_path)
model.eval()
# For a .safetensors file, use LightningModel.load_safetensor(path) instead
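To fetch all four replicates rather than a single one, the file names can be generated from the naming pattern in this repository. This is a minimal sketch, not part of the decima package: `replicate_filenames` and `download_replicates` are hypothetical helper names, and it assumes the repository id `Genentech/decima-model` and the `repN.ckpt` / `repN.safetensors` naming used above.

```python
def replicate_filenames(extension="ckpt", n_replicates=4):
    """Return the replicate file names, e.g. rep0.ckpt through rep3.ckpt."""
    if extension not in ("ckpt", "safetensors"):
        raise ValueError(f"unsupported extension: {extension!r}")
    return [f"rep{i}.{extension}" for i in range(n_replicates)]


def download_replicates(extension="ckpt", repo_id="Genentech/decima-model"):
    """Download every replicate and return the local cache paths."""
    # Imported lazily so the naming helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=repo_id, filename=name)
        for name in replicate_filenames(extension)
    ]
```

Each returned path can then be passed to `LightningModel.load_from_checkpoint` in the same way as `ckpt_path` above.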