Decima

Model Description

Decima is a multi-task regression model that predicts gene expression from genomic DNA sequence. It was developed by fine-tuning the Borzoi model, and it maps a genomic DNA sequence to quantitative expression levels across diverse cell types and conditions.

For more details, please refer to the original paper: https://www.biorxiv.org/content/10.1101/2024.10.09.617507v3.

  • Architecture: Fine-tuned Borzoi
  • Task: Multi-task Regression
  • Input: Genomic sequences (hg38)
  • Output: Predicted expression values, log(CPM + 1), for 8,856 pseudobulks.
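
To make the output scale concrete, here is a minimal sketch of the transform, assuming the target is the log of counts-per-million plus a pseudocount of one. The natural logarithm is an assumption (the card does not state the base), and the function names are hypothetical and not part of the decima package.

```python
import math


def log_cpm_plus_one(counts):
    """Convert raw read counts for one pseudobulk to log(CPM + 1).

    Assumes the natural logarithm; the base is not stated in the model card.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("total count must be positive")
    # CPM scales each count so that the pseudobulk sums to one million.
    return [math.log1p(c / total * 1e6) for c in counts]


def cpm_from_log(values):
    """Invert the transform: map log(CPM + 1) values back to CPM."""
    return [math.expm1(v) for v in values]


if __name__ == "__main__":
    logged = log_cpm_plus_one([0, 1, 3])
    print(logged)
    print(cpm_from_log(logged))
```

A gene with zero counts maps to 0 on this scale, which is why the pseudocount is added before taking the logarithm.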

Repository Content

This repository contains four model replicates (rep0 through rep3). Each replicate is provided in two formats:

  1. .ckpt: PyTorch Lightning checkpoints containing model weights, optimizer states, and hyperparameters.
  2. .safetensors: A lightweight, secure format for weights only.

Files:

  • rep0.ckpt, rep1.ckpt, rep2.ckpt, rep3.ckpt
  • rep0.safetensors, rep1.safetensors, rep2.safetensors, rep3.safetensors
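
The file layout above can be enumerated programmatically. The following is a sketch that builds the eight file names and, optionally, downloads one format for all four replicates with `hf_hub_download`. The helper names `replicate_filenames` and `download_replicates` are hypothetical, and the repository id is the one used in the loading example in this card.

```python
REPO_ID = "Genentech/decima-model"
NUM_REPLICATES = 4  # rep0 through rep3
FORMATS = ("ckpt", "safetensors")


def replicate_filenames(fmt="safetensors"):
    """Return the file names of all replicates in the given format."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    return [f"rep{i}.{fmt}" for i in range(NUM_REPLICATES)]


def download_replicates(fmt="safetensors"):
    """Download every replicate in one format and return the local paths."""
    # Imported here so the name helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=REPO_ID, filename=name)
        for name in replicate_filenames(fmt)
    ]


if __name__ == "__main__":
    print(replicate_filenames("ckpt"))
```

Downloading the .safetensors files is the lighter option when only the weights are needed, since the .ckpt files also carry optimizer states.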

How to Use

You can load any of the model replicates for inference or further fine-tuning using the decima package (https://github.com/Genentech/decima).

Loading via PyTorch Lightning Checkpoint

from decima.model.lightning import LightningModel
from huggingface_hub import hf_hub_download

# Download a specific replicate (e.g., rep0)
ckpt_path = hf_hub_download(
    repo_id="Genentech/decima-model", 
    filename="rep0.ckpt"
)

# Load the model
model = LightningModel.load_from_checkpoint(ckpt_path)
model.eval()

# For a .safetensors file, use LightningModel.load_safetensor(path) instead
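
Since each replicate ships in two formats, a small wrapper can pick the loader from the file extension. This is a sketch rather than part of the decima package: `load_decima` is a hypothetical helper, and it assumes `LightningModel.load_safetensor` takes a single path argument, as the comment above suggests.

```python
from pathlib import Path


def load_decima(path):
    """Load a Decima replicate, choosing the loader from the file extension."""
    suffix = Path(path).suffix
    if suffix not in (".ckpt", ".safetensors"):
        raise ValueError(f"unsupported file type: {suffix!r}")

    # Imported lazily so the extension check runs without decima installed.
    from decima.model.lightning import LightningModel

    if suffix == ".ckpt":
        # Restores weights and hyperparameters from the Lightning checkpoint.
        model = LightningModel.load_from_checkpoint(path)
    else:
        # Assumed signature: a single path to the weights-only file.
        model = LightningModel.load_safetensor(path)

    model.eval()  # disable dropout and batch-norm updates for inference
    return model
```

For example, `load_decima(ckpt_path)` would return the same model as the snippet above, already in evaluation mode.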