---
license: mit
library_name: pytorch-lightning
pipeline_tag: tabular-regression
tags:
- biology
- genomics
datasets:
- Genentech/borzoi-data
---

# Borzoi Model (Linder et al. 2025)

## Model Description

This repository contains the weights for the Borzoi model, which predicts functional genomic tracks from genomic DNA sequences.

- **Architecture:** Convolutions followed by Transformer and U-net layers.
- **Input:** 524,288 bp of genomic DNA sequence.
- **Output Resolution:** 32 bp bins.
- **Source:** [Linder, J. et al. Nature Genetics (2025)](https://www.nature.com/articles/s41588-024-02053-6)

## Repository Content

The repository includes both full PyTorch Lightning checkpoints and raw state dictionaries for the human and mouse versions of the model. Each model has 4 independent replicates. The weights are derived from the publication, but the model has been converted into the PyTorch Lightning format used by gReLU (https://github.com/Genentech/gReLU).

| File | Type | Description |
| :--- | :--- | :--- |
| `human_rep0.ckpt` | PyTorch Lightning | Full checkpoint including base model and human head, replicate 0. |
| `human_rep1.ckpt` | PyTorch Lightning | Full checkpoint including base model and human head, replicate 1. |
| `human_rep2.ckpt` | PyTorch Lightning | Full checkpoint including base model and human head, replicate 2. |
| `human_rep3.ckpt` | PyTorch Lightning | Full checkpoint including base model and human head, replicate 3. |
| `mouse_rep0.ckpt` | PyTorch Lightning | Full checkpoint including base model and mouse head, replicate 0. |
| `mouse_rep1.ckpt` | PyTorch Lightning | Full checkpoint including base model and mouse head, replicate 1. |
| `mouse_rep2.ckpt` | PyTorch Lightning | Full checkpoint including base model and mouse head, replicate 2. |
| `mouse_rep3.ckpt` | PyTorch Lightning | Full checkpoint including base model and mouse head, replicate 3. |
| `human_state_dict_rep0.h5` | HDF5 | Weights-only state dictionary for the human model, replicate 0. |
| `human_state_dict_rep1.h5` | HDF5 | Weights-only state dictionary for the human model, replicate 1. |
| `human_state_dict_rep2.h5` | HDF5 | Weights-only state dictionary for the human model, replicate 2. |
| `human_state_dict_rep3.h5` | HDF5 | Weights-only state dictionary for the human model, replicate 3. |
| `mouse_state_dict_rep0.h5` | HDF5 | Weights-only state dictionary for the mouse model, replicate 0. |
| `mouse_state_dict_rep1.h5` | HDF5 | Weights-only state dictionary for the mouse model, replicate 1. |
| `mouse_state_dict_rep2.h5` | HDF5 | Weights-only state dictionary for the mouse model, replicate 2. |
| `mouse_state_dict_rep3.h5` | HDF5 | Weights-only state dictionary for the mouse model, replicate 3. |
| `save_wandb_borzoi_human.ipynb` | Jupyter Notebook | Code used to create the human checkpoints. |
| `save_wandb_borzoi_mouse.ipynb` | Jupyter Notebook | Code used to create the mouse checkpoints. |

## Model Heads & Output Tracks

The human and mouse `.ckpt` files for the same replicate use the same core transformer trunk but differ in their species-specific output heads.

### Outputs

- **Human Head:** 7,611 total tracks
- **Mouse Head:** 2,608 total tracks

## Usage

The models are intended for use with the `grelu` library.

```python
from grelu.lightning import LightningModel
from huggingface_hub import hf_hub_download

# Download the desired checkpoint
ckpt_path = hf_hub_download(
    repo_id="Genentech/borzoi-model",
    filename="human_rep0.ckpt"
)

# Load the model
model = LightningModel.load_from_checkpoint(ckpt_path)
model.eval()
```
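Because each species ships with 4 replicates, it can be convenient to build the checkpoint file names programmatically and load them in a loop. The following is a minimal sketch, not part of `grelu`: the helper names `checkpoint_filename` and `load_replicates` are hypothetical, and the file-name pattern is assumed from the table in "Repository Content". The `grelu` and `huggingface_hub` imports are deferred so that the file-name helper runs without either package installed.

```python
# Sketch only: helper names are hypothetical and not part of grelu.
# File names follow the pattern listed in "Repository Content".

SPECIES = ("human", "mouse")
N_REPLICATES = 4  # each species has 4 independent replicates


def checkpoint_filename(species: str, replicate: int) -> str:
    """Return the Lightning checkpoint file name for one replicate."""
    if species not in SPECIES:
        raise ValueError(f"species must be one of {SPECIES}, got {species!r}")
    if not 0 <= replicate < N_REPLICATES:
        raise ValueError(f"replicate must be in 0..{N_REPLICATES - 1}")
    return f"{species}_rep{replicate}.ckpt"


def load_replicates(species: str, repo_id: str = "Genentech/borzoi-model"):
    """Download and load every replicate for a species (needs network access)."""
    # Imported lazily so checkpoint_filename works without these packages.
    from grelu.lightning import LightningModel
    from huggingface_hub import hf_hub_download

    models = []
    for rep in range(N_REPLICATES):
        path = hf_hub_download(
            repo_id=repo_id,
            filename=checkpoint_filename(species, rep),
        )
        model = LightningModel.load_from_checkpoint(path)
        model.eval()  # inference mode, as in the usage example
        models.append(model)
    return models


# List the human checkpoint names without downloading anything.
print([checkpoint_filename("human", r) for r in range(N_REPLICATES)])
```

Calling `load_replicates("human")` downloads and keeps all four models in memory at once, which may be expensive; loading and using one replicate at a time is a reasonable alternative on smaller machines.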