---
license: mit
library_name: pytorch-lightning
pipeline_tag: tabular-regression
tags:
- biology
- genomics
datasets:
- Genentech/borzoi-data
---

# Borzoi Model (Linder et al. 2025)

## Model Description

This repository contains the weights for Borzoi, a model that predicts functional genomic tracks from genomic DNA sequence.

- **Architecture:** Convolutional layers followed by transformer and U-Net layers.
- **Input:** 524,288 bp of genomic DNA sequence.
- **Output Resolution:** 32 bp bins.
- **Source:** [Linder, J. et al. Nature Genetics (2025)](https://www.nature.com/articles/s41588-024-02053-6)
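
As a quick sanity check on the numbers above, the input length and bin size determine how many 32 bp bins would tile the full input window. The sketch below only does that arithmetic. This card does not say whether the model crops the edges of its output, so treat the result as an upper bound on the number of output bins. The constant and function names are illustrative, not part of any library.

```python
INPUT_LEN_BP = 524_288  # input sequence length from the model description
BIN_SIZE_BP = 32        # output resolution from the model description


def max_output_bins(input_len: int = INPUT_LEN_BP, bin_size: int = BIN_SIZE_BP) -> int:
    """Number of bins that tile the full input, ignoring any output cropping."""
    if input_len % bin_size != 0:
        raise ValueError("input length must be a multiple of the bin size")
    return input_len // bin_size


print(max_output_bins())  # 16384
```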

## Repository Content

The repository includes both full PyTorch Lightning checkpoints and raw state dictionaries for the human and mouse versions of the model. Each version has four independent replicates. The weights are derived from the publication, but the model has been converted to the PyTorch Lightning format used by [gReLU](https://github.com/Genentech/gReLU).

| File | Type | Description |
| :--- | :--- | :--- |
| `human_rep0.ckpt` | PyTorch Lightning | Full checkpoint including base model and human head, replicate 0. |
| `human_rep1.ckpt` | PyTorch Lightning | Full checkpoint including base model and human head, replicate 1. |
| `human_rep2.ckpt` | PyTorch Lightning | Full checkpoint including base model and human head, replicate 2. |
| `human_rep3.ckpt` | PyTorch Lightning | Full checkpoint including base model and human head, replicate 3. |
| `mouse_rep0.ckpt` | PyTorch Lightning | Full checkpoint including base model and mouse head, replicate 0. |
| `mouse_rep1.ckpt` | PyTorch Lightning | Full checkpoint including base model and mouse head, replicate 1. |
| `mouse_rep2.ckpt` | PyTorch Lightning | Full checkpoint including base model and mouse head, replicate 2. |
| `mouse_rep3.ckpt` | PyTorch Lightning | Full checkpoint including base model and mouse head, replicate 3. |
| `human_state_dict_rep0.h5` | HDF5 | Weights-only state dictionary for the human model, replicate 0. |
| `human_state_dict_rep1.h5` | HDF5 | Weights-only state dictionary for the human model, replicate 1. |
| `human_state_dict_rep2.h5` | HDF5 | Weights-only state dictionary for the human model, replicate 2. |
| `human_state_dict_rep3.h5` | HDF5 | Weights-only state dictionary for the human model, replicate 3. |
| `mouse_state_dict_rep0.h5` | HDF5 | Weights-only state dictionary for the mouse model, replicate 0. |
| `mouse_state_dict_rep1.h5` | HDF5 | Weights-only state dictionary for the mouse model, replicate 1. |
| `mouse_state_dict_rep2.h5` | HDF5 | Weights-only state dictionary for the mouse model, replicate 2. |
| `mouse_state_dict_rep3.h5` | HDF5 | Weights-only state dictionary for the mouse model, replicate 3. |
| `save_wandb_borzoi_human.ipynb` | Jupyter Notebook | Code used to create the human checkpoints. |
| `save_wandb_borzoi_mouse.ipynb` | Jupyter Notebook | Code used to create the mouse checkpoints. |
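
Because the files follow a regular naming pattern, it can be convenient to generate the names rather than type them out, for example when looping over replicates. The helper below is a hypothetical convenience function (it is not part of gReLU or `huggingface_hub`) that simply reproduces the filenames listed in the table.

```python
SPECIES = ("human", "mouse")
N_REPLICATES = 4  # replicates 0-3, per the table above


def checkpoint_filename(species: str, replicate: int, weights_only: bool = False) -> str:
    """Return the repository filename for one species/replicate pair."""
    if species not in SPECIES:
        raise ValueError(f"species must be one of {SPECIES}, got {species!r}")
    if not 0 <= replicate < N_REPLICATES:
        raise ValueError(f"replicate must be in 0..{N_REPLICATES - 1}, got {replicate}")
    if weights_only:
        # HDF5 file holding only the state dictionary
        return f"{species}_state_dict_rep{replicate}.h5"
    # Full PyTorch Lightning checkpoint
    return f"{species}_rep{replicate}.ckpt"


# Names of all four human Lightning checkpoints
human_ckpts = [checkpoint_filename("human", rep) for rep in range(N_REPLICATES)]
print(human_ckpts)
```

Each returned name can be passed as the `filename` argument of `hf_hub_download`.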

## Model Heads & Output Tracks

The human and mouse `.ckpt` files for the same replicate share the same core transformer trunk but differ in their species-specific output heads.

### Outputs

- **Human Head:** 7,611 total tracks
- **Mouse Head:** 2,608 total tracks

## Usage

The models are intended for use with the `grelu` library.

```python
from grelu.lightning import LightningModel
from huggingface_hub import hf_hub_download

# Download the desired checkpoint
ckpt_path = hf_hub_download(
    repo_id="Genentech/borzoi-model",
    filename="human_rep0.ckpt"
)

# Load the model
model = LightningModel.load_from_checkpoint(ckpt_path)
model.eval()
```
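
Since the model takes exactly 524,288 bp of sequence, a common preparatory step is to define a fixed-length window around a locus of interest. The following is a minimal sketch of that coordinate arithmetic. It assumes 0-based, half-open coordinates, does not check chromosome lengths, and uses a hypothetical helper name; how the resulting interval is passed to the model is left to the gReLU documentation.

```python
INPUT_LEN_BP = 524_288  # input length stated in the model description


def centered_window(center: int, length: int = INPUT_LEN_BP) -> tuple[int, int]:
    """Return (start, end) of a half-open window of `length` bp centered on `center`."""
    start = center - length // 2
    if start < 0:
        raise ValueError("window would start before position 0; choose a larger center")
    return start, start + length


start, end = centered_window(1_000_000)
print(start, end, end - start)  # 737856 1262144 524288
```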