---
license: mit
library_name: pytorch-lightning
pipeline_tag: tabular-regression
tags:
- biology
- genomics
datasets:
- Genentech/enformer-data
---

# Enformer Model (Avsec et al. 2021)

## Model Description

This repository contains the weights for the Enformer model, a long-range transformer architecture designed to predict functional genomic tracks from genomic DNA sequences.

- **Architecture:** Convolutions followed by Transformer layers.
- **Input:** 196,608 bp of genomic DNA sequence.
- **Output Resolution:** 128 bp bins.
- **Source:** [Avsec, Ž. et al. Nature Methods (2021)](https://www.nature.com/articles/s41592-021-01252-x)

## Repository Content

The repository includes both full PyTorch Lightning checkpoints and raw state dictionaries for the human and mouse versions of the model. The weights are derived from the publication, but the model has been converted into the PyTorch Lightning format used by [gReLU](https://github.com/Genentech/gReLU).

| File | Type | Description |
| :--- | :--- | :--- |
| `human.ckpt` | PyTorch Lightning | Full checkpoint including base model and human head. |
| `mouse.ckpt` | PyTorch Lightning | Full checkpoint including base model and mouse head. |
| `human_state_dict.h5` | HDF5 | Weights-only state dictionary for the human model. |
| `mouse_state_dict.h5` | HDF5 | Weights-only state dictionary for the mouse model. |
| `save_wandb_enformer_human.ipynb` | Jupyter Notebook | Code used to create `human.ckpt`. |
| `save_wandb_enformer_mouse.ipynb` | Jupyter Notebook | Code used to create `mouse.ckpt`. |

## Model Heads & Output Tracks

Both `.ckpt` files share the same core transformer trunk but differ in their species-specific output heads.

### Outputs

- Human Head: 5,313 total tracks
- Mouse Head: 1,643 total tracks

## Usage

The models are intended for use with the `grelu` library.

```python
from grelu.lightning import LightningModel
from huggingface_hub import hf_hub_download

# Download the desired checkpoint
ckpt_path = hf_hub_download(
    repo_id="Genentech/enformer-model",
    filename="human.ckpt"
)

# Load the model
model = LightningModel.load_from_checkpoint(ckpt_path)
model.eval()
```
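
To make the input and output dimensions listed under Model Description concrete, the following is a minimal sketch of one-hot encoding a 196,608 bp sequence and counting the 128 bp bins it spans. The `one_hot_encode` helper, the A, C, G, T channel order, the `(batch, 4, length)` layout, and the placeholder sequence are all assumptions made for illustration and are not taken from this repository. gReLU ships its own sequence utilities, which should be preferred in practice.

```python
import numpy as np

SEQ_LEN = 196_608  # input length stated in the model description
BIN_SIZE = 128     # output resolution stated in the model description


def one_hot_encode(seq: str) -> np.ndarray:
    """Hypothetical helper: encode DNA as a (4, len(seq)) float32 array.

    Rows follow an assumed A, C, G, T order; any other character
    (for example N) is left as an all-zero column.
    """
    codes = np.frombuffer(seq.upper().encode("ascii"), dtype=np.uint8)
    encoded = np.zeros((4, len(seq)), dtype=np.float32)
    for row, base in enumerate(b"ACGT"):
        # Iterating over bytes yields integer ASCII codes
        encoded[row, codes == base] = 1.0
    return encoded


# Placeholder sequence of the expected length (not real genomic data)
seq = "ACGT" * (SEQ_LEN // 4)

# Add a leading batch dimension: (1, 4, 196608)
x = one_hot_encode(seq)[np.newaxis]

# Number of 128 bp bins spanned by the full input window
n_bins_spanned = SEQ_LEN // BIN_SIZE

print(x.shape, n_bins_spanned)
```

The number of bins the model actually returns may be smaller than `n_bins_spanned` if the edges of the window are cropped before the output heads, so check the shape of the model output rather than relying on this arithmetic.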