---
license: apache-2.0
library_name: alphagenome-pytorch
tags:
- genomics
- biology
- dna
- deep-learning
- regulatory-genomics
- chromatin-accessibility
- gene-expression
pipeline_tag: other
---

# AlphaGenome PyTorch

A PyTorch port of [AlphaGenome](https://www.nature.com/articles/s41586-025-10014-0), the DNA sequence model from Google DeepMind that predicts hundreds of genomic tracks at single base-pair resolution from sequences up to 1M bp. This is an accessible, readable, and hackable implementation for integrating into existing PyTorch pipelines, fine-tuning on custom datasets, and building on top of.

## Model Details

- **Parameters**: 450M
- **Input**: One-hot encoded DNA sequence
- **Organisms**: Human, Mouse
- **Weights**: Converted from the official JAX checkpoint

## Download Weights

Available weight files:

- `model_all_folds.safetensors` - trained on all data (recommended)
- `model_fold_0.safetensors` through `model_fold_3.safetensors` - individual CV folds

```bash
# Using Hugging Face CLI
hf download gtca/alphagenome_pytorch model_all_folds.safetensors --local-dir .
# Or using Python
pip install huggingface_hub
python -c "from huggingface_hub import hf_hub_download; hf_hub_download('gtca/alphagenome_pytorch', 'model_all_folds.safetensors', local_dir='.')"
```

## Usage

```python
from alphagenome_pytorch import AlphaGenome
from alphagenome_pytorch.utils.sequence import sequence_to_onehot_tensor
import pyfaidx

model = AlphaGenome.from_pretrained("model_all_folds.safetensors")

with pyfaidx.Fasta("hg38.fa") as genome:
    sequence = str(genome["chr1"][1_000_000:1_131_072])

dna_onehot = sequence_to_onehot_tensor(sequence).unsqueeze(0)
preds = model.predict(dna_onehot, organism_index=0)  # 0=human, 1=mouse

# Access predictions by head name and resolution:
# - preds['atac'][1]: 1bp resolution, shape (batch, 131072, 256)
# - preds['atac'][128]: 128bp resolution, shape (batch, 1024, 256)
```

## Model Outputs

| Head | Tracks | Resolutions | Description |
|------|--------|-------------|-------------|
| atac | 256 | 1bp, 128bp | Chromatin accessibility |
| dnase | 384 | 1bp, 128bp | DNase-seq |
| procap | 128 | 1bp, 128bp | Transcription initiation |
| cage | 640 | 1bp, 128bp | 5' cap RNA |
| rnaseq | 768 | 1bp, 128bp | RNA expression |
| chip_tf | 1664 | 128bp | TF binding |
| chip_histone | 1152 | 128bp | Histone modifications |
| contact_maps | 28 | 64x64 | 3D chromatin contacts |
| splice_sites | 5 | 1bp | Splice site classification (D+, A+, D−, A−, None) |
| splice_junctions | 734 | pairwise | Junction read counts |
| splice_site_usage | 734 | 1bp | Splice site usage fraction |

## Installation

```bash
pip install alphagenome-pytorch
```

## License

The weights were ported from the weights [provided by Google DeepMind](https://www.kaggle.com/models/google/alphagenome). Those weights were created by Google DeepMind and are the property of Google LLC. They are released under the [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0), consistent with the [official release on Kaggle](https://www.kaggle.com/models/google/alphagenome).
They are subject to the model terms at https://deepmind.google.com/science/alphagenome/model-terms.

## Links

- [GitHub Repository](https://github.com/genomicsxai/alphagenome-pytorch)
- [Reference JAX Implementation](https://github.com/google-deepmind/alphagenome_research) (by Google DeepMind)
- [AlphaGenome Paper](https://www.nature.com/articles/s41586-025-10014-0)
- [AlphaGenome Documentation](https://www.alphagenomedocs.com/)

## Citation

```bibtex
@article{avsec2026alphagenome,
  title={Advancing regulatory variant effect prediction with AlphaGenome},
  author={Avsec, {\v{Z}}iga and Latysheva, Natasha and Cheng, Jun and Novati, Guido and Taylor, Kyle R and Ward, Tom and Bycroft, Clare and Nicolaisen, Lauren and Arvaniti, Eirini and Pan, Joshua and others},
  journal={Nature},
  volume={649},
  number={8099},
  pages={1206--1218},
  year={2026},
  publisher={Nature Publishing Group UK London}
}
```
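For reference, the one-hot input format expected by the model can be sketched in plain NumPy. This is an illustrative standalone equivalent of the library's `sequence_to_onehot_tensor` helper (which returns a torch tensor); the A/C/G/T column order and the all-zero encoding of ambiguous bases such as `N` are assumptions here, not guaranteed to match the library exactly.

```python
import numpy as np

# Assumed channel order A, C, G, T; bases outside this set map to all zeros.
BASE_TO_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}

def onehot_encode(sequence: str) -> np.ndarray:
    """Map a DNA string to a (len(sequence), 4) one-hot float array."""
    out = np.zeros((len(sequence), 4), dtype=np.float32)
    for i, base in enumerate(sequence.upper()):
        idx = BASE_TO_INDEX.get(base)
        if idx is not None:
            out[i, idx] = 1.0
    return out

encoded = onehot_encode("ACGTN")
print(encoded.shape)        # (5, 4)
print(encoded[0].tolist())  # [1.0, 0.0, 0.0, 0.0]  ('A')
print(encoded[4].sum())     # 0.0  ('N' encodes as all zeros)
```

Adding a leading batch dimension (as `.unsqueeze(0)` does in the usage example) then yields the `(batch, length, 4)` input shape the model consumes.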