---
license: apache-2.0
library_name: alphagenome-pytorch
tags:
- genomics
- biology
- dna
- deep-learning
- regulatory-genomics
- chromatin-accessibility
- gene-expression
pipeline_tag: other
---

# AlphaGenome PyTorch

A PyTorch port of [AlphaGenome](https://www.nature.com/articles/s41586-025-10014-0), the DNA sequence model from Google DeepMind that predicts hundreds of genomic tracks at up to single base-pair resolution from sequences up to 1M bp.

This implementation is meant to be accessible, readable, and hackable: easy to integrate into existing PyTorch pipelines, fine-tune on custom datasets, and build on.

## Model Details

- **Parameters**: 450M
- **Input**: One-hot encoded DNA sequence
- **Organisms**: Human, Mouse
- **Weights**: Converted from the official JAX checkpoint

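As a rough picture of the one-hot input listed above, the sketch below encodes a short DNA string as one row of four values per base. The channel order (A, C, G, T) and the all-zero row for unknown bases such as `N` are assumptions made for illustration; the helper `one_hot_encode` is hypothetical and is not the library's `sequence_to_onehot_tensor`.

```python
# Illustrative sketch only. Assumptions (not confirmed by the library):
#   - channels are ordered A, C, G, T
#   - unknown bases such as N map to an all-zero row
BASES = "ACGT"


def one_hot_encode(sequence: str) -> list[list[float]]:
    """Return one row of four floats per base in `sequence`."""
    index = {base: i for i, base in enumerate(BASES)}
    rows = []
    for base in sequence.upper():
        row = [0.0] * len(BASES)
        if base in index:
            row[index[base]] = 1.0
        rows.append(row)
    return rows


# Print each base next to its encoded row.
for base, row in zip("ACGTN", one_hot_encode("ACGTN")):
    print(base, row)
```

In practice the library's own encoder should be used so that the channel order matches what the model was trained on.
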
## Download Weights

Available weight files:

- `model_all_folds.safetensors` - trained on all data (recommended)
- `model_fold_0.safetensors` through `model_fold_3.safetensors` - individual cross-validation folds

```bash
# Install the Hugging Face Hub client (provides the `hf` CLI and the Python API)
pip install huggingface_hub

# Option 1: Hugging Face CLI
hf download gtca/alphagenome_pytorch model_all_folds.safetensors --local-dir .

# Option 2: Python
python -c "from huggingface_hub import hf_hub_download; hf_hub_download('gtca/alphagenome_pytorch', 'model_all_folds.safetensors', local_dir='.')"
```

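To fetch several checkpoints at once, for example to compare the individual folds, the one-liner above can be wrapped in a small script. This is a minimal sketch: the helpers `weight_filenames` and `download_weights` are hypothetical names introduced here, and the file names are taken from the list above. Only `hf_hub_download` comes from `huggingface_hub`.

```python
REPO_ID = "gtca/alphagenome_pytorch"


def weight_filenames(num_folds: int = 4, include_all_folds: bool = True) -> list[str]:
    """Build the checkpoint file names listed in this model card."""
    names = ["model_all_folds.safetensors"] if include_all_folds else []
    names += [f"model_fold_{i}.safetensors" for i in range(num_folds)]
    return names


def download_weights(filenames: list[str], local_dir: str = ".") -> list[str]:
    """Download each file from the Hub and return the local paths."""
    # Imported lazily so the file-name helper works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(REPO_ID, name, local_dir=local_dir) for name in filenames
    ]


# Listing the names does not touch the network.
print(weight_filenames())
```

Calling `download_weights(weight_filenames())` would then download all five files into the current directory; each checkpoint holds a full copy of the model, so this takes noticeably more disk space than a single file.
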
## Usage

```python
from alphagenome_pytorch import AlphaGenome
from alphagenome_pytorch.utils.sequence import sequence_to_onehot_tensor
import pyfaidx

# Load the converted weights from the downloaded file
model = AlphaGenome.from_pretrained("model_all_folds.safetensors")

# Extract a 131,072 bp window from chr1 (0-based, half-open slice)
with pyfaidx.Fasta("hg38.fa") as genome:
    sequence = str(genome["chr1"][1_000_000:1_131_072])

# One-hot encode the sequence and add a batch dimension
dna_onehot = sequence_to_onehot_tensor(sequence).unsqueeze(0)

preds = model.predict(dna_onehot, organism_index=0)  # 0=human, 1=mouse

# Access predictions by head name and resolution:
# - preds['atac'][1]: 1bp resolution, shape (batch, 131072, 256)
# - preds['atac'][128]: 128bp resolution, shape (batch, 1024, 256)
```

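The usage example slices a fixed 131,072 bp window. When the region of interest is a single locus, such as a variant or a transcription start site, it can help to compute a window centered on it. The sketch below is one way to do that; `centered_window` is a hypothetical helper, not part of `alphagenome_pytorch`, and it shifts the window inward rather than padding when the locus is near a chromosome end.

```python
def centered_window(center: int, length: int, chrom_length: int) -> tuple[int, int]:
    """Return a 0-based, half-open (start, end) interval of `length` bp.

    The interval is centered on `center` where possible and shifted inward
    when it would otherwise run past either end of the chromosome.
    """
    if length > chrom_length:
        raise ValueError("window is longer than the chromosome")
    if not 0 <= center < chrom_length:
        raise ValueError("center lies outside the chromosome")
    start = center - length // 2
    # Clamp so that the whole window stays inside [0, chrom_length).
    start = max(0, min(start, chrom_length - length))
    return start, start + length


CHR1_LENGTH_HG38 = 248_956_422  # length of chr1 in the hg38 assembly

start, end = centered_window(1_065_536, 131_072, CHR1_LENGTH_HG38)
print(start, end)
```

The resulting pair can be used directly as the slice bounds in the usage example, `genome["chr1"][start:end]`, and the chromosome length can be read from the FASTA itself with `len(genome["chr1"])`.
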
## Model Outputs

| Head | Tracks | Resolutions | Description |
|------|--------|-------------|-------------|
| atac | 256 | 1bp, 128bp | Chromatin accessibility |
| dnase | 384 | 1bp, 128bp | DNase-seq |
| procap | 128 | 1bp, 128bp | Transcription initiation |
| cage | 640 | 1bp, 128bp | 5' cap RNA |
| rnaseq | 768 | 1bp, 128bp | RNA expression |
| chip_tf | 1664 | 128bp | TF binding |
| chip_histone | 1152 | 128bp | Histone modifications |
| contact_maps | 28 | 64x64 | 3D chromatin contacts |
| splice_sites | 5 | 1bp | Splice site classification (D+, A+, D−, A−, None) |
| splice_junctions | 734 | pairwise | Junction read counts |
| splice_site_usage | 734 | 1bp | Splice site usage fraction |

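For the heads with plain 1bp or 128bp tracks, the expected output shape follows from the sequence length, the resolution, and the track count in the table above. The sketch below encodes that arithmetic. It assumes every such head follows the `(batch, sequence_length // resolution, tracks)` layout shown for `atac` in the usage comments, which is not confirmed here for the other heads. The contact-map and splice heads are left out because their layouts are not spelled out in this card, and `expected_shape` is a hypothetical helper.

```python
# Track counts and resolutions copied from the Model Outputs table.
# Assumption: all of these heads share the layout shown for 'atac'.
HEADS = {
    "atac": (256, (1, 128)),
    "dnase": (384, (1, 128)),
    "procap": (128, (1, 128)),
    "cage": (640, (1, 128)),
    "rnaseq": (768, (1, 128)),
    "chip_tf": (1664, (128,)),
    "chip_histone": (1152, (128,)),
}


def expected_shape(head: str, resolution: int, seq_len: int, batch: int = 1) -> tuple[int, int, int]:
    """Return the assumed (batch, bins, tracks) shape for one head."""
    tracks, resolutions = HEADS[head]
    if resolution not in resolutions:
        raise ValueError(f"{head} has no {resolution}bp output")
    if seq_len % resolution != 0:
        raise ValueError("sequence length must be a multiple of the resolution")
    return (batch, seq_len // resolution, tracks)


print(expected_shape("atac", 128, 131_072))
print(expected_shape("chip_tf", 128, 131_072))
```

A check like this can be used as a quick sanity test on `preds[head][resolution].shape` after running the model.
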
## Installation

```bash
pip install alphagenome-pytorch
```

## License

The weights were ported from the checkpoint [provided by Google DeepMind](https://www.kaggle.com/models/google/alphagenome). Those weights were created by Google DeepMind and are the property of Google LLC.
They are released under the [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0),
consistent with the [official release on Kaggle](https://www.kaggle.com/models/google/alphagenome),
and are subject to the model terms at https://deepmind.google.com/science/alphagenome/model-terms.

## Links

- [GitHub Repository](https://github.com/genomicsxai/alphagenome-pytorch)
- [Reference JAX Implementation](https://github.com/google-deepmind/alphagenome_research) (by Google DeepMind)
- [AlphaGenome Paper](https://www.nature.com/articles/s41586-025-10014-0)
- [AlphaGenome Documentation](https://www.alphagenomedocs.com/)

## Citation

```bibtex
@article{avsec2026alphagenome,
  title={Advancing regulatory variant effect prediction with AlphaGenome},
  author={Avsec, {\v{Z}}iga and Latysheva, Natasha and Cheng, Jun and Novati, Guido and Taylor, Kyle R and Ward, Tom and Bycroft, Clare and Nicolaisen, Lauren and Arvaniti, Eirini and Pan, Joshua and others},
  journal={Nature},
  volume={649},
  number={8099},
  pages={1206--1218},
  year={2026},
  publisher={Nature Publishing Group UK London}
}
```