---
license: mit
tags:
- DNA-methylation
- epigenetics
- foundation-model
- aging
- biology
---
# CpGPT Model Checkpoints
Model weights, configurations, and vocabularies for [CpGPT: A Foundation Model for DNA Methylation](https://github.com/lcamillo/CpGPT).
## Contents
```
weights/ # PyTorch Lightning checkpoint files (.ckpt)
config/ # Hydra YAML configuration files
vocab/ # CpG vocabulary files (.json)
```
## Pre-trained Models
| Model | File size | Parameters | Model name |
|-------|-----------|------------|------------|
| CpGPT-2M | 30 MB | ~2.5M | `small` |
| CpGPT-100M | 1.1 GB | ~101M | `large` |
## Download
```bash
# Install huggingface_hub
pip install huggingface_hub
# Download all model files
huggingface-cli download lucascamillomd/cpgpt-models --local-dir dependencies/model
# Or download a specific model
huggingface-cli download lucascamillomd/cpgpt-models weights/small.ckpt config/small.yaml vocab/small.json --local-dir dependencies/model
```
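The same download can be scripted from Python. The following is a minimal sketch, not part of the official CpGPT tooling: it uses `hf_hub_download` from `huggingface_hub`, and the helper names `model_files` and `download_model` are hypothetical. It assumes every model follows the `weights/<name>.ckpt`, `config/<name>.yaml`, `vocab/<name>.json` pattern shown for `small` above.

```python
from pathlib import Path

# Repository named in the CLI commands above.
REPO_ID = "lucascamillomd/cpgpt-models"


def model_files(model_name: str) -> list[str]:
    """Return the repo-relative paths for one model.

    Assumption: all models use the same naming pattern as `small`.
    """
    return [
        f"weights/{model_name}.ckpt",
        f"config/{model_name}.yaml",
        f"vocab/{model_name}.json",
    ]


def download_model(model_name: str, local_dir: str = "dependencies/model") -> list[Path]:
    """Download the weights, config, and vocab files for `model_name`."""
    # Imported lazily so `model_files` works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        Path(hf_hub_download(repo_id=REPO_ID, filename=name, local_dir=local_dir))
        for name in model_files(model_name)
    ]


# Example (requires network access):
# paths = download_model("small")
```

Calling `download_model("small")` should place the files under `dependencies/model`, mirroring the repository's `weights/`, `config/`, and `vocab/` folders.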
## Dependencies
You will also need the DNA embeddings for your species of interest:
- **Human**: [lucascamillomd/cpgpt-human-dependencies](https://huggingface.co/lucascamillomd/cpgpt-human-dependencies)
- **Mammalian (multi-species)**: [lucascamillomd/cpgpt-mammalian-dependencies](https://huggingface.co/lucascamillomd/cpgpt-mammalian-dependencies)
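The dependency repositories can be fetched in the same way. This is a rough sketch, assuming they are regular model-type repositories on the Hugging Face Hub (as their URLs suggest) and that downloading the whole repository is what you want. The helper names and the `species` keys are hypothetical, and the target directory is left for you to choose because this README does not specify one.

```python
# Species keys are illustrative; the repository IDs come from the list above.
DEPENDENCY_REPOS = {
    "human": "lucascamillomd/cpgpt-human-dependencies",
    "mammalian": "lucascamillomd/cpgpt-mammalian-dependencies",
}


def dependency_repo(species: str) -> str:
    """Map a species key to its dependency repository ID."""
    key = species.strip().lower()
    if key not in DEPENDENCY_REPOS:
        raise ValueError(
            f"Unknown species {species!r}; expected one of {sorted(DEPENDENCY_REPOS)}"
        )
    return DEPENDENCY_REPOS[key]


def download_dependencies(species: str, local_dir: str) -> str:
    """Download the full dependency repository for `species` into `local_dir`."""
    # Imported lazily so `dependency_repo` works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=dependency_repo(species), local_dir=local_dir)


# Example (requires network access; the target path is a placeholder):
# download_dependencies("human", "path/to/human-dependencies")
```

Check the CpGPT tutorials for the directory layout they expect before settling on `local_dir`.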
## Usage
After downloading the model files and species dependencies, follow the tutorials at the [CpGPT GitHub repository](https://github.com/lcamillo/CpGPT) to get started.
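Before starting a tutorial, it can help to confirm that the model files are where you expect them. The sketch below is a hypothetical helper, not part of CpGPT. It assumes the files were downloaded to `dependencies/model` with the `weights/`, `config/`, and `vocab/` layout described in Contents, and it only checks that the files exist, not that they are valid.

```python
from pathlib import Path


def missing_model_files(model_name: str, model_dir: str = "dependencies/model") -> list[Path]:
    """Return the expected model files that are not present on disk."""
    root = Path(model_dir)
    expected = [
        root / "weights" / f"{model_name}.ckpt",
        root / "config" / f"{model_name}.yaml",
        root / "vocab" / f"{model_name}.json",
    ]
    return [path for path in expected if not path.is_file()]


# Example:
# for path in missing_model_files("small"):
#     print(f"missing: {path}")
```

An empty list means all three files for that model were found.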
## Citation
```bibtex
@article{camillo2024cpgpt,
title={CpGPT: A Foundation Model for DNA Methylation},
author={de Lima Camillo, Lucas Paulo and others},
journal={bioRxiv},
year={2024},
doi={10.1101/2024.10.24.619766}
}
```
## License
MIT License. See the [GitHub repository](https://github.com/lcamillo/CpGPT) for details.