---
license: mit
tags:
- DNA-methylation
- epigenetics
- foundation-model
- aging
- biology
---
# CpGPT Model Checkpoints
Model weights, configurations, and vocabularies for [CpGPT: A Foundation Model for DNA Methylation](https://github.com/lcamillo/CpGPT).
## Contents
```
weights/ # PyTorch Lightning checkpoint files (.ckpt)
config/ # Hydra YAML configuration files
vocab/ # CpG vocabulary files (.json)
```
## Pre-trained Models
| Model | Size | Parameters | Model Name |
|-------|------|------------|------------|
| CpGPT-2M | 30 MB | ~2.5M | `small` |
| CpGPT-100M | 1.1 GB | ~101M | `large` |
## Download
```bash
# Install huggingface_hub
pip install huggingface_hub
# Download all model files
huggingface-cli download lucascamillomd/cpgpt-models --local-dir dependencies/model
# Or download a specific model
huggingface-cli download lucascamillomd/cpgpt-models weights/small.ckpt config/small.yaml vocab/small.json --local-dir dependencies/model
```
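The same download can be scripted from Python. The following is a minimal sketch, assuming the per-model file layout shown in the CLI example above (`weights/`, `config/`, `vocab/`, each keyed by model name). The helper names `model_files` and `download_model` are hypothetical and not part of CpGPT; `hf_hub_download` is the standard `huggingface_hub` function.

```python
from pathlib import Path

REPO_ID = "lucascamillomd/cpgpt-models"


def model_files(model_name: str) -> list[str]:
    """Return the repo-relative paths for one model (hypothetical helper).

    Assumes the layout used in the CLI example: one file per folder,
    each named after the model.
    """
    return [
        f"weights/{model_name}.ckpt",
        f"config/{model_name}.yaml",
        f"vocab/{model_name}.json",
    ]


def download_model(model_name: str, local_dir: str = "dependencies/model") -> Path:
    """Download the three files for `model_name` into `local_dir` (hypothetical helper)."""
    # Imported lazily so that model_files() works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    for filename in model_files(model_name):
        # With local_dir set, the file is written to local_dir/filename,
        # so the weights/, config/, and vocab/ folders are preserved.
        hf_hub_download(repo_id=REPO_ID, filename=filename, local_dir=local_dir)
    return Path(local_dir)
```

Calling `download_model("small")` should fetch the same three files as the second CLI command above; pass `"large"` for the larger checkpoint, assuming its files follow the same naming.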
## Dependencies
You will also need the DNA embeddings for your species of interest:
- **Human**: [lucascamillomd/cpgpt-human-dependencies](https://huggingface.co/lucascamillomd/cpgpt-human-dependencies)
- **Mammalian (multi-species)**: [lucascamillomd/cpgpt-mammalian-dependencies](https://huggingface.co/lucascamillomd/cpgpt-mammalian-dependencies)
## Usage
After downloading the model files and species dependencies, follow the tutorials at the [CpGPT GitHub repository](https://github.com/lcamillo/CpGPT) to get started.
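Before starting a tutorial, it can help to confirm that the download is complete. The sketch below is not part of CpGPT: `missing_model_files` is a hypothetical helper, and it assumes the files were downloaded to `dependencies/model` with the `weights/`, `config/`, and `vocab/` layout described in the Contents section.

```python
from pathlib import Path


def missing_model_files(model_name: str, model_dir: str = "dependencies/model") -> list[str]:
    """Return the expected files for `model_name` that are absent from `model_dir`.

    Hypothetical helper; the expected paths assume one checkpoint, one config,
    and one vocabulary file per model, each named after the model.
    """
    root = Path(model_dir)
    expected = [
        f"weights/{model_name}.ckpt",
        f"config/{model_name}.yaml",
        f"vocab/{model_name}.json",
    ]
    # Keep only the relative paths that do not exist as regular files.
    return [rel for rel in expected if not (root / rel).is_file()]
```

An empty list from `missing_model_files("small")` means all three files are in place. This only checks that the files exist; it does not validate their contents, and it does not cover the species dependencies.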
## Citation
```bibtex
@article{camillo2024cpgpt,
title={CpGPT: A Foundation Model for DNA Methylation},
author={de Lima Camillo, Lucas Paulo and others},
journal={bioRxiv},
year={2024},
doi={10.1101/2024.10.24.619766}
}
```
## License
MIT License. See the [GitHub repository](https://github.com/lcamillo/CpGPT) for details.