---
license: bsd-3-clause
library_name: pytorch
tags:
- protein
- generative-model
- computational-biology
- neural-hamiltonian-flow
---

# ProtNHF: Neural Hamiltonian Flows for Controllable Protein Sequence Generation

**Authors:** Bharath Raghavan¹, David M. Rogers¹

**Affiliations:**
¹ National Center for Computational Sciences, Oak Ridge National Laboratory

## Introduction

ProtNHF is a generative model for protein sequences that enables continuous, controllable design without retraining. It uses neural Hamiltonian flows with a Transformer-based energy function to map a latent Gaussian to protein embeddings. Bias functions applied at sampling time steer properties such as amino acid composition or net charge smoothly and predictably. Generated sequences are of high quality as measured by ESM-2 pseudo-perplexity and AlphaFold2 pLDDT scores. ProtNHF provides a flexible, physically interpretable framework for programmable protein sequence generation.

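To make the idea of a steerable property concrete, here is a minimal sketch of how the net charge of a sequence could be scored and turned into a penalty. It is not ProtNHF's bias implementation: the function names, the harmonic form of the penalty, and the simplified charge table (histidine treated as neutral) are all assumptions chosen for illustration. How the model actually couples a bias to its Hamiltonian dynamics is defined in the source repository.

```python
# Approximate side-chain charges near neutral pH (a common simplification;
# histidine and the chain termini are ignored here).
RESIDUE_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(sequence):
    """Sum the approximate charges of all residues in a one-letter sequence."""
    return sum(RESIDUE_CHARGE.get(aa, 0) for aa in sequence.upper())


def charge_bias(sequence, target, strength=1.0):
    """Hypothetical harmonic penalty: zero at the target charge,
    growing quadratically as the net charge moves away from it."""
    return 0.5 * strength * (net_charge(sequence) - target) ** 2
```

Under these assumptions, a larger `strength` would pull samples more firmly toward the target charge, which is one way to picture the smooth steering described above.
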
The source code is available at https://github.com/bharath-raghavan/ProtNHF.git

## Model Details

This upload corresponds to model/architecture version 1.

### Model Architecture

The model parameters are:

```
dt: 0.05
niter: 4
hidden_dims: 128
std: 0.7
integrator: leapfrog
n_types: 20
energy:
  d_model: 320
  ff_dim: 1280
  n_heads: 20
  n_layers: 6
```

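The `integrator: leapfrog`, `dt`, and `niter` entries refer to the leapfrog scheme for integrating Hamilton's equations. The sketch below shows that scheme on a one-dimensional toy problem. It is an illustration only: the quadratic energy stands in for the Transformer-based energy function, and reading `niter` as the number of leapfrog steps is an assumption, not something the model card states.

```python
def leapfrog(q, p, grad_U, dt=0.05, niter=4):
    """Leapfrog integration of H(q, p) = U(q) + p**2 / 2.

    q, p   : position and momentum (floats in this toy version)
    grad_U : function returning dU/dq
    """
    p = p - 0.5 * dt * grad_U(q)        # initial half step for momentum
    for step in range(niter):
        q = q + dt * p                  # full step for position
        if step < niter - 1:
            p = p - dt * grad_U(q)      # full momentum step between position updates
    p = p - 0.5 * dt * grad_U(q)        # final half step for momentum
    return q, p


def toy_grad(q):
    """Gradient of the toy energy U(q) = q**2 / 2 (a stand-in, not the model's energy)."""
    return q


def toy_hamiltonian(q, p):
    """Total energy of the toy system."""
    return 0.5 * q * q + 0.5 * p * p


q1, p1 = leapfrog(1.0, 0.0, toy_grad)
q_back, p_back = leapfrog(q1, -p1, toy_grad)  # flip momentum and integrate again
```

Two properties make this integrator a natural fit for flow models: it nearly conserves the Hamiltonian at small `dt`, and it is exactly time-reversible, so negating the momentum and integrating again recovers the starting point (up to floating-point error).
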
### Training

Training used PyTorch DDP (DistributedDataParallel) on 64*8 GPUs (512 in total) with a per-GPU batch size of 30, and ran for 650 epochs. The optimizer and LR scheduler parameters are:

```
lr: 1e-4
betas: [0.9, 0.95]
weight_decay: 0.01
warmup_epochs: 5
```

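As a quick worked example, the snippet below derives the total GPU count and the global batch size from the numbers above, and sketches one possible reading of `warmup_epochs`. The global batch figure assumes no gradient accumulation. The linear warmup shape and the constant rate afterwards are assumptions as well, since the card gives only the warmup length and the base learning rate; `lr_at_epoch` is a hypothetical helper, not a function from the ProtNHF code.

```python
WORLD_SIZE = 64 * 8                          # total GPUs, from "64*8 GPUs"
PER_GPU_BATCH = 30
GLOBAL_BATCH = WORLD_SIZE * PER_GPU_BATCH    # assumes no gradient accumulation

BASE_LR = 1e-4
WARMUP_EPOCHS = 5


def lr_at_epoch(epoch, base_lr=BASE_LR, warmup_epochs=WARMUP_EPOCHS):
    """Assumed schedule: linear warmup to base_lr, then constant.

    `epoch` is zero-based, so the last warmup epoch already runs at base_lr.
    """
    return base_lr * min(1.0, (epoch + 1) / warmup_epochs)


print(WORLD_SIZE, GLOBAL_BATCH)  # prints: 512 15360
```
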
## Citation

If you use ProtNHF in your research, please cite:

B. Raghavan and D. M. Rogers
**ProtNHF: Neural Hamiltonian Flows for Controllable Protein Sequence Generation**
arXiv:xxxx.xxxxx (2026)

```bibtex
@article{raghavan2026protnhf,
  title = {ProtNHF: Neural Hamiltonian Flows for Controllable Protein Sequence Generation},
  author = {Raghavan, Bharath and Rogers, David M.},
  journal = {arXiv preprint arXiv:xxxx.xxxxx},
  year = {2026}
}
```

## License

ProtNHF code and model weights are licensed under the BSD 3-Clause license.