---
license: bsd-3-clause
library_name: pytorch
tags:
  - protein
  - generative-model
  - computational-biology
  - neural-hamiltonian-flow
---

# ProtNHF: Neural Hamiltonian Flows for Controllable Protein Sequence Generation

**Authors:** Bharath Raghavan¹, David M. Rogers¹  

**Affiliations:**  
¹ National Center for Computational Sciences, Oak Ridge National Laboratory

## Introduction

ProtNHF is a generative model for protein sequences that enables continuous, controllable design without retraining. It leverages neural Hamiltonian flows with a Transformer-based energy function to map a latent Gaussian to protein embeddings. Sampling-time bias functions allow steering properties like amino acid composition or net charge smoothly and predictably. Generated sequences achieve high quality as measured by ESM-2 pseudo-perplexity and AlphaFold2 pLDDT scores. ProtNHF provides a flexible, physically interpretable framework for programmable protein sequence generation.

The source code is available here: https://github.com/bharath-raghavan/ProtNHF.git

## Model Details

This upload corresponds to version 1 of the model architecture.

### Model Architecture

The following are the model parameters:

```
  dt: 0.05
  niter: 4
  hidden_dims: 128
  std: 0.7
  integrator: leapfrog
  n_types: 20
  energy:
    d_model: 320
    ff_dim: 1280
    n_heads: 20
    n_layers: 6
```
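
To make the role of `dt`, `niter`, and `integrator: leapfrog` more concrete, here is a minimal sketch of a leapfrog integrator in plain Python. It assumes that `dt` is the step size and `niter` is the number of leapfrog steps, and it assumes a separable Hamiltonian H(q, p) = U(q) + |p|²/2. The toy quadratic energy and the function names `grad_energy`, `leapfrog`, and `hamiltonian` are hypothetical stand-ins, not the ProtNHF implementation, which uses a Transformer-based energy.

```python
DT = 0.05    # assumed to be the integration step size (`dt` above)
NITER = 4    # assumed to be the number of leapfrog steps (`niter` above)


def grad_energy(q):
    """Gradient of a toy energy U(q) = 0.5 * sum(q_i^2).

    Hypothetical stand-in for the gradient of the learned energy function.
    """
    return list(q)


def leapfrog(q, p, dt=DT, niter=NITER, grad=grad_energy):
    """Advance (q, p) by `niter` leapfrog steps for H = U(q) + 0.5 * |p|^2."""
    q, p = list(q), list(p)
    for _ in range(niter):
        g = grad(q)
        p = [pi - 0.5 * dt * gi for pi, gi in zip(p, g)]  # half-step momentum kick
        q = [qi + dt * pi for qi, pi in zip(q, p)]         # full-step position drift
        g = grad(q)
        p = [pi - 0.5 * dt * gi for pi, gi in zip(p, g)]  # second half-step kick
    return q, p


def hamiltonian(q, p):
    """Total energy of the toy system: potential plus kinetic."""
    return 0.5 * sum(x * x for x in q) + 0.5 * sum(x * x for x in p)


q0, p0 = [1.0, -0.5], [0.3, 0.8]
q1, p1 = leapfrog(q0, p0)

# Leapfrog is time-reversible: negate the momentum, integrate again,
# and negate once more to recover the starting point.
q_back, p_neg = leapfrog(q1, [-x for x in p1])
p_back = [-x for x in p_neg]
```

In this sketch the map from `(q0, p0)` to `(q1, p1)` can be undone exactly (up to floating-point error) by integrating with the momentum negated, which is the property that makes leapfrog a natural choice for an invertible flow.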

### Training

Training used PyTorch DistributedDataParallel (DDP) on 64×8 GPUs with a per-GPU batch size of 30, and ran for 650 epochs. The optimizer and learning-rate scheduler parameters are given below:

```
  lr: 1e-4
  betas: [0.9, 0.95]
  weight_decay: 0.01
  warmup_epochs: 5
```
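
As a rough illustration of what these settings imply, the sketch below computes the global batch size and a warmup learning-rate curve. It assumes that "64×8" means 512 GPUs in total, that the warmup is linear, and that the learning rate is held constant afterwards; the model card states none of these, and the function name `warmup_lr` is hypothetical.

```python
N_GPUS = 64 * 8        # assumption: 512 GPUs in total
BATCH_PER_GPU = 30     # per-GPU batch size stated above
BASE_LR = 1e-4         # `lr` from the config
WARMUP_EPOCHS = 5      # `warmup_epochs` from the config

# Under DDP each GPU processes its own batch, so the global batch is the product.
global_batch = N_GPUS * BATCH_PER_GPU


def warmup_lr(epoch, base_lr=BASE_LR, warmup_epochs=WARMUP_EPOCHS):
    """Learning rate at a 0-indexed epoch under an assumed linear warmup.

    After warmup the rate is held at `base_lr`; the real schedule may decay.
    """
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    return base_lr


schedule = [warmup_lr(e) for e in range(7)]
```

Under these assumptions the global batch size is 15,360 sequences per optimizer step, and the learning rate reaches its base value of 1e-4 in the fifth epoch.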

## Citation

If you use ProtNHF in your research, please cite:

B. Raghavan and D. M. Rogers  
**ProtNHF: Neural Hamiltonian Flows for Controllable Protein Sequence Generation**  
arXiv:xxxx.xxxxx (2026)

```bibtex
@article{raghavan2026protnhf,
  title   = {ProtNHF: Neural Hamiltonian Flows for Controllable Protein Sequence Generation},
  author  = {Raghavan, Bharath and Rogers, David M.},
  journal = {arXiv preprint arXiv:xxxx.xxxxx},
  year    = {2026}
}
```

## License

The ProtNHF code and model weights are released under the BSD 3-Clause license.