---
license: cc-by-nc-nd-4.0
---

<h1 align="center">
  Minimal-Action Discrete Schrödinger Bridge Matching<br>
  for Peptide Sequence Design
</h1>


<div align="center">
  <a href="https://shreygoel09.github.io/" target="_blank">Shrey Goel</a><sup>1</sup>&ensp;<b>&middot;</b>&ensp;
  <a href="https://www.chatterjeelab.com/" target="_blank">Pranam Chatterjee</a><sup>2</sup>
  <br>
  <p style="font-size: 16px;">
  <sup>1</sup> Duke University &emsp;
  <sup>2</sup> University of Pennsylvania
  </p>
</div>
    
<div align="center">
 <a href="https://arxiv.org/abs/2601.22408v1"><img src="https://img.shields.io/badge/Arxiv-2601.22408-red?style=for-the-badge&logo=Arxiv" alt="arXiv"/></a>

</div>



![madsbm_gif](https://cdn-uploads.huggingface.co/production/uploads/64cd5b3f0494187a9e8b7c69/wHvtzQ5D_IaGklFE2zga6.gif)


Generative modeling of peptide sequences requires navigating a discrete and highly constrained space in which many intermediate states are chemically implausible or unstable. Existing discrete diffusion and flow-based methods rely on reversing fixed corruption processes or following prescribed probability paths, which can force generation through low-likelihood regions and require many sampling steps.

We introduce **Minimal-Action Discrete Schrödinger Bridge Matching (MadSBM)**, a rate-based generative framework for peptide design that formulates generation as a controlled continuous-time Markov process on the amino-acid edit graph. To produce probability trajectories that remain within high-likelihood sequence neighborhoods throughout generation, MadSBM:

1. Defines generation relative to a biologically informed reference process derived from pretrained protein language model logits.
2. Learns a time-dependent control field that biases transition rates to induce low-action transport paths from a masked prior to the data distribution.
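The two ingredients above can be illustrated with a toy sketch. It is not the MadSBM implementation: the uniform `ref_logits` stand in for protein language model logits, the hand-written `control` stands in for the learned control field, and the exponential tilting of rates, the `MASK` symbol, and all function names are assumptions made for illustration. For simplicity, the sketch only unmasks positions rather than walking the full amino-acid edit graph.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MASK = "_"  # hypothetical mask symbol for the prior state


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def controlled_rates(ref_logits, control, base_rate=1.0):
    """Reference jump rates (from logits) tilted by exp(control).

    The exponential tilt is an assumed parameterisation, not taken from the source.
    """
    ref_probs = softmax(ref_logits)
    return [base_rate * p * math.exp(u) for p, u in zip(ref_probs, control)]


def euler_step(seq, rates_per_pos, dt, rng):
    """One time step: each masked position jumps with prob 1 - exp(-total_rate * dt)."""
    out = list(seq)
    for i, ch in enumerate(seq):
        if ch != MASK:
            continue  # toy simplification: unmasked residues stay fixed
        rates = rates_per_pos[i]
        total = sum(rates)
        if rng.random() < 1.0 - math.exp(-total * dt):
            out[i] = rng.choices(AMINO_ACIDS, weights=rates, k=1)[0]
    return "".join(out)


def sample(length, steps, rng):
    """Simulate the toy controlled process from an all-mask sequence over t in [0, 1)."""
    seq = MASK * length
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        rates = []
        for _ in range(length):
            ref_logits = [0.0] * len(AMINO_ACIDS)  # stand-in for PLM logits
            # Toy time-dependent control: increasingly favour alanine as t grows.
            control = [t if aa == "A" else 0.0 for aa in AMINO_ACIDS]
            rates.append(controlled_rates(ref_logits, control, base_rate=5.0))
        seq = euler_step(seq, rates, dt, rng)
    return seq


print(sample(10, 100, random.Random(0)))
```

With a zero control, the rates reduce to the reference process scaled by `base_rate`. A non-zero control shifts probability mass toward the residues it favours without discarding the reference.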

Finally, we introduce an objective-guided sampling procedure that steers MadSBM generation toward specific functional targets. To our knowledge, this is the first application of discrete classifier guidance within a Schrödinger bridge-based generative framework.
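As a rough illustration of what classifier guidance on jump rates can look like, the sketch below reweights each candidate transition by a classifier likelihood ratio raised to a guidance scale. This is one common formulation of discrete guidance, and this README does not state whether MadSBM uses exactly this form. The function name `guided_rates`, its arguments, and the numbers in the usage lines are hypothetical.

```python
import math


def guided_rates(rates, log_p_now, log_p_after, scale=1.0):
    """Tilt jump rates toward states a classifier scores as more on-target.

    rates       : unguided jump rates, one per candidate next state
    log_p_now   : classifier log-probability of the target at the current state
    log_p_after : classifier log-probability of the target at each candidate state
    scale       : guidance strength (0 recovers the unguided rates)
    """
    return [
        r * math.exp(scale * (lp - log_p_now))
        for r, lp in zip(rates, log_p_after)
    ]


# Hypothetical numbers: the first candidate improves the target score, the second worsens it.
base = [1.0, 2.0]
print(guided_rates(base, log_p_now=-2.0, log_p_after=[-1.0, -3.0], scale=1.0))
```

Setting `scale` to zero leaves the rates unchanged. Larger values trade fidelity to the unguided process for stronger pursuit of the functional target.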


## **Repository Authors**
- <u>[Shrey Goel](https://shreygoel09.github.io/)</u> – undergraduate student at Duke University  
- <u>[Pranam Chatterjee](mailto:pranam@seas.upenn.edu)</u> – Assistant Professor at the University of Pennsylvania