---
language:
- en
library_name: transformers
pipeline_tag: text2text-generation
license: mit
metrics:
- fid
- cmmd
- lpips
- accuracy
- bleu
- nll
tags:
- text2text
- image2image
- domain_translation
- optimal_transport
- discrete_diffusion
- schrödinger_bridge
---
# Categorical Schrödinger Bridge Matching (CSBM)
[Grigoriy Ksenofontov](https://scholar.google.com/citations?user=e0mirzYAAAAJ),
[Alexander Korotin](https://scholar.google.ru/citations?user=1rIIvjAAAAAJ)
[arXiv](https://arxiv.org/abs/2502.01416)
[OpenReview](https://openreview.net/forum?id=RBly0nOr2h)
[GitHub](https://github.com/gregkseno/csbm)
[Hugging Face](https://huggingface.co/gregkseno/csbm)
[Weights & Biases](https://wandb.ai/gregkseno/csbm)
This repository hosts the official checkpoints for the paper "Categorical Schrödinger Bridge Matching", accepted at ICML 2025.
## 📌 TL;DR
This paper extends the Schrödinger Bridge problem to discrete time and discrete state spaces.
## 💾 Checkpoints
### CSBM
| Dataset | Reference Process | α | N | Saved Iteration |
| ------------- | ----------------- | ----------- | --------------------- | --------------- |
| Colored MNIST | **gaussian** | 0.01 | 2, 4, 10, 25, 50, 100 | 3 |
| Colored MNIST | **uniform** | 0.01, 0.05 | 25 | 3 |
| CelebA | **uniform** | 0.01, 0.005 | 100 | 4 |
| Amazon Reviews | **uniform** | 0.01, 0.005 | 100 | 5 |
> [!NOTE]
> Each experiment directory includes a `config.yaml` file with the full training configuration.
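
To find the experiment directories and read their configurations programmatically, something like the following sketch may help. It assumes the `huggingface_hub` and `PyYAML` packages are installed; the helper names (`experiment_dirs`, `list_experiments`, `load_config`) are hypothetical and not part of the CSBM codebase, and the directory layout is discovered at run time rather than assumed.

```python
from pathlib import PurePosixPath

REPO_ID = "gregkseno/csbm"  # Hub repository linked from this card


def experiment_dirs(repo_files):
    """Return the sorted directories that directly contain a config.yaml."""
    dirs = set()
    for name in repo_files:
        path = PurePosixPath(name)
        if path.name == "config.yaml":
            dirs.add(str(path.parent))
    return sorted(dirs)


def list_experiments(repo_id=REPO_ID):
    """List experiment directories on the Hub (needs network access)."""
    # Imported lazily so experiment_dirs works without huggingface_hub installed.
    from huggingface_hub import list_repo_files

    return experiment_dirs(list_repo_files(repo_id))


def load_config(experiment_dir, repo_id=REPO_ID):
    """Download one experiment's config.yaml and parse it into Python objects."""
    import yaml
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(repo_id=repo_id, filename=f"{experiment_dir}/config.yaml")
    with open(path, encoding="utf-8") as handle:
        return yaml.safe_load(handle)
```

Calling `list_experiments()` and then `load_config(...)` on one of the returned directories should give the training configuration for that run; the keys inside it depend on the authors' configuration schema, which is not described here.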
### Additional Components
1. `vqgan_celeba_f8_1024.ckpt` — **VQ-GAN** pretrained on the CelebA dataset
2. `tokenizer_amazon.json` — **Tokenizer** trained on the Amazon Reviews dataset
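
The two auxiliary files can be fetched with `huggingface_hub`, as in the minimal sketch below. The file names come from the list above, but their location at the repository root is an assumption, and the helper names (`fetch_component`, `load_amazon_tokenizer`) are hypothetical. Loading the tokenizer with `Tokenizer.from_file` assumes the JSON file was saved by the Hugging Face Tokenizers library.

```python
from pathlib import Path

REPO_ID = "gregkseno/csbm"  # Hub repository linked from this card

# File names as listed above; placing them at the repository root is an assumption.
COMPONENTS = {
    "vqgan": "vqgan_celeba_f8_1024.ckpt",
    "tokenizer": "tokenizer_amazon.json",
}


def fetch_component(name, repo_id=REPO_ID):
    """Download one auxiliary file and return its path in the local Hub cache."""
    if name not in COMPONENTS:
        raise KeyError(f"unknown component {name!r}; expected one of {sorted(COMPONENTS)}")
    # Imported lazily so this module can be imported without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return Path(hf_hub_download(repo_id=repo_id, filename=COMPONENTS[name]))


def load_amazon_tokenizer(repo_id=REPO_ID):
    """Fetch tokenizer_amazon.json and load it with the Tokenizers library."""
    from tokenizers import Tokenizer

    return Tokenizer.from_file(str(fetch_component("tokenizer", repo_id)))
```

For example, `load_amazon_tokenizer().encode("works as described").tokens` would show how a review is split into tokens. The VQ-GAN checkpoint is only downloaded here; restoring it requires the model definition from the CSBM code repository.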
## 🎓 Citation
```bibtex
@inproceedings{
ksenofontov2025categorical,
title={Categorical {Schr\"odinger} Bridge Matching},
author={Grigoriy Ksenofontov and Alexander Korotin},
booktitle={Forty-second International Conference on Machine Learning},
year={2025},
url={https://openreview.net/forum?id=RBly0nOr2h}
}
```
## 🙏 Credits
- [Weights & Biases](https://wandb.ai) — experiment-tracking and visualization toolkit;
- [Hugging Face](https://huggingface.co) — Tokenizers and Accelerate libraries for tokenizer implementation, parallel training, and checkpoint hosting on the Hub;
- [D3PM](https://github.com/google-research/google-research/tree/master/d3pm) — reference implementation of discrete-diffusion models;
- [Taming Transformers](https://github.com/CompVis/taming-transformers) — original VQ-GAN codebase;
- [VQ-Diffusion](https://github.com/microsoft/VQ-Diffusion) — vector-quantized diffusion architecture;
- [MDLM](https://github.com/kuleshov-group/mdlm) — diffusion architecture for text-generation experiments;
- [ASBM](https://arxiv.org/abs/2405.14449) — evaluation metrics and baseline models for CelebA face transfer;
- [Balancing the Style-Content Trade-Off in Sentiment Transfer Using Polarity-Aware Denoising](https://arxiv.org/abs/2312.14708) — processed Amazon Reviews dataset and sentiment-transfer baselines;
- [Inkscape](https://inkscape.org/) — an excellent open-source editor for vector graphics.