---
language:
- en
library_name: transformers
pipeline_tag: text2text-generation
license: mit
metrics:
- fid
- cmmd
- lpips
- accuracy
- bleu
- nll
tags:
- text2text
- image2image
- domain_translation
- optimal_transport
- discrete_diffusion
- schrödinger_bridge
---

<div align="center">

# Categorical Schrödinger Bridge Matching (CSBM)

[Grigoriy Ksenofontov](https://scholar.google.com/citations?user=e0mirzYAAAAJ),
[Alexander Korotin](https://scholar.google.ru/citations?user=1rIIvjAAAAAJ)

[arXiv](https://arxiv.org/abs/2502.01416)
[OpenReview](https://openreview.net/forum?id=RBly0nOr2h)
[GitHub](https://github.com/gregkseno/csbm)
[Hugging Face](https://huggingface.co/gregkseno/csbm)
[Weights & Biases](https://wandb.ai/gregkseno/csbm)

</div>

This repository hosts the official checkpoints for the paper "Categorical Schrödinger Bridge Matching", accepted at ICML 2025.

## 📌 TL;DR

This paper extends the Schrödinger Bridge problem to discrete time and discrete state spaces.

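
For orientation, the static Schrödinger Bridge problem can be written as follows. The notation is an assumption of this sketch and is not taken from the paper: $p_0$ and $p_1$ are the source and target distributions over a finite state space $\mathbb{X}$, $\Pi(p_0, p_1)$ is the set of joint distributions with those marginals, and $q^{\text{ref}}$ is the joint distribution of the endpoints under a chosen reference process.

$$
q^{*} = \arg\min_{q \in \Pi(p_0, p_1)} \mathrm{KL}\left(q \,\|\, q^{\text{ref}}\right),
\qquad
\mathrm{KL}\left(q \,\|\, q^{\text{ref}}\right) = \sum_{x_0, x_1 \in \mathbb{X}} q(x_0, x_1) \log \frac{q(x_0, x_1)}{q^{\text{ref}}(x_0, x_1)}
$$

In words, the bridge is the coupling of the two domains that stays closest, in KL divergence, to the reference process while matching both marginals.
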
## 💾 Checkpoints

### CSBM

| Dataset       | Reference Process | α           | N                     | Saved Iteration |
| ------------- | ----------------- | ----------- | --------------------- | --------------- |
| Colored MNIST | **gaussian**      | 0.01        | 2, 4, 10, 25, 50, 100 | 3               |
| Colored MNIST | **uniform**       | 0.01, 0.05  | 25                    | 3               |
| CelebA        | **uniform**       | 0.01, 0.005 | 100                   | 4               |
| Amazon Review | **uniform**       | 0.01, 0.005 | 100                   | 5               |

> [!NOTE]
> Each experiment directory includes a `config.yaml` file with the full training configuration.

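
A minimal sketch of how the checkpoints and their configurations might be fetched and located. It assumes the `huggingface_hub` package is installed and that the repository id matches the Hugging Face link on this card. The helper names `download_checkpoints` and `find_configs` are hypothetical, and the directory layout inside the repository is not assumed beyond the presence of `config.yaml` files.

```python
from pathlib import Path

# Repository id taken from the Hugging Face link on this card.
REPO_ID = "gregkseno/csbm"


def download_checkpoints(local_dir="checkpoints", allow_patterns=None):
    """Download the repository (or a filtered subset) and return its local path.

    Hypothetical helper: it only wraps huggingface_hub.snapshot_download.
    """
    # Imported lazily so the helpers below work without the dependency.
    from huggingface_hub import snapshot_download

    return Path(
        snapshot_download(
            repo_id=REPO_ID,
            local_dir=local_dir,
            allow_patterns=allow_patterns,
        )
    )


def find_configs(root):
    """Return every config.yaml found under root, sorted by path."""
    return sorted(Path(root).rglob("config.yaml"))


# Example (requires network access):
# root = download_checkpoints(allow_patterns=["*config.yaml"])
# for cfg in find_configs(root):
#     print(cfg.parent)  # one experiment directory per config.yaml
```

Passing `allow_patterns` keeps the download small when only the configurations are needed; omitting it fetches the whole repository, including the model weights.
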
### Additional Components

1. `vqgan_celeba_f8_1024.ckpt`: **VQ-GAN** pretrained on the CelebA dataset
2. `tokenizer_amazon.json`: **Tokenizer** trained on the Amazon Reviews dataset

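
The two files listed above could be retrieved roughly as follows. This is a sketch under stated assumptions: the files are assumed to sit at the root of the `gregkseno/csbm` repository, `tokenizer_amazon.json` is assumed to be a Hugging Face Tokenizers file, and `fetch_component` and `load_amazon_tokenizer` are hypothetical helper names. It needs the `huggingface_hub` and `tokenizers` packages at call time.

```python
# Repository id taken from the Hugging Face link on this card.
REPO_ID = "gregkseno/csbm"

# File names exactly as listed on this card; their location at the
# repository root is an assumption of this sketch.
COMPONENTS = {
    "vqgan_celeba_f8_1024.ckpt": "VQ-GAN pretrained on CelebA",
    "tokenizer_amazon.json": "tokenizer trained on Amazon Reviews",
}


def fetch_component(filename):
    """Download one auxiliary file from the Hub and return its local path."""
    if filename not in COMPONENTS:
        raise ValueError(f"unknown component: {filename!r}")
    # Lazy import: only needed when a download is actually requested.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=filename)


def load_amazon_tokenizer():
    """Load the tokenizer, assuming a Hugging Face Tokenizers JSON file."""
    from tokenizers import Tokenizer

    return Tokenizer.from_file(fetch_component("tokenizer_amazon.json"))
```

With network access, `load_amazon_tokenizer().encode("great product").ids` would return the token ids for a sample review. Loading the VQ-GAN checkpoint is left out because it depends on the model code in the project repository.
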
## 🎓 Citation

```bibtex
@inproceedings{
ksenofontov2025categorical,
title={Categorical {Schr\"odinger} Bridge Matching},
author={Grigoriy Ksenofontov and Alexander Korotin},
booktitle={Forty-second International Conference on Machine Learning},
year={2025},
url={https://openreview.net/forum?id=RBly0nOr2h}
}
```

## 🙏 Credits

- [Weights & Biases](https://wandb.ai): experiment-tracking and visualization toolkit;
- [Hugging Face](https://huggingface.co): Tokenizers and Accelerate libraries for tokenizer implementation, parallel training, and checkpoint hosting on the Hub;
- [D3PM](https://github.com/google-research/google-research/tree/master/d3pm): reference implementation of discrete-diffusion models;
- [Taming Transformers](https://github.com/CompVis/taming-transformers): original VQ-GAN codebase;
- [VQ-Diffusion](https://github.com/microsoft/VQ-Diffusion): vector-quantized diffusion architecture;
- [MDLM](https://github.com/kuleshov-group/mdlm): diffusion architecture for text-generation experiments;
- [ASBM](https://arxiv.org/abs/2405.14449): evaluation metrics and baseline models for CelebA face transfer;
- [Balancing the Style-Content Trade-Off in Sentiment Transfer Using Polarity-Aware Denoising](https://arxiv.org/abs/2312.14708): processed Amazon Reviews dataset and sentiment-transfer baselines;
- [Inkscape](https://inkscape.org/): an excellent open-source editor for vector graphics.