nielsr HF Staff committed on
Commit 2d1bc50 · verified · 1 Parent(s): a531d9b

Add model card for SEPO

This PR adds a comprehensive model card for the SEPO model, linking it to the paper [Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods](https://huggingface.co/papers/2502.01384).

It sets `pipeline_tag: text-generation` in the card metadata so the model is discoverable, and adds a link to the official GitHub repository and a sample usage snippet for downloading the model checkpoint from the Hub.
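For reviewers who want to confirm that the metadata block is picked up, the following is a minimal sketch of a front-matter check. The helper `read_front_matter` is hypothetical (it is not part of `huggingface_hub`) and only handles flat `key: value` lines, which is all this card uses.

```python
def read_front_matter(card_text: str) -> dict[str, str]:
    """Parse the leading `---` block of a model card into a flat dict.

    Hypothetical helper for illustration: it only understands simple
    `key: value` lines, not nested YAML.
    """
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # the card does not start with a metadata block
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as no front matter


# The first lines of the README added by this PR.
card = "\n".join(
    [
        "---",
        "pipeline_tag: text-generation",
        "---",
        "",
        "# Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods",
    ]
)

meta = read_front_matter(card)
print(meta.get("pipeline_tag"))
```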

Please review and merge this PR if everything looks good.

Files changed (1)
  1. README.md +51 -0
README.md ADDED
@@ -0,0 +1,51 @@
+ ---
+ pipeline_tag: text-generation
+ ---
+
+ # Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods
+
+ This repository hosts model checkpoints fine-tuned with the `SEPO` algorithm presented in the paper: [Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods](https://huggingface.co/papers/2502.01384).
+
+ `SEPO` (Score Entropy Policy Optimization) is an efficient, broadly applicable, and theoretically justified policy gradient algorithm for fine-tuning discrete diffusion models over non-differentiable rewards. Numerical experiments across several discrete generative tasks, including fine-tuning a masked diffusion language model on DNA sequences, demonstrate the scalability and efficiency of the method.
+
+ <p align="center">
+ <img src="https://github.com/ozekri/SEPO/raw/main/img/denoising_RLHF.gif" width=80% height=80% alt="Denoising RLHF process visualization">
+ </p>
+
+ For more details and the full implementation, please refer to the [official GitHub repository](https://github.com/ozekri/SEPO).
+
+ ## Sample Usage: Download Checkpoint
+
+ To reproduce the results, download the fine-tuned checkpoints from the Hugging Face Hub with the `huggingface_hub` Python library:
+
+ ```python
+ from huggingface_hub import hf_hub_download
+
+ # Example: download the SEPO fine-tuned model checkpoint
+ ckpt_path = hf_hub_download(
+     repo_id="Xssama/SEPO_DNA",
+     filename="finetuned_sepo_kl.ckpt",  # use finetuned_sepo_kl_gf.ckpt for SEPO with gradient flow
+     cache_dir="./checkpoints",  # optional: cache root; the file is stored in a nested snapshot folder below it
+ )
+
+ print(f"Checkpoint downloaded to: {ckpt_path}")
+ ```
+
+ Alternatively, you can use `wget`:
+
+ ```bash
+ wget https://huggingface.co/Xssama/SEPO-DNA/resolve/main/finetuned_sepo_kl.ckpt -P ./checkpoints/
+ ```
+
+ ## Citation
+
+ If you find this work useful in your research, please consider citing:
+
+ ```bibtex
+ @article{zekri2025fine,
+   title={Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods},
+   author={Zekri, Oussama and Boull{\'e}, Nicolas},
+   journal={arXiv preprint arXiv:2502.01384},
+   year={2025}
+ }
+ ```
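
The two download routes in the README should point at the same file, but the Python snippet uses the repo id `Xssama/SEPO_DNA` while the `wget` URL uses `Xssama/SEPO-DNA`. The sketch below makes the comparison explicit. It assumes the standard Hub URL layout `https://huggingface.co/<repo_id>/resolve/<revision>/<filename>`. `resolve_url` is a hypothetical helper written for this check, not a library function, and the sketch does not determine which of the two repo ids is the correct one.

```python
HUB_ENDPOINT = "https://huggingface.co"


def resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct-download URL for a file in a Hub model repo.

    Hypothetical helper: mirrors the URL layout used in the `wget` example.
    """
    return f"{HUB_ENDPOINT}/{repo_id}/resolve/{revision}/{filename}"


# Values copied from the two README snippets.
python_repo_id = "Xssama/SEPO_DNA"  # from the hf_hub_download call
wget_url = "https://huggingface.co/Xssama/SEPO-DNA/resolve/main/finetuned_sepo_kl.ckpt"

url_from_python_snippet = resolve_url(python_repo_id, "finetuned_sepo_kl.ckpt")
same_target = url_from_python_snippet == wget_url

print(url_from_python_snippet)
print("same target:", same_target)
```

If the comparison reports a mismatch, one of the two snippets needs its repo id corrected before merging.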