TaliDror committed on
Commit
25484d6
1 Parent(s): 9d9d381

Update README.md

  1. README.md +55 -3
README.md CHANGED
@@ -1,3 +1,55 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+ # Token-Based Audio Inpainting via Discrete Diffusion (AIDD)
+
+ Pretrained model weights for **AIDD**, introduced in:
+
+ **Token-Based Audio Inpainting via Discrete Diffusion**
+ ICLR 2026
+ https://arxiv.org/abs/2507.08333
+
+ AIDD performs audio inpainting by applying diffusion in a discrete token space, enabling semantically coherent reconstruction of missing audio segments, including long gaps of up to 750 ms.
+
+ ---
+
+ ## Model
+
+ The model operates on discrete audio tokens produced by a pretrained WavTokenizer and performs inpainting using a Diffusion Transformer (DiT) trained with a discrete diffusion objective. The training incorporates span-based masking to model structured missing regions and a derivative-based regularization loss that encourages smooth temporal dynamics in token embedding space. The model is designed for restoring missing segments in musical audio, including long gaps.
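The span-based masking and derivative-based regularization described in the Model paragraph can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the token rate (75 tokens per second), the vocabulary size, the extra mask-token id, and the choice of a first-order finite difference as the "derivative" are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def sample_span_mask(num_tokens, span_len, rng):
    """Return a boolean mask with one contiguous span set to True (= missing)."""
    if not 0 < span_len <= num_tokens:
        raise ValueError("span_len must be in (0, num_tokens]")
    # Upper bound is exclusive, so the span always fits inside the sequence.
    start = int(rng.integers(0, num_tokens - span_len + 1))
    mask = np.zeros(num_tokens, dtype=bool)
    mask[start:start + span_len] = True
    return mask


def apply_mask(tokens, mask, mask_id):
    """Replace masked positions with a dedicated mask-token id."""
    return np.where(mask, mask_id, tokens)


def first_difference_penalty(pred_emb, target_emb):
    """Mean squared error between the first-order temporal differences of
    two (T, D) embedding sequences. One possible reading of a
    'derivative-based' smoothness term, not the paper's exact loss."""
    d_pred = np.diff(pred_emb, axis=0)
    d_target = np.diff(target_emb, axis=0)
    return float(np.mean((d_pred - d_target) ** 2))


# Assumed values, chosen only for this example.
TOKENS_PER_SECOND = 75      # assumption: tokenizer frame rate
VOCAB_SIZE = 4096           # assumption: codebook size
MASK_ID = VOCAB_SIZE        # assumption: one extra id reserved for [MASK]

rng = np.random.default_rng(0)
tokens = rng.integers(0, VOCAB_SIZE, size=3 * TOKENS_PER_SECOND)  # 3 s of tokens

# A 750 ms gap covers 0.75 * 75 = 56.25 token frames, rounded to 56.
gap_tokens = round(0.750 * TOKENS_PER_SECOND)
mask = sample_span_mask(len(tokens), gap_tokens, rng)
corrupted = apply_mask(tokens, mask, MASK_ID)
```

Masking one contiguous span, rather than independent random positions, matches the shape of a real dropout gap, which is why the training-time corruption resembles the inference-time task.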
+
+ ---
+
+ ## Usage
+
+ This repository provides **model weights only**.
+ For code, see the official GitHub repository:
+
+ 👉 https://github.com/iftachShoham/AIDD
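Since the repository holds weights only, one way to fetch them locally is `snapshot_download` from the `huggingface_hub` package. The sketch below is an assumption about workflow, not something the README prescribes: the helper name `fetch_weights` and the default directory are hypothetical, and the repository id must be supplied by the reader because it is not stated in this diff.

```python
from pathlib import Path


def fetch_weights(repo_id, local_dir="aidd_weights"):
    """Download every file of a Hub model repository into local_dir.

    repo_id is the "<namespace>/<name>" of the weights repository; it is
    not given in this README, so the caller has to provide it.
    """
    # Imported lazily so the module loads without the dependency installed
    # (pip install huggingface_hub).
    from huggingface_hub import snapshot_download

    path = snapshot_download(repo_id=repo_id, local_dir=local_dir)
    return Path(path)


# Example call (placeholder id, replace before running):
# weights_dir = fetch_weights("<namespace>/<model-repo>")
# print(sorted(p.name for p in weights_dir.iterdir()))
```

Loading the downloaded checkpoint into the model is handled by the code in the GitHub repository linked above.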
+
+ ---
+
+ ## Data & Evaluation
+
+ Trained and evaluated on **MusicNet** and **MAESTRO**, using FAD, LSD, ODG, and MOS metrics.
+ See the paper for full details.
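Of the metrics listed, LSD (log-spectral distance) is simple enough to sketch directly. The version below uses a common definition: the per-frame root-mean-square difference of log power spectra, averaged over frames. The frame size, hop, window, and 24 kHz sample rate are assumptions for illustration; the paper's exact evaluation settings may differ.

```python
import numpy as np


def stft_power(x, n_fft=1024, hop=256):
    """Power spectrogram of shape (frames, n_fft // 2 + 1), Hann window."""
    if len(x) < n_fft:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def log_spectral_distance(ref, est, eps=1e-10):
    """Mean over frames of the RMS difference between log10 power spectra."""
    if len(ref) != len(est):
        raise ValueError("signals must have the same length")
    diff = np.log10(stft_power(ref) + eps) - np.log10(stft_power(est) + eps)
    return float(np.mean(np.sqrt(np.mean(diff ** 2, axis=1))))


# Synthetic check: a 440 Hz tone against a noisy copy of itself.
sr = 24000                                  # assumed sample rate
t = np.arange(sr) / sr
ref = np.sin(2 * np.pi * 440.0 * t)
rng = np.random.default_rng(0)
est = ref + 0.05 * rng.normal(size=sr)

lsd_same = log_spectral_distance(ref, ref)   # identical signals give 0.0
lsd_noisy = log_spectral_distance(ref, est)  # larger than 0 for the noisy copy
```

Lower values indicate a reconstruction whose spectrum is closer to the reference.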
+
+ ---
+
+ ## Acknowledgments
+
+ Built upon
+ [Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution](https://github.com/louaaron/Score-Entropy-Discrete-Diffusion.git) and
+ [WavTokenizer: An Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling](https://github.com/jishengpeng/WavTokenizer.git).
+ We thank the authors for making their work publicly available.
+
+ ---
+
+ ## Citation
+
+ ```bibtex
+ @article{dror2025token,
+ title={Token-based Audio Inpainting via Discrete Diffusion},
+ author={Dror, Tali and Shoham, Iftach and Buchris, Moshe and Gal, Oren and Permuter, Haim and Katz, Gilad and Nachmani, Eliya},
+ journal={arXiv preprint arXiv:2507.08333},
+ year={2025}
+ }