lijiang committed · Commit 2233ee1 · verified · 1 Parent(s): 71ca865

Update README.md

  1. README.md +1 -5
README.md CHANGED
@@ -12,10 +12,6 @@ Omni-Diffusion is the first any-to-any multimodal language model built entirely
  - **Project Page:** [https://omni-diffusion.github.io](https://omni-diffusion.github.io)
  - **Repository:** [https://github.com/VITA-MLLM/Omni-Diffusion](https://github.com/VITA-MLLM/Omni-Diffusion)
 
- ## Model Description
-
- Omni-Diffusion employs a unified mask-based discrete diffusion model to capture the joint distribution over discrete multimodal tokens. This approach supports not only bimodal tasks (such as text-to-image or speech-to-text) but also more complex scenarios involving multiple modalities simultaneously, such as spoken visual question answering. On a diverse set of benchmarks, the method outperforms or performs on par with existing multimodal systems, highlighting the potential of diffusion models for multimodal foundation models.
-
  ## Usage
 
  As the model uses a custom architecture, it can be loaded using the `transformers` library with `trust_remote_code=True`:
@@ -35,7 +31,7 @@ If you find this work helpful for your research, please consider citing:
  ```bibtex
  @article{li2026omni,
  title={Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion},
- author={Li, Lijiang and Long, Zuwei) and Shen, Yunhang and Gao, Heting and Cao, Haoyu and Sun, Xing and Shan, Caifeng and He, Ran and Fu, Chaoyou},
+ author={Li, Lijiang and Long, Zuwei and Shen, Yunhang and Gao, Heting and Cao, Haoyu and Sun, Xing and Shan, Caifeng and He, Ran and Fu, Chaoyou},
  journal={arXiv preprint arXiv:2603.06577},
  year={2026}
  }