---
library_name: transformers
tags:
- multimodal
- reasoning
- sft
- rl
datasets:
- LightChen2333/M3CoT
- ModalityDance/Omni-Bench
base_model:
- GAIR/Anole-7b-v0.1
license: mit
---

# Omni-R1-Zero

Omni-R1-Zero is trained without any multimodal annotations. It bootstraps step-wise visualizations from text-only chain-of-thought (CoT) seeds, then follows the SFT→RL recipe to learn interleaved multimodal reasoning.

<p align="center">
<a href="https://arxiv.org/abs/2601.09536"><b>Paper</b>👁️</a> ·
<a href="https://github.com/ModalityDance/Omni-R1"><b>Code</b>🐙</a> ·
<a href="https://huggingface.co/datasets/ModalityDance/Omni-Bench"><b>Omni-Bench</b>🧪</a>
</p>

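## Usage

The base model, GAIR/Anole-7b-v0.1, builds on the Chameleon architecture, so the checkpoint should load with the standard `transformers` Chameleon classes. Below is a minimal inference sketch, assuming the weights are hosted at the hypothetical repo id `ModalityDance/Omni-R1-Zero` and remain Chameleon-compatible; check the project repo for the exact id and recommended settings.

```python
# Minimal inference sketch. Assumptions: the checkpoint lives at the
# hypothetical repo id "ModalityDance/Omni-R1-Zero" and loads with the
# standard transformers Chameleon classes used by the Anole-7b base.
import requests
import torch
from PIL import Image
from transformers import ChameleonForConditionalGeneration, ChameleonProcessor

model_id = "ModalityDance/Omni-R1-Zero"  # assumption: the real repo id may differ
processor = ChameleonProcessor.from_pretrained(model_id)
model = ChameleonForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# One image plus a reasoning prompt; "<image>" marks where the image is inserted.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
prompt = "Let's think step by step: how many animals are in this picture?<image>"

inputs = processor(text=prompt, images=image, return_tensors="pt").to(
    model.device, dtype=torch.bfloat16
)
output_ids = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```

Note that the stock `transformers` Chameleon integration generates text tokens only; reproducing the interleaved step-wise visualizations described above likely requires the inference code in the project repo linked above.
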
## Citation

```bibtex
@misc{cheng2026omnir1unifiedgenerativeparadigm,
      title={Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning},
      author={Dongjie Cheng and Yongqi Li and Zhixin Ma and Hongru Cai and Yupeng Hu and Wenjie Wang and Liqiang Nie and Wenjie Li},
      year={2026},
      eprint={2601.09536},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2601.09536},
}
```