huckiyang committed on
Commit 7b3777b · verified · 1 Parent(s): c48c32c

Update README.md

  1. README.md +89 -1
README.md CHANGED
@@ -1,6 +1,94 @@
  ---
- license: other
  library_name: transformers
+ license: apache-2.0
+ tags:
+ - omni-modal
+ - multimodal
+ - vision
+ - audio
+ - video
+ - llm
+ model-index:
+ - name: OmniVinci
+   results:
+   - task:
+       type: image-to-text
+       name: Image Understanding
+     dataset:
+       name: MVBench
+       type: mvbench
+     metrics:
+     - name: MVBench Score
+       type: accuracy
+       value: 70.6
+     source:
+       name: OmniVinci Technical Report
+       url: https://arxiv.org/abs/2510.15870
+   - task:
+       type: video-to-text
+       name: Video Understanding
+     dataset:
+       name: Video-MME
+       type: video-mme
+     metrics:
+     - name: Video-MME (w/o sub)
+       type: accuracy
+       value: 68.2
+     source:
+       name: OmniVinci Technical Report
+       url: https://arxiv.org/abs/2510.15870
+   - task:
+       type: video-to-text
+       name: Cross-Modal Understanding
+     dataset:
+       name: DailyOmni
+       type: dailyomni
+     metrics:
+     - name: DailyOmni Score
+       type: accuracy
+       value: 66.5
+     source:
+       name: OmniVinci Technical Report
+       url: https://arxiv.org/abs/2510.15870
+   - task:
+       type: audio-to-text
+       name: Audio Understanding
+     dataset:
+       name: MMAR
+       type: mmar
+     metrics:
+     - name: MMAR Score
+       type: accuracy
+       value: 58.4
+     source:
+       name: OmniVinci Technical Report
+       url: https://arxiv.org/abs/2510.15870
+   - task:
+       type: audio-to-text
+       name: Audio-Only Reasoning
+     dataset:
+       name: MMAU
+       type: mmau
+     metrics:
+     - name: MMAU Score
+       type: accuracy
+       value: 71.6
+     source:
+       name: OmniVinci Technical Report
+       url: https://arxiv.org/abs/2510.15870
+   - task:
+       type: video-to-text
+       name: Multi-Modal Reasoning
+     dataset:
+       name: Worldsense
+       type: worldsense
+     metrics:
+     - name: Worldsense Score
+       type: accuracy
+       value: 48.2
+     source:
+       name: OmniVinci Technical Report
+       url: https://arxiv.org/abs/2510.15870
  ---
  # <span style="background: linear-gradient(45deg, #667eea 0%, #764ba2 25%, #f093fb 50%, #f5576c 75%, #4facfe 100%); -webkit-background-clip: text; -webkit-text-fill-color: transparent; background-clip: text; font-weight: bold; font-size: 1.1em;">**OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM**</span> <br />
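
The `model-index` block added in this commit pairs each benchmark dataset with one reported accuracy value. As a rough illustration of how a consumer might read those pairs back out, here is a minimal standard-library sketch. It embeds only two of the six results. The nesting in the excerpt is an assumption based on the usual Hugging Face `model-index` layout, and `collect_scores` is a hypothetical helper, not part of any library. A real tool would more likely use a YAML parser.

```python
# Excerpt of the front matter added in this commit (two of the six results).
# The indentation is assumed; it follows the usual model-index nesting.
FRONT_MATTER = """\
model-index:
- name: OmniVinci
  results:
  - task:
      type: video-to-text
      name: Video Understanding
    dataset:
      name: Video-MME
      type: video-mme
    metrics:
    - name: Video-MME (w/o sub)
      type: accuracy
      value: 68.2
  - task:
      type: audio-to-text
      name: Audio-Only Reasoning
    dataset:
      name: MMAU
      type: mmau
    metrics:
    - name: MMAU Score
      type: accuracy
      value: 71.6
"""


def collect_scores(front_matter: str) -> dict[str, float]:
    """Map each dataset name to its reported metric value.

    Hypothetical helper: a plain line scan that relies on the key order
    shown above (dataset name first, metric value later). It is not a
    general YAML parser.
    """
    scores: dict[str, float] = {}
    section = None        # last block key seen: task, dataset, metrics, source
    dataset_name = None
    for raw in front_matter.splitlines():
        line = raw.strip()
        if line.startswith("- "):
            line = line[2:].strip()  # drop the list-item marker
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key in ("task", "dataset", "metrics", "source") and not value:
            section = key
        elif section == "dataset" and key == "name":
            dataset_name = value
        elif section == "metrics" and key == "value" and dataset_name:
            scores[dataset_name] = float(value)
    return scores


scores = collect_scores(FRONT_MATTER)
for name, value in scores.items():
    print(f"{name}: {value}")
# prints:
# Video-MME: 68.2
# MMAU: 71.6
```

The scan tracks which block it is in so that the `name` under `dataset` is not confused with the `name` under `task` or `metrics`.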