Update README.md
README.md
CHANGED
|
@@ -41,8 +41,22 @@ We argue that the key to advancing towards AGI lies in the synergy effect, a ca
|
|
| 41 |
<img src='https://cdn-uploads.huggingface.co/production/uploads/647773a1168cb428e00e9a8f/-Asn68kJGjgqbGqZMrk4E.png' width=950px>
|
| 42 |
</div>
|
| 43 |
|
| 44 |
---
|
| 45 |
-
|
| 46 |
|
| 47 |
<div align="center">
|
| 48 |
<img src='https://cdn-uploads.huggingface.co/production/uploads/647773a1168cb428e00e9a8f/32goE-PYuwOwRvYg4GcfK.png' width=900px>
|
|
@@ -51,10 +65,10 @@ We argue that the key to advancing towards AGI lies in the synergy effect, a ca
|
|
| 51 |
|
| 52 |
---
|
| 53 |
|
| 54 |
-
|
| 55 |
|
| 56 |
-
|
| 57 |
-
|
| 58 |
|
| 59 |
|
| 60 |
<div align="center">
|
|
@@ -73,7 +87,16 @@ This project introduces **General-Level** and **General-Bench**.
|
|
| 73 |
|
| 74 |
|
| 75 |
---
|
| 76 |
-
|
| 77 |
|
| 78 |
<div align="center">
|
| 79 |
<img src='https://cdn-uploads.huggingface.co/production/uploads/647773a1168cb428e00e9a8f/d4TIWw3rlWuxpBCEpHYJB.jpeg'>
|
|
|
|
| 41 |
<img src='https://cdn-uploads.huggingface.co/production/uploads/647773a1168cb428e00e9a8f/-Asn68kJGjgqbGqZMrk4E.png' width=950px>
|
| 42 |
</div>
|
| 43 |
|
| 44 |
+
|
| 45 |
+
---
|
| 46 |
+
|
| 47 |
+
This project introduces **General-Level** and **General-Bench**.
|
| 48 |
+
|
| 49 |
+
---
|
| 50 |
+
|
| 51 |
+
## Keypoints
|
| 52 |
+
|
| 53 |
+
- [Overall Leaderboard](#leaderboard)
|
| 54 |
+
- [General-Level](#level)
|
| 55 |
+
- [General-Bench](#bench)
|
| 56 |
+
|
| 57 |
---
|
| 58 |
+
|
| 59 |
+
# Overall Leaderboard<a name="leaderboard" />
|
| 60 |
|
| 61 |
<div align="center">
|
| 62 |
<img src='https://cdn-uploads.huggingface.co/production/uploads/647773a1168cb428e00e9a8f/32goE-PYuwOwRvYg4GcfK.png' width=900px>
|
|
|
|
| 65 |
|
| 66 |
---
|
| 67 |
|
| 68 |
+
# General-Level<a name="level" />
|
| 69 |
|
| 70 |
+
**A five-level evaluation system that introduces a new norm for assessing multimodal generalists (multimodal LLMs/agents).
|
| 71 |
+
The core is the use of <b style="color:red">synergy</b> as the evaluative criterion, categorizing capabilities based on whether MLLMs preserve synergy across comprehension and generation, as well as across multimodal interactions.**
|
| 72 |
|
| 73 |
|
| 74 |
<div align="center">
|
|
|
|
| 87 |
|
| 88 |
|
| 89 |
---
|
| 90 |
+
|
| 91 |
+
# General-Bench<a name="bench" />
|
| 92 |
+
|
| 93 |
+
**A companion massive multimodal benchmark dataset that encompasses a broader spectrum of skills, modalities, formats, and capabilities, including over 700 tasks and 325K instances.**
|
| 94 |
+
|
| 95 |
+
|
| 96 |
+
We provide two data domains:
|
| 97 |
+
- [**General-Bench-Openset**](https://huggingface.co/datasets/General-Level/General-Bench-Openset), with both the inputs and labels of all samples publicly available, for open-world use (e.g., academic experiments).
|
| 98 |
+
- [**General-Bench-Closeset**](https://huggingface.co/datasets/General-Level/General-Bench-Closeset), with only the sample inputs available; participants can use it to be ranked on our leaderboard.
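
As a rough illustration of how either set linked above might be fetched, here is a minimal sketch. It assumes the `huggingface_hub` package is installed and that the repositories can be downloaded as ordinary Hub dataset repos (they may require logging in or accepting terms). The helper names `repo_id_from_url` and `download_dataset` are hypothetical and not part of this project.

```python
from urllib.parse import urlparse

# Dataset pages exactly as linked in this README.
OPENSET_URL = "https://huggingface.co/datasets/General-Level/General-Bench-Openset"
CLOSESET_URL = "https://huggingface.co/datasets/General-Level/General-Bench-Closeset"


def repo_id_from_url(url: str) -> str:
    """Extract the '<owner>/<name>' repo id from a Hub dataset URL (hypothetical helper)."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hub dataset URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def download_dataset(url: str, local_dir: str) -> str:
    """Download a snapshot of the dataset repo and return the local path.

    Assumption: the repo is reachable with the current credentials;
    the file layout inside the snapshot is not specified here.
    """
    # Imported lazily so the URL helper works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id_from_url(url),
        repo_type="dataset",
        local_dir=local_dir,
    )


# Example usage (requires network access):
# path = download_dataset(OPENSET_URL, "./general-bench-openset")
```

Only the Openset ships labels, so local scoring would apply to that set alone; results on the Closeset are ranked through the leaderboard.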
|
| 99 |
+
|
| 100 |
|
| 101 |
<div align="center">
|
| 102 |
<img src='https://cdn-uploads.huggingface.co/production/uploads/647773a1168cb428e00e9a8f/d4TIWw3rlWuxpBCEpHYJB.jpeg'>
|