---
title: README
emoji: π
colorFrom: blue
colorTo: blue
sdk: static
pinned: false
---
<div align="center">
<img src='https://cdn-uploads.huggingface.co/production/uploads/647773a1168cb428e00e9a8f/N8lP93rB6lL3iqzML4SKZ.png' width=100px>
<h1 align="center"><b>On Path to Multimodal Generalist: Levels and Benchmarks</b></h1>
<p align="center">
<a href="https://generalist.top/">[Project]</a>
<a href="https://level.generalist.top">[Leaderboard]</a>
<a href="https://arxiv.org/abs/2510.10101">[Paper]</a>
<a href="https://huggingface.co/General-Level">[Dataset-HF]</a>
<a href="https://github.com/path2generalist/GeneralBench">[Dataset-Github]</a>
</p>
[License: MIT](https://opensource.org/license/mit)
---
</div>
<h1 align="center" style="color:#F27E7E"><em>
Does higher performance across tasks indicate a stronger MLLM, one that is closer to AGI?
<br>
NO! <b style="color:red">Synergy</b> does.
</em></h1>
This project introduces:
1. **General-Level**, a five-level evaluation framework that sets a new norm for assessing multimodal generalists (multimodal LLMs/agents). Its core is the use of Synergy as the evaluation criterion: capabilities are categorized by whether an MLLM preserves synergy across comprehension and generation, as well as across multimodal interactions.
2. **General-Bench**, a companion large-scale multimodal benchmark dataset that covers a broader spectrum of skills, modalities, formats, and capabilities, with over 700 tasks and 325K instances.