---
title: README
emoji: 🌍
colorFrom: blue
colorTo: blue
sdk: static
pinned: false
---


<div align="center">
<img src='https://cdn-uploads.huggingface.co/production/uploads/647773a1168cb428e00e9a8f/N8lP93rB6lL3iqzML4SKZ.png'  width=100px>

<h1 align="center"><b>On Path to Multimodal Generalist: Levels and Benchmarks</b></h1>
<p align="center">
<a href="https://generalist.top/">[📖 Project]</a>
<a href="https://level.generalist.top">[🏆 Leaderboard]</a>
<a href="https://arxiv.org/abs/2510.10101">[📄 Paper]</a>
<a href="https://huggingface.co/General-Level">[🤗 Dataset-HF]</a>
<a href="https://github.com/path2generalist/GeneralBench">[📝 Dataset-Github]</a>
</p>

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/license/mit)

---
</div>



<h1 align="center" style="color:#F27E7E"><em>
Does higher performance across tasks indicate a stronger MLLM, one that is closer to AGI?
<br>
NO! <b style="color:red">Synergy</b> does.
</em></h1>


This project introduces:

1. **General-Level**, a 5-level evaluation system that sets a new norm for assessing multimodal generalists (multimodal LLMs/agents). Its core is the use of Synergy as the evaluative criterion: capabilities are categorized by whether MLLMs preserve synergy across comprehension and generation, as well as across multimodal interactions.

2. **General-Bench**, a companion massive multimodal benchmark dataset that encompasses a broad spectrum of skills, modalities, formats, and capabilities, with over 700 tasks and 325K instances.