Update README.md

README.md CHANGED

@@ -7,7 +7,16 @@ sdk: static
 pinned: false
 ---
 
-#
+# MJ-Bench Team: Align
+
+
+## 😎 [**MJ-Video**: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation](https://aiming-lab.github.io/MJ-VIDEO.github.io/)
+
+We release MJ-Bench-Video, a comprehensive fine-grained video preference benchmark, and MJ-Video, a powerful MoE-based multi-dimensional video reward model!
+
+
+
+## 👩‍⚖️ [**MJ-Bench**: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?](https://mj-bench.github.io/)
 
 Project page: https://mj-bench.github.io/
 Code repository: https://github.com/MJ-Bench/MJ-Bench
@@ -16,7 +25,7 @@ While text-to-image models like DALLE-3 and Stable Diffusion are rapidly prolife
 
 To address this issue, we introduce MJ-Bench, a novel benchmark which incorporates a comprehensive preference dataset to evaluate multimodal judges in providing feedback for image generation models across four key perspectives: **alignment**, **safety**, **image quality**, and **bias**.
 
-
+<!--  -->
 
 Specifically, we evaluate a large variety of multimodal judges including
 
@@ -24,7 +33,7 @@ Specifically, we evaluate a large variety of multimodal judges including
 - 11 open-source VLMs (e.g. LLaVA family)
 - 4 closed-source VLMs (e.g. GPT-4o, Claude 3)
 -
-
+<!--  -->
 
 
 🔥🔥We are actively updating the [leaderboard](https://mj-bench.github.io/) and you are welcome to submit the evaluation result of your multimodal judge on [our dataset](https://huggingface.co/datasets/MJ-Bench/MJ-Bench) to the [huggingface leaderboard](https://huggingface.co/spaces/MJ-Bench/MJ-Bench-Leaderboard).
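
For readers who want to evaluate a judge on the preference data referenced in the last hunk, below is a minimal, unofficial sketch of pulling it with the Hugging Face `datasets` library. Only the repo id `MJ-Bench/MJ-Bench` comes from the diff above; the config layout is not stated there, so the sketch discovers it at runtime rather than assuming names.

```python
# Unofficial sketch: load the MJ-Bench preference dataset mentioned in the
# README diff above. Only the repo id is taken from the document; the config
# layout is discovered at runtime rather than hard-coded.
from datasets import get_dataset_config_names, load_dataset

repo_id = "MJ-Bench/MJ-Bench"

# List available configs (the README suggests four perspectives: alignment,
# safety, image quality, and bias, but we do not assume those exact names).
configs = get_dataset_config_names(repo_id)
print("available configs:", configs)

# Load the first config and peek at one preference example.
ds = load_dataset(repo_id, configs[0])
split = next(iter(ds.values()))
print(split[0])
```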