---
license: apache-2.0
---

# EvalAlign Model Card

EvalAlign is an evaluation metric for text-to-image generative models, characterized by its accuracy, stability, and fine granularity.

## Model Details

### Model Description

The recent advancements in text-to-image generative models have been remarkable. Yet, the field suffers from a lack of evaluation metrics that accurately reflect the performance of these models, and in particular lacks fine-grained metrics that can guide their optimization. In this paper, we propose EvalAlign, a metric characterized by its accuracy, stability, and fine granularity. Our approach leverages the capabilities of Multimodal Large Language Models (MLLMs) pretrained on extensive datasets. We develop evaluation protocols that focus on two key dimensions: image faithfulness and text-image alignment. Each protocol comprises a set of detailed, fine-grained instructions linked to specific scoring options, enabling precise manual scoring of the generated images. We supervised fine-tune (SFT) the MLLM to align closely with human evaluative judgments, resulting in a robust evaluation model. Our comprehensive tests across 24 text-to-image generation models demonstrate that EvalAlign not only provides superior metric stability but also aligns more closely with human preferences than existing metrics, confirming its effectiveness and utility in model assessment.
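
To make the protocol structure concrete, here is a minimal sketch of how one fine-grained faithfulness instruction with its linked scoring options might be represented. The field names, question wording, and score scale are illustrative assumptions, not the exact schema from the paper or repository.

```python
# Hypothetical example of a fine-grained evaluation question with linked
# scoring options, in the spirit of the EvalAlign protocols. This schema is
# an assumption for illustration; see the paper and repo for the actual
# instruction set.
faithfulness_question = {
    "dimension": "image_faithfulness",  # the other dimension is text-image alignment
    "instruction": "Are human limbs in the image anatomically correct?",
    "options": {
        0: "Severe distortions (extra, missing, or malformed limbs)",
        1: "Minor distortions; overall anatomy still recognizable",
        2: "No visible anatomical distortions",
    },
}
```

In the EvalAlign setup, human annotators first score generated images against such instructions, and the MLLM is then supervised fine-tuned to reproduce those judgments.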

- **Model Release Date:** June 2024

### Model Sources

- **Repository:** https://github.com/SAIS-FUXI/EvalAlign

## How to Get Started with the Model

Refer to our GitHub repository: https://github.com/SAIS-FUXI/EvalAlign
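
For a feel of how scoring might look once the model is set up, here is a minimal usage sketch. It assumes the released checkpoint exposes a standard Hugging Face vision-language interface; the model path, the `AutoModelForVision2Seq` class choice, and the prompt wording are all assumptions for illustration, so consult the repository above for the official inference scripts.

```python
# Minimal sketch, NOT the official inference pipeline: the checkpoint path,
# model class, and prompt below are placeholders chosen for illustration.
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_path = "path/to/evalalign-checkpoint"  # placeholder; see the repo for real weights
processor = AutoProcessor.from_pretrained(model_path)
model = AutoModelForVision2Seq.from_pretrained(model_path)

# Score an image produced by a text-to-image model.
image = Image.open("generated_sample.png")
prompt = "Rate the image faithfulness of this image using the given scoring options."

inputs = processor(images=image, text=prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```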