Spaces: Running on CPU Upgrade
kennymckormick committed · 4e15c72
1 Parent(s): 2347f30
update README
Files changed: lb_info.py (+4 -3)
lb_info.py CHANGED
@@ -25,8 +25,9 @@ CITATION_BUTTON_TEXT = r"""@misc{2023opencompass,
 CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
 # CONSTANTS-TEXT
 LEADERBORAD_INTRODUCTION = """# OpenVLM Leaderboard
-
-
+## Welcome to the OpenVLM Leaderboard! On this leaderboard we share the evaluation results of VLMs obtained by the OpenSource Framework:
+## [*VLMEvalKit*: A Toolkit for Evaluating Large Vision-Language Models](https://github.com/open-compass/VLMEvalKit)
+## Currently, OpenVLM Leaderboard covers {} different VLMs (including GPT-4v, Gemini, QwenVLPlus, LLaVA, etc.) and {} different multi-modal benchmarks.
 
 This leaderboard was last updated: {}.
 """
@@ -131,7 +132,7 @@ LEADERBOARD_MD['COCO_VAL'] = """
 """
 
 LEADERBOARD_MD['ScienceQA_VAL'] = """
-
+## ScienceQA Evaluation Results
 
 - We benchmark the **image** subset of ScienceQA validation and test set, and report the Top-1 accuracy.
 - During evaluation, we use `GPT-3.5-Turbo-0613` as the choice extractor for all VLMs if the choice can not be extracted via heuristic matching. **Zero-shot** inference is adopted.
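The `{}` placeholders added to `LEADERBORAD_INTRODUCTION` in this commit are presumably filled in when the leaderboard page is rendered. A minimal sketch of that substitution via `str.format` — the variable names and example values below are hypothetical illustrations, not taken from `lb_info.py`:

```python
# Sketch of filling the {} placeholders in LEADERBORAD_INTRODUCTION.
# All names and values below are hypothetical, not from the actual commit.
LEADERBORAD_INTRODUCTION = """# OpenVLM Leaderboard
## Currently, OpenVLM Leaderboard covers {} different VLMs (including GPT-4v, Gemini, QwenVLPlus, LLaVA, etc.) and {} different multi-modal benchmarks.

This leaderboard was last updated: {}.
"""

num_models = 20            # hypothetical VLM count
num_benchmarks = 13        # hypothetical benchmark count
timestamp = "2024-01-01"   # hypothetical build timestamp

# Positional {} slots are filled left to right.
intro = LEADERBORAD_INTRODUCTION.format(num_models, num_benchmarks, timestamp)
print(intro)
```

Because the template uses bare positional `{}` slots, the argument order to `format` must match the order the placeholders appear in the string.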