Commit c79d904 · 赵福来 committed
Parent(s): aa6ecb5
Update the repo name and relevant name from EmbodiedVerse to FlagEval-Robo

Files changed:
- src/about.py +14 -14
- src/envs.py +1 -1
src/about.py
CHANGED
@@ -49,32 +49,32 @@ NUM_FEWSHOT = 0 # Change with your few shot
 
 
 # Your leaderboard name
-TITLE = """<h1 align="center" id="space-title">FlagEval-
+TITLE = """<h1 align="center" id="space-title">FlagEval-Robo</h1>"""
 
 # What does your leaderboard evaluate?
 
 INTRODUCTION_TEXT = """
-欢迎使用FlagEval-
-FlagEval-
+欢迎使用FlagEval-Robo!
+FlagEval-Robo 旨在通过FlagEval具身工具链跟踪、排名和评估具身大模型(Embodied model),其中FlagEvalMM提供了多模态评估架构,FlagEval-Robo构建了一种基于具身智能高质量评测数据集的能力体系,Leaderboard则通过榜单实时跟踪并呈现不同具身大模型综合能力。
 
-Welcome to the FlagEval-
-FlagEval-
-FlagEvalMM provides a multimodal evaluation framework, while
+Welcome to FlagEval-Robo!
+FlagEval-Robo aims to track, rank, and evaluate embodied models through the FlagEval embodied toolchain.
+FlagEvalMM provides a multimodal evaluation framework, while FlagEval-Robo builds a capability system on top of high-quality embodied-intelligence evaluation datasets. The Leaderboard tracks and presents the comprehensive capabilities of different embodied large models in real time.
 """
 # Which evaluations are you running? how can people reproduce what you have?
 LLM_BENCHMARKS_TEXT = f"""
 
-# The Goal of FlagEval
+# The Goal of FlagEval-Robo
 
-感谢您积极的参与评测,在未来,我们会持续推动 FlagEval
+感谢您积极的参与评测,在未来,我们会持续推动 FlagEval-Robo 更加完善,维护生态开放,欢迎开发者参与评测方法、工具和数据集的探讨,让我们一起建设更加科学、开放的具身评测工具链。
 
-Thanks for your active
+Thanks for your active participation in the evaluation. Going forward, we will keep improving FlagEval-Robo and maintaining an open ecosystem, and we welcome developers to join the discussion of evaluation methodology, tools, and datasets so that together we can build a more scientific and open embodied evaluation toolchain.
 
 # Context
 
-FlagEval-
+FlagEval-Robo是科学、全面的具身评测工具链,具体包括FlagEvalMM多模态评估框架、EmbodiedVerse具身智能高质量评测数据集以及Leaderboard具身模型能力可视化榜单。我们希望能够推动更加开放的生态,让具身智能大模型开发者参与进来,为推动具身智能大模型进步做出相应的贡献。为了实现公平性的目标,所有模型都在 FlagEvalMM框架下使用标准化 GPU 和统一环境进行评估,以确保公平性。
 
-FlagEval-
+FlagEval-Robo is a scientific and comprehensive embodied evaluation toolchain, comprising the FlagEvalMM multimodal evaluation framework, the EmbodiedVerse high-quality embodied-intelligence evaluation datasets, and the Leaderboard, which visualizes the capabilities of embodied models.
 
 We hope to promote a more open ecosystem in which embodied-model developers can participate and contribute to the advancement of embodied models. To ensure fairness, all models are evaluated under the FlagEvalMM framework using standardized GPUs and a unified environment.
 
@@ -268,7 +268,7 @@ TABLE_TEXT = """
 """
 
 LLM_BENCHMARKS_TEXT2 = """
-##
+## FlagEval-Robo - FlagEvalMM
 
 FlagEvalMM是一个开源评估框架,旨在全面评估多模态模型,其提供了一种标准化的方法来评估跨各种任务和指标使用多种模式(文本、图像、视频)的模型。
 
@@ -296,9 +296,9 @@ FlagEvalMM is an open-source evaluation framework designed to comprehensively as
 
 You can find:
 
-- detailed numerical results in the results Hugging Face dataset:
+- detailed numerical results in the results Hugging Face dataset: https://huggingface.co/datasets/open-cn-llm-leaderboard/FlagEval-Robo_results
 
-- community queries and running status in the requests Hugging Face dataset:
+- community queries and running status in the requests Hugging Face dataset: https://huggingface.co/datasets/open-cn-llm-leaderboard/FlagEval-Robo_request
 
 
 # Useful links
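The two dataset links added in the last hunk follow the standard Hugging Face dataset URL scheme (`https://huggingface.co/datasets/<owner>/<name>`). As a quick sanity check, a hypothetical helper (`dataset_url` is not part of this repo) can rebuild both URLs from the owner and repo names:

```python
# Hypothetical helper, not part of the repo: rebuild the dataset URLs
# referenced in LLM_BENCHMARKS_TEXT from their owner and repo names.
BASE = "https://huggingface.co/datasets"
OWNER = "open-cn-llm-leaderboard"

def dataset_url(name: str) -> str:
    """Return the public URL of a Hugging Face dataset repo."""
    return f"{BASE}/{OWNER}/{name}"

# Repo names exactly as written in the commit: note the results repo is
# plural ("_results") while the requests repo is singular ("_request").
results_url = dataset_url("FlagEval-Robo_results")
requests_url = dataset_url("FlagEval-Robo_request")
```
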
src/envs.py
CHANGED
@@ -13,7 +13,7 @@ TOKEN = os.environ.get("HF_TOKEN") # A read/write token for your org
 #RESULTS_REPO = f"{OWNER}/results"
 #DYNAMIC_INFO_REPO = f"{OWNER}/dynamic_model_information"
 
-REPO_ID = "BAAI/
+REPO_ID = "BAAI/FlagEval-Robo"
 QUEUE_REPO = "open-cn-llm-leaderboard/EmbodiedVerse_requests"
 DYNAMIC_INFO_REPO = "open-cn-llm-leaderboard/EmbodiedVerse_dynamic_model_information"
 RESULTS_REPO = "open-cn-llm-leaderboard/EmbodiedVerse_results"
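Note that this hunk only renames `REPO_ID`; the queue, dynamic-info, and results repos still carry the old EmbodiedVerse names after the commit. A standalone sketch of the module's resulting state (values copied from the diff, not imported from the actual `src/envs.py`) makes that visible:

```python
import os

# Standalone sketch of src/envs.py after this commit; constants copied from the diff.
TOKEN = os.environ.get("HF_TOKEN")  # None when the env var is unset

REPO_ID = "BAAI/FlagEval-Robo"  # the only constant renamed in this commit
# Still pointing at the old EmbodiedVerse repos:
QUEUE_REPO = "open-cn-llm-leaderboard/EmbodiedVerse_requests"
DYNAMIC_INFO_REPO = "open-cn-llm-leaderboard/EmbodiedVerse_dynamic_model_information"
RESULTS_REPO = "open-cn-llm-leaderboard/EmbodiedVerse_results"

# Only the Space's own repo id mentions the new name.
renamed = [r for r in (REPO_ID, QUEUE_REPO, DYNAMIC_INFO_REPO, RESULTS_REPO)
           if "FlagEval-Robo" in r]
```
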