ha251 committed
Commit 6c674b5 · verified · 1 Parent(s): 82fe3ca

Update miniapp_leaderboard.py

Files changed (1)
  1. miniapp_leaderboard.py +24 -9
miniapp_leaderboard.py CHANGED
@@ -162,13 +162,21 @@ def submit(model_name, model_family, zip_file, profile: gr.OAuthProfile):
 with gr.Blocks(title=f"{APP_NAME} leaderboard") as demo:
     gr.Markdown(f"# {APP_NAME} Leaderboard")
     gr.Markdown("""
-    ### Data
+    ## Data

     MiniAppBench is the first comprehensive benchmark designed to evaluate principle-driven, interactive application generation. Unlike prior benchmarks that emphasize static UI layouts or isolated algorithmic code snippets, MiniAppBench targets **MiniApps**—HTML-based applications that require both faithful visual rendering and non-trivial interaction logic.

     The dataset is split into two subsets: **validation (100 instances)** and **test (400 instances)**, and can be accessed at **[MiniAppBench dataset](https://huggingface.co/datasets/MiniAppBench/Dataset)**. The **validation** set includes publicly available **evaluation references** to support reproducible experiments, while the **test** set keeps the references hidden to enable unbiased evaluation.
     """)

+    gr.Markdown(
+        """
+        ## Leaderboard
+
+        All results shown on this leaderboard are evaluated on the **test split** of MiniAppBench.
+        """,
+    )
+
     leaderboard = gr.Dataframe(
         value=pd.DataFrame(columns=COLUMNS),  # do not access the Hub at startup
         interactive=False,
@@ -185,16 +193,23 @@ with gr.Blocks(title=f"{APP_NAME} leaderboard") as demo:

     gr.Markdown(
         """
-        **Submission requirements**
-        - Please **sign in with Hugging Face** before submitting.
-        - **One submission per user per day** (UTC).
-        - Upload a **.zip** file only.
-        - The `.zip` must contain the HTML outputs for the **test set queries**.
-        - Each file should be named using the query index: `<index>.html` (e.g., `1.html`, `2.html`, ...).
-        - After you submit, we will update the result in 3 days.
-        """,
+        **Submission requirements**
+        - Please **sign in with Hugging Face** before submitting.
+        - **One submission per user per day (UTC)**.
+        - Upload a **.zip** file only.
+        - The `.zip` must contain the HTML outputs for the **test set queries**.
+        - Each file should be named using the query index: `<index>.html` (e.g., `1.html`, `2.html`, ...).
+        - We may contact you via email for verification and request additional materials. Please be prepared to provide:
+          - **Model access** (one of the following):
+            - Preferred: an **inference API endpoint** we can use to reproduce the results.
+            - Alternatively: **model checkpoints (ckpts)** plus clear **deployment / inference instructions** (environment, dependencies, and how to run).
+          - **A related paper**, if available (e.g., an **arXiv link** or a PDF).
+        - After you submit, we will update the results within **3 days**.
+        """,
     )

+
+
     model_name = gr.Textbox(label="Model name", placeholder="e.g. MyModel v1")
     model_family = gr.Textbox(label="Model family", placeholder="e.g. Llama / Qwen / InternLM ...")
     zip_file = gr.File(label="Upload zip (.zip only)", file_types=[".zip"])
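The submission requirements in this diff describe the archive layout only in prose, so a submitter may want to check a `.zip` locally before uploading. The following is a minimal sketch of such a check and is not the leaderboard's own validation code. The names `check_submission` and `build_zip` are hypothetical, and it assumes that query indices run from 1 to 400 (the stated test set size) and that the `<index>.html` files sit at the archive root; neither assumption is confirmed by the commit.

```python
import io
import re
import zipfile

# Assumption: valid entries are "<positive integer>.html" at the archive root.
HTML_NAME = re.compile(r"^(\d+)\.html$")


def check_submission(zip_bytes: bytes, expected_count: int = 400) -> list[str]:
    """Return a list of problems; an empty list means the archive looks fine.

    Hypothetical helper. expected_count=400 assumes indices 1..400,
    matching the test set size quoted in the leaderboard text.
    """
    problems = []
    try:
        archive = zipfile.ZipFile(io.BytesIO(zip_bytes))
    except zipfile.BadZipFile:
        return ["not a valid .zip archive"]

    seen = set()
    with archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            match = HTML_NAME.match(name)
            if match is None:
                problems.append(f"unexpected entry: {name}")
                continue
            seen.add(int(match.group(1)))

    # Report indices in 1..expected_count that have no HTML file.
    missing = sorted(set(range(1, expected_count + 1)) - seen)
    if missing:
        problems.append(f"missing {len(missing)} file(s), first: {missing[0]}.html")
    return problems


def build_zip(names):
    """Build an in-memory zip with placeholder HTML, for trying the check out."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, "<html></html>")
    return buffer.getvalue()


if __name__ == "__main__":
    print(check_submission(build_zip(["1.html", "notes.txt"]), expected_count=2))
```

Returning a list of problems instead of raising on the first one lets a submitter fix everything in a single pass, which matters when only one submission per day is allowed.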