Spaces: Fysics-AI/FysicsWorld-LeaderBoard


FRENKIE-CHIANG committed 25 days ago

Commit 7cd47e9 (verified) · 1 parent: affe8ae

Upload app.py with huggingface_hub

Files changed (1)

app.py +29 -23


@@ -341,51 +341,57 @@ with gr.Blocks(
         outputs=[omni_table, image_table, video_table, audio_table],
     )
-    gr.Markdown(
         """
         <div class="overall-definition">
         <h3>📊 Overall Score Definition</h3>
         <p>
-        To facilitate clearer and more consistent comparison across models, we introduce an
-        <b>Overall</b> score for each leaderboard track. The aggregation strategy is tailored
-        to the evaluation protocol of each task category:
         </p>
         <p><b>1. OmniLLM / MLLM</b><br>
-        The <b>Overall</b> score is computed as the arithmetic mean of all reported task-specific scores.
         </p>
         <p><b>2. Image Generation</b><br>
-        The evaluation involves metrics defined on different numerical scales.
-        <b>WIScore</b> is used for image generation, while <b>VIEScore</b> (averaged over three dimensions)
-        is used for image editing.
         </p>
-        <p>
-        The <b>Overall</b> score is defined as:
         </p>
-        <p style="text-align:center; font-size:16px;">
-        \\[
-        \\text{Overall} = \\frac{(\\text{WIScore} \\times 10) + \\left(\\frac{\\sum \\text{VIEScore}}{3}\\right)}{2}
-        \\]
-        </p>
         <p>
-        This normalization-based formulation ensures a balanced contribution from both image generation
-        and image editing performance.
         </p>
         <p><b>3. Video Generation</b><br>
-        The <b>Overall</b> score is calculated as the arithmetic mean of all evaluated dimensions,
-        including imaging quality, aesthetics, motion, and temporal consistency.
         </p>
         </div>
-        """,
-        unsafe_allow_html=True,
     )
 demo.launch()

         outputs=[omni_table, image_table, video_table, audio_table],
     )
+    # ---------- Overall definition (bottom) ----------
+    gr.HTML(
         """
         <div class="overall-definition">
         <h3>📊 Overall Score Definition</h3>
         <p>
+            To facilitate clearer and more consistent comparison across models, we introduce an
+            <b>Overall</b> score for each leaderboard track. The aggregation strategy is tailored
+            to the evaluation protocol of each task category:
         </p>
         <p><b>1. OmniLLM / MLLM</b><br>
+            The <b>Overall</b> score is computed as the arithmetic mean of all reported task-specific scores.
         </p>
         <p><b>2. Image Generation</b><br>
+            The evaluation involves metrics defined on different numerical scales.
+            <b>WIScore</b> is used for image generation, while <b>VIEScore</b> (averaged over three dimensions)
+            is used for image editing.
         </p>
+        <p style="margin-bottom: 6px;">
+            The <b>Overall</b> score is defined as:
         </p>
+        </div>
+        """
+    )
+    gr.Markdown(
+        r"""
+        \[
+        \text{Overall}=\frac{(\text{WIScore}\times 10)+\left(\frac{\sum \text{VIEScore}}{3}\right)}{2}
+        \]
+        """
+    )
+    gr.HTML(
+        """
+        <div class="overall-definition" style="margin-top: -24px;">
         <p>
+            This normalization-based formulation ensures a balanced contribution from both image generation
+            and image editing performance.
         </p>
         <p><b>3. Video Generation</b><br>
+            The <b>Overall</b> score is calculated as the arithmetic mean of all evaluated dimensions,
+            including imaging quality, aesthetics, motion, and temporal consistency.
         </p>
         </div>
+        """
     )
 demo.launch()
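
The aggregation rules spelled out in the definition text of this diff can be sketched in a few lines of Python. This is a minimal illustration, not code from `app.py`: the function names `overall_mean` and `overall_image` are hypothetical, and the input values in the usage example are made up. The sketch assumes exactly what the text states: an arithmetic mean for the OmniLLM / MLLM and Video Generation tracks, and for Image Generation the average of `WIScore × 10` and the mean of three VIEScore dimensions.

```python
from statistics import fmean
from typing import Sequence


def overall_mean(scores: Sequence[float]) -> float:
    """Arithmetic mean of the reported scores.

    Matches the rule stated for the OmniLLM / MLLM and
    Video Generation tracks. (Hypothetical helper name.)
    """
    if not scores:
        raise ValueError("at least one score is required")
    return fmean(scores)


def overall_image(wiscore: float, viescores: Sequence[float]) -> float:
    """Overall = ((WIScore * 10) + (sum(VIEScore) / 3)) / 2.

    The factor of 10 comes from the formula in the diff; the text
    says VIEScore is averaged over three dimensions, so exactly
    three values are required here. (Hypothetical helper name.)
    """
    if len(viescores) != 3:
        raise ValueError("VIEScore is averaged over exactly three dimensions")
    return (wiscore * 10 + sum(viescores) / 3) / 2


if __name__ == "__main__":
    # Illustrative inputs only, not leaderboard results.
    print(overall_mean([1.0, 2.0, 3.0]))
    print(overall_image(0.5, [6.0, 7.0, 8.0]))
```

Keeping the image rule in its own function makes the rescaling step explicit, so a change to the WIScore multiplier only has to be made in one place.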