Turing-test-web-en

Spark Chou committed on Jul 22, 2025

Commit

4a831fc

1 Parent(s): a90eaa6

new

Files changed (1)

app.py +9 -6

@@ -898,9 +898,11 @@ with gr.Blocks(theme=gr.themes.Soft(), css=".gradio-container {max-width: 960px
             go_to_pretest_btn = gr.Button("Got it, start the test", variant="primary")
     with pretest_page:
-        gr.Markdown("""## Pre-Test Instructions
-- For each question, you'll evaluate the **response** (not the initiator) across **5 dimensions**.
 - Under each dimension, score **every listed feature** from **0 to 5**:
 ### 🔢 Scoring Guide:
@@ -911,17 +913,18 @@ with gr.Blocks(theme=gr.themes.Soft(), css=".gradio-container {max-width: 960px
 - **4** – Somewhat human-like
 - **5** – Strongly human-like
-- After rating all dimensions, make a final judgment: is the **responder** a human or an AI?
 - You can freely switch between dimensions using the **Previous** and **Next** buttons.
 ---
 ### ⚠️ Important Notes:
-- Focus on whether the **responder's speech** sounds more **human-like or machine-like** for each feature — not just whether the feature is "present".
 > For example: correct pronunciation doesn't always mean "human", and mispronunciation doesn't mean "AI". Think in terms of human-likeness.
-- Even if you're confident early on about the responder's identity, still evaluate **each dimension independently**.
   Avoid just labeling all dimensions as "machine-like" or "human-like" without listening carefully.
 """)
         go_to_test_btn = gr.Button("Start the Test", variant="primary")

             go_to_pretest_btn = gr.Button("Got it, start the test", variant="primary")
     with pretest_page:
+        gr.Markdown("""## Test Instructions
+- Every dialogue includes 2 speakers and lasts around 1 minute.
+- **Initiator:** The one who talks the first in the dialogue.
+- **Respondent:** The other one.
+- For each question, you'll evaluate the **respondent** (not the initiator) across **5 dimensions**.
 - Under each dimension, score **every listed feature** from **0 to 5**:
 ### 🔢 Scoring Guide:
 - **4** – Somewhat human-like
 - **5** – Strongly human-like
+- After rating all dimensions, make a final judgment: is the **respondent** a human or an AI?
 - You can freely switch between dimensions using the **Previous** and **Next** buttons.
 ---
 ### ⚠️ Important Notes:
+- Once you start the test, try not to refresh the page or quit it. You need to grade 5 recordings every test.
+- Focus on whether the **respondent's speech** sounds more **human-like or machine-like** for each feature — not just whether the feature is "present".
 > For example: correct pronunciation doesn't always mean "human", and mispronunciation doesn't mean "AI". Think in terms of human-likeness.
+- Even if you're confident early on about the respondent's identity, still evaluate **each dimension independently**.
   Avoid just labeling all dimensions as "machine-like" or "human-like" without listening carefully.
 """)
         go_to_test_btn = gr.Button("Start the Test", variant="primary")