roc-hci committed on
Commit 3d3b8ba · verified · 1 Parent(s): 0b74f4a

Update about.py

Files changed (1)
  1. about.py +12 -9
about.py CHANGED
@@ -2,20 +2,22 @@ TITLE = "# Turing Bench"
 
 INTRODUCTION_TEXT = """
 ### Welcome to the Turing Bench Leaderboard.
+This is a benchmark evaluating a model's ability to recognize humans from AI in conversations.
+"""
 
-This is a benchmark evaluating a model's ability to
-recognize text that is produced by humans, compared to text that is produced by AI.
+LEADERBOARD_DETAIL_TEXT = """
+Submissions are ordered by accuracy (by default). Refresh to fetch the latest evaluated runs.
 """
+
 DESCRIPTION_TEXT = """
-## Dataset: Turing Test Judge Benchmark
-From a paired dialogue (Human-Human vs Human-AI), the task is to predict which dialogue is **human-human**
-
-### What this dataset is
-A collection of **paired dialogue examples**. Each example contains two full dialogue transcripts (**A** and **B**):
+The Turing Benchmark evaluates a model's ability to differentiate human-human conversations from human-AI conversations.
+
+### We provide the dataset
+A collection of **paired dialogues**. Each example contains two full dialogue transcripts (**A** and **B**):
 - One transcript is **human-human** (two humans talking).
 - The other transcript is **human-AI** (a human talking to an AI).
 
-### Task
+### Your model's task
 **Binary classification (A/B):** Given `dialogueA` and `dialogueB`, predict which one is the **human-human** dialogue.
 
 - **Input:** `dialogueA` (string), `dialogueB` (string)
@@ -25,7 +27,8 @@ DESCRIPTION_TEXT = """
 Each row (one example) includes:
 - `id` (int): pair ID number
 - `dialogueA` (string): transcript A
-- `dialogueB` (string): transcript B"""
+- `dialogueB` (string): transcript B
+"""
 
 CITATION_BUTTON_TEXT = """
 If you use this benchmark, please cite: