---
title: README
emoji: 📊
colorFrom: purple
colorTo: indigo
…: static
pinned: false
sdk: static
---

# Labeling Guideline

Thank you for participating in this labeling project!

You will review interview transcripts of **two different AI interrogators (A and B)** interacting with the same interviewee. Your task is to evaluate the questioning capabilities of these AI interrogators.

There are two main tasks to complete:

* **Comparison:** Determine which of the two interrogators (A or B) asks better questions.
* **Rating:** Evaluate the quality of each interrogator on a 5-point scale.
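
For anyone post-processing the collected labels, the two tasks above could be captured in a single record per transcript pair. The following is only a minimal sketch: the class name `Label`, the field names, and the option strings are hypothetical and are not taken from the labeling tool, and treating 1 as the lowest score and 5 as the highest is an assumption.

```python
from dataclasses import dataclass

# Hypothetical option names; the labeling tool's actual values may differ.
PREFERENCES = {"A", "B", "both_good", "both_poor"}


@dataclass(frozen=True)
class Label:
    preference: str  # comparison task: which interrogator asked better questions
    rating_a: int    # rating task: 5-point score for interrogator A (assumed 1 = lowest)
    rating_b: int    # rating task: 5-point score for interrogator B (assumed 1 = lowest)

    def __post_init__(self) -> None:
        # Reject anything outside the answer options and the 5-point scale.
        if self.preference not in PREFERENCES:
            raise ValueError(f"unknown preference: {self.preference!r}")
        for name in ("rating_a", "rating_b"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")


# Example: A asked better questions overall; each interrogator is still rated separately.
label = Label(preference="A", rating_a=4, rating_b=2)
```

Keeping the comparison and the two ratings in one record makes it easy to spot inconsistent labels later, such as a preference for A paired with a higher rating for B.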

# Evaluation Criteria

The purpose of this interview is to check that the interviewee gives consistent accounts of their own personal information, memories, and experiences, and that these accounts do not contradict the external world.

Consequently, effective questioning should focus on extracting highly detailed and verifiable information from the interviewee.

### Criteria for Good Questions

* **Depth & Persistence:** Did the interrogator ask follow-up questions until the topic was sufficiently detailed?
  * If the interrogator needs to ask again because they didn't get a clear answer, they should **paraphrase** the question.
  * *Exception:* If the interviewee repeatedly refuses to answer despite paraphrasing, the interrogator may move to a different topic.
* **Verifiability:** Did the questions focus on extracting verifiable information? (i.e., information that can reveal contradictions or be verified through external search).
  * Examples: Questions regarding dates, addresses, affiliation IDs, organization names, emails, or names of relevant parties like supervisors.
* **Personalization:** Are the questions tailored to the interviewee? (i.e., highly relevant to the interviewee's specific experiences and previous answers).
* **Cohesion:** Is there a high degree of interconnection between the questions?
* **Addressing Contradictions:** If a contradiction or point of doubt was found in previous dialogue, did the interrogator focus on questions related to that contradiction?

### Criteria for Poor Questions

Conversely, the following cases indicate poor questioning performance:

* **Premature Topic Shifts:** Moving to a completely different topic before the current subject has been sufficiently detailed.
* **Repetition without Paraphrasing:** Repeating the exact same question without changing the phrasing.
* **Abstract/Unverifiable Questions:** Asking abstract questions where it is difficult to judge contradictions or verify facts.
  * Examples: "What are your hobbies?", "What is the most important value in your life?"
* **External Knowledge Over Personal Experience:** Asking questions that require external knowledge rather than the interviewee's own information/experience.
  * Example: "I work at Google." → "What year was Google founded?"
  * *Exception:* Questions closely related to events/experiences the interviewee directly participated in are allowed. (e.g., "I am the founder of Google." → "What year was Google founded?") You must evaluate this based on the context of the previous dialogue.
* **Fact-Checking External Knowledge:** Using the main questioning phase to verify external facts rather than focusing on the interviewee.
  * Question format: "Would you confirm that ..." (e.g., "Would you confirm that the 'KAIST' you mentioned is the research-oriented science and engineering university in South Korea?")
  * A separate process handles external fact-checking, so the main questioning phase should contain only questions about the interviewee.
* **Low Correlation:** Questions that lack relevance to each other, making it difficult to identify mutual contradictions.
* **Ignoring Inconsistencies:** Moving to an unrelated question even though a contradiction was detected in the previous conversation.

# Important Notes

* When evaluating the interrogator, **do not judge the interviewee's answers.** Focus solely on the pattern and quality of the interrogator's questions.
* Consider the **overall questioning strategy** as a whole, rather than just looking at individual questions in isolation.

# Reference

* You may use the Chrome translation tool to view and evaluate the content in your language of choice.
* If you forget the labeling criteria or get confused, you can click the **GUIDELINES** button in the bottom-left corner of the screen to review them.
* If you're having trouble choosing between A and B, either pick the interrogator who asked more open-ended (free-response) questions, or select 'Both are equally good' or 'Both are equally poor.'

📧 minskim010203@gmail.com, imsujeong2190@gmail.com