tnwjddla2190 committed on
Commit df6357d · verified · 1 Parent(s): 8eec9b4

Update README.md

Files changed (1)
  1. README.md +58 -46
README.md CHANGED
@@ -12,91 +12,103 @@ sdk: static
  Thank you for participating in the labeling!

- You will review the interview transcripts of **two different AI interviewers (A, B)** for the same interviewee and evaluate the AI interviewers' questioning ability.

  You have two tasks, as follows.

- - Of the two interviewers (A, B), judge which one asks better questions.
- - Rate each interviewer's quality on a 5-point scale.
 
- # Evaluation Guideline
  The criteria for evaluating questioning ability are as follows. The interviewer who satisfies all five items below is the one who questioned best.

- ### Very Good Criteria
- - Did the interviewer keep asking about one topic until the answers became sufficiently specific?
- - Did the interviewer mainly ask questions that draw out verifiable information? (= questions that allow contradictions to be judged or that can be verified through external search)

  (e.g., questions about "dates, addresses, affiliation IDs, organization names, emails, names of people involved such as a supervisor at the interviewee's company")
- - Are the questions specific to the interviewee? (i.e., closely tied to the interviewee's concrete experiences and answers)
- - Are the questions closely related to one another?
- - If contradictions or doubtful points were found earlier in the conversation, did the interviewer ask many questions about them?
 
- ### Very Poor Cases
  Conversely, cases of poor questioning are as follows.
- - Jumping straight to a completely different topic before the current topic has been made sufficiently specific
- - Asking abstract questions that make it hard to check for contradictions or verify facts

  (e.g., "What's your hobby?", "What's the most important value in your life?")
- - Asking questions that have little to do with the interviewee's own information and experience and must be answered using external knowledge

  (e.g., "I work at Google." → "What year was Google founded?")
- - Exception) Questions closely tied to an event/incident/experience the interviewee took part in directly are allowed, as in "I'm the founder of Google." → "What year was Google founded?". The evaluation must therefore take the earlier questions and answers into account.
- - The questions are so loosely related that contradictions between answers are hard to judge
- - Moving on to an unrelated question even though a contradiction was found earlier in the conversation
 
- # Caution
- - When evaluating an interviewer, assess only the interviewer's questioning ability and disregard the interviewee's answers. Focus on the questions, not the answers.
- - Consider the overall questioning strategy, not just individual questions.

  📧 minskim010203@gmail.com, imsujeong2190@gmail.com

  ---
 
  # Labeling Guideline
  Thank you for participating in this labeling project!

- In this task, you will review interview transcripts of two different AI interviewers (A and B) interacting with the same interviewee. Your goal is to evaluate the questioning capabilities of these AI interviewers.

- There are two main tasks you need to perform:

- Compare: Determine which of the two interviewers (A or B) asks better questions.

- Rate: Evaluate the quality of each interviewer on a 5-point scale.
 
- ## Evaluation Guideline
- The criteria for evaluating questioning ability are as follows. An interviewer who satisfies all five of the following items is considered to have performed excellently.

- ### "Very Good" Criteria
- - Depth: Did the interviewer continue questioning a single topic until the response became sufficiently specific?
- - Verifiability: Did the interviewer focus on questions that extract verifiable information? (i.e., information that can be checked for contradictions or verified through external search).
- - Examples: Questions regarding dates, addresses, affiliation IDs, organization names, emails, names of supervisors/colleagues, etc.
- - Personalization: Are the questions tailored specifically to the interviewee? (i.e., highly relevant to the interviewee's specific experiences and previous answers).
- - Coherence: Is there a high degree of interconnection between the questions?
- - Contradiction Handling: If contradictions or questionable points arose in previous dialogue, did the interviewer follow up with questions related to those contradictions?
 
- ### "Very Poor" Cases
- Conversely, the following cases indicate poor questioning:

- - Premature Topic Switching: Moving to a completely different topic before a current topic has been sufficiently detailed.

- - Abstract Questions: Asking abstract questions that make it difficult to verify facts or detect contradictions.

- - Examples: "What is your hobby?", "What is the most important value in your life?"

- - External Knowledge Dependency: Asking questions that rely on general external knowledge rather than the interviewee's own information and experiences.

- - Example: Interviewee says "I work at Google" → Interviewer asks "When was Google founded?"

- - Exception: If the question is closely linked to an event/experience the interviewee was directly involved in, it is acceptable. (e.g., Interviewee says "I am the founder of Google" → "When was Google founded?"). Evaluation must consider the context of previous questions and answers.

- - Low Relatedness: Questions that are so unrelated that it is impossible to judge mutual contradictions.

- - Ignoring Inconsistencies: Moving on to an unrelated question even though a contradiction was detected in the previous dialogue.
 
- # Caution
- When evaluating the interviewer, focus only on the interviewer's questioning ability. Do not let the quality of the interviewee's answers influence your score.

- Consider the overall questioning strategy throughout the transcript, not just individual questions in isolation.

  📧 minskim010203@gmail.com, imsujeong2190@gmail.com
 
  Thank you for participating in the labeling!

+ You will review the interview transcripts of **two different AI interviewers, A and B**, for the same interviewee and evaluate the AI interviewers' questioning ability.

  You have two tasks, as follows.

+ * Of the two interviewers (A, B), judge which one asks better questions.
+ * Rate each interviewer's quality on a 5-point scale.
 
+ # Evaluation Criteria
  The criteria for evaluating questioning ability are as follows. The interviewer who satisfies all five items below is the one who questioned best.

+ ### Criteria for Good Questions
+ * Did the interviewer keep asking about one topic until the answers became sufficiently specific?
+ * Did the interviewer mainly ask questions that draw out verifiable information? (= questions that allow contradictions to be judged or that can be verified through external search)

  (e.g., questions about "dates, addresses, affiliation IDs, organization names, emails, names of people involved such as a supervisor at the interviewee's company")
+ * Are the questions specific to the interviewee? (i.e., closely tied to the interviewee's concrete experiences and answers)
+ * Are the questions closely related to one another?
+ * If contradictions or doubtful points were found earlier in the conversation, did the interviewer ask many questions about them?
 
+ ### Cases of Poor Questioning
  Conversely, cases of poor questioning are as follows.
+ * Jumping straight to a completely different topic before the current topic has been made sufficiently specific
+ * Asking abstract questions that make it hard to check for contradictions or verify facts

  (e.g., "What's your hobby?", "What's the most important value in your life?")
+ * Asking questions that have little to do with the interviewee's own information and experience and must be answered using external knowledge

  (e.g., "I work at Google." → "What year was Google founded?")
+ * Exception) Questions closely tied to an event/incident/experience the interviewee took part in directly are allowed, as in "I'm the founder of Google." → "What year was Google founded?". The evaluation must therefore take the earlier questions and answers into account.
+ * The questions are so loosely related that contradictions between answers are hard to judge
+ * Moving on to an unrelated question even though a contradiction was found earlier in the conversation
 
+ # Cautions
+ * When evaluating an interviewer, assess only the interviewer's questioning ability and disregard the interviewee's answers. Focus on the pattern and quality of the questions, not the answers.
+ * Consider the overall questioning strategy, not just individual questions.
+
+ # P.S.
+ * You may also use Chrome's translation feature to translate the page into Korean before evaluating!

  📧 minskim010203@gmail.com, imsujeong2190@gmail.com

  ---
 
  # Labeling Guideline
+
  Thank you for participating in this labeling project!

+ You will be reviewing interview transcripts of **two different AI interviewers (A and B)** conducting sessions with the same interviewee. Your task is to evaluate the questioning capabilities of these AI interviewers.
+
+ There are two main tasks to complete:
+
+ * **Comparison:** Determine which of the two interviewers (A or B) demonstrates superior questioning skills.
+ * **Rating:** Rate the quality of each interviewer on a 5-point scale.
 
+ ---
+
+ # Evaluation Criteria

+ The quality of an interviewer is judged by the following criteria. An ideal interviewer satisfies all five of the points listed below.

+ ### Criteria for Good Questions

+ * **Depth:** Does the interviewer continue questioning a single topic until the responses are sufficiently detailed and specific?
+ * **Verifiability:** Do the questions focus on extracting verifiable information? (i.e., information that can reveal contradictions or be verified via external search).
+ * *e.g., Questions regarding dates, addresses, affiliation IDs, organization names, emails, or names of supervisors/colleagues.*


+ * **Personalization:** Are the questions tailored to the interviewee? (i.e., highly relevant to the interviewee’s specific experiences and previous answers).
+ * **Cohesion:** Is there a high degree of logical interconnection between the questions?
+ * **Critical Follow-up:** If contradictions or questionable points arose in previous dialogue, did the interviewer ask follow-up questions specifically addressing those inconsistencies?
 
+ ### Indicators of Poor Questioning

+ An interviewer is considered less effective if they exhibit the following:

+ * **Abrupt Topic Switching:** Moving to a completely different topic before the current subject has been sufficiently explored.
+ * **Abstract Questions:** Asking questions that make it difficult to verify facts or detect contradictions.
+ * *e.g., "What are your hobbies?", "What is the most important value in your life?"*


+ * **External Knowledge Dependency:** Asking questions that rely on general external knowledge rather than the interviewee's personal information or experiences.
+ * *e.g., "I work at Google." → "In what year was Google founded?"*
+ * **Exception:** If the question relates to an event the interviewee was directly involved in, it is acceptable. (e.g., "I am the founder of Google." → "In what year was Google founded?") Please evaluate by considering the context of the previous dialogue.


+ * **Low Relevancy:** Questions lack connection to one another, making it difficult to judge internal consistency or logic.
+ * **Ignoring Contradictions:** Moving on to unrelated questions even after a contradiction was clearly detected in the previous dialogue.
+
+ ---
 
+ # Important Notes

+ * **Evaluate the Interviewer Only:** When evaluating, do not judge the interviewee’s answers. Focus solely on the pattern and quality of the **interviewer’s questions**.
+ * **Strategy over Items:** Consider the overall questioning strategy and flow, not just individual questions in isolation.

+ # P.S.

+ * You may use the Chrome translation feature to translate the text into Korean while performing your evaluation!

  📧 minskim010203@gmail.com, imsujeong2190@gmail.com