Update README.md
README.md CHANGED
@@ -12,91 +12,103 @@ sdk: static
 Thank you for participating in the labeling!

-You, for the same interviewee, **two different AI interviewers

 You have two tasks, as follows.

-
-

-#
 The criteria for evaluating questioning ability are as follows. An interviewer who satisfies all five items below is the best questioner.

-###
-
-

 (e.g., questions about "dates, addresses, affiliation IDs, organization names, emails, names of related people such as a supervisor at a former company")
-
-
-

-###
 Conversely, the cases of poor questioning are as follows.
-
-

 (e.g., "What is your hobby?", "What is the most important value in your life?")
-

 (e.g., "I work at Google." → "When was Google founded?")
-
-
-

-#
-
-

 📧 minskim010203@gmail.com, imsujeong2190@gmail.com

 ---

 # Labeling Guideline
 Thank you for participating in this labeling project!

-

-

-

-

-
-

-### "Very Good" Criteria
-- Depth: Did the interviewer continue questioning a single topic until the response became sufficiently specific?
-- Verifiability: Did the interviewer focus on questions that extract verifiable information? (i.e., information that can be checked for contradictions or verified through external search).
-- Examples: Questions regarding dates, addresses, affiliation IDs, organization names, emails, names of supervisors/colleagues, etc.
-- Personalization: Are the questions tailored specifically to the interviewee? (i.e., highly relevant to the interviewee's specific experiences and previous answers).
-- Coherence: Is there a high degree of interconnection between the questions?
-- Contradiction Handling: If contradictions or questionable points arose in previous dialogue, did the interviewer follow up with questions related to those contradictions?

-
-

-

-

-

-- External Knowledge Dependency: Asking questions that rely on general external knowledge rather than the interviewee's own information and experiences.

-

-- Exception: If the question is closely linked to an event/experience the interviewee was directly involved in, it is acceptable. (e.g., Interviewee says "I am the founder of Google" → "When was Google founded?"). Evaluation must consider the context of previous questions and answers.

-

-


-#
-When evaluating the interviewer, focus only on the interviewer's questioning ability. Do not let the quality of the interviewee's answers influence your score.

-

 📧 minskim010203@gmail.com, imsujeong2190@gmail.com
 Thank you for participating in the labeling!

+You will review the interview transcripts of **two different AI interviewers, A and B**, with the same interviewee, and evaluate the AI interviewers' questioning ability.

 You have two tasks, as follows.

+* Of the two interviewers (A, B), decide which one asked better questions.
+* Rate the quality of each interviewer on a 5-point scale.

+# Evaluation Criteria
 The criteria for evaluating questioning ability are as follows. An interviewer who satisfies all five items below is the best questioner.

+### Criteria for Good Questions
+* Did the interviewer keep asking about one topic until the answers became sufficiently specific?
+* Did the interviewer mainly ask questions that draw out verifiable information? (That is, are they questions whose answers can be checked for contradictions or verified through external search?)

 (e.g., questions about "dates, addresses, affiliation IDs, organization names, emails, names of related people such as a supervisor at a former company")
+* Are the questions tailored to the interviewee? (That is, are they closely related to the interviewee's specific experiences and answers?)
+* Are the questions closely interrelated?
+* If a contradiction or questionable point came up earlier in the conversation, did the interviewer ask many questions related to that contradiction?

+### Cases of Poor Questioning
 Conversely, the cases of poor questioning are as follows.
+* Jumping straight to a completely different topic before the current one has become sufficiently specific
+* Asking abstract questions whose answers are hard to check for contradictions or factual accuracy

 (e.g., "What is your hobby?", "What is the most important value in your life?")
+* Asking questions that have little to do with the interviewee's own information and experience and must be answered using external knowledge

 (e.g., "I work at Google." → "When was Google founded?")
+* Exception: questions closely tied to an event, incident, or experience the interviewee took part in directly are allowed, as in "I am the founder of Google." → "When was Google founded?". The evaluation must therefore take the previous questions and answers into account.
+* Questions so loosely related to each other that mutual contradictions are hard to judge
+* Moving on to an unrelated question even though a contradiction was found earlier in the conversation

+# Notes
+* When evaluating an interviewer, assess only the interviewer's questioning ability and disregard the interviewee's answers. Focus on the pattern and quality of the questions, not the answers.
+* Consider the overall questioning strategy, not just individual questions.
+
+# P.S.
+* You may use Chrome's translation feature to translate the text into Korean before evaluating!

 📧 minskim010203@gmail.com, imsujeong2190@gmail.com

 ---

 # Labeling Guideline
+
 Thank you for participating in this labeling project!

+You will be reviewing interview transcripts of **two different AI interviewers (A and B)** conducting sessions with the same interviewee. Your task is to evaluate the questioning capabilities of these AI interviewers.
+
+There are two main tasks to complete:
+
+* **Comparison:** Determine which of the two interviewers (A or B) demonstrates superior questioning skills.
+* **Rating:** Rate the quality of each interviewer on a 5-point scale.

+---
+
+# Evaluation Criteria

+The quality of an interviewer is judged by the following criteria. An ideal interviewer satisfies all five of the points listed below.

+### Criteria for Good Questions

+* **Depth:** Does the interviewer continue questioning a single topic until the responses are sufficiently detailed and specific?
+* **Verifiability:** Do the questions focus on extracting verifiable information? (i.e., information that can reveal contradictions or be verified via external search).
+* *e.g., Questions regarding dates, addresses, affiliation IDs, organization names, emails, or names of supervisors/colleagues.*


+* **Personalization:** Are the questions tailored to the interviewee? (i.e., highly relevant to the interviewee's specific experiences and previous answers).
+* **Cohesion:** Is there a high degree of logical interconnection between the questions?
+* **Critical Follow-up:** If contradictions or questionable points arose in previous dialogue, did the interviewer ask follow-up questions specifically addressing those inconsistencies?

+### Indicators of Poor Questioning

+An interviewer is considered less effective if they exhibit the following:

+* **Abrupt Topic Switching:** Moving to a completely different topic before the current subject has been sufficiently explored.
+* **Abstract Questions:** Asking questions that make it difficult to verify facts or detect contradictions.
+* *e.g., "What are your hobbies?", "What is the most important value in your life?"*


+* **External Knowledge Dependency:** Asking questions that rely on general external knowledge rather than the interviewee's personal information or experiences.
+* *e.g., "I work at Google." → "In what year was Google founded?"*
+* **Exception:** If the question relates to an event the interviewee was directly involved in, it is acceptable. (e.g., "I am the founder of Google." → "In what year was Google founded?") Please evaluate by considering the context of the previous dialogue.


+* **Low Relevancy:** Questions lack connection to one another, making it difficult to judge internal consistency or logic.
+* **Ignoring Contradictions:** Moving on to unrelated questions even after a contradiction was clearly detected in the previous dialogue.
+
+---

+# Important Notes

+* **Evaluate the Interviewer Only:** When evaluating, do not judge the interviewee's answers. Focus solely on the pattern and quality of the **interviewer's questions**.
+* **Strategy over Items:** Consider the overall questioning strategy and flow, not just individual questions in isolation.

+# P.S.

+* You may use the Chrome translation feature to translate the text into Korean while performing your evaluation!

 📧 minskim010203@gmail.com, imsujeong2190@gmail.com
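The two tasks in the guideline (an A/B comparison and a 5-point rating for each interviewer) could be recorded in a small structure like the one below. This is a minimal sketch and not part of the README: the `InterviewLabel` name, its fields, the assumption that the scale runs from 1 to 5, and the `is_consistent` check are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

# Assumption: the guideline only says "5-point scale"; 1..5 is a guess.
SCALE_MIN, SCALE_MAX = 1, 5
VALID_CHOICES = ("A", "B")


@dataclass(frozen=True)
class InterviewLabel:
    """One annotator's judgment for a pair of transcripts (hypothetical schema)."""

    preferred: str  # "A" or "B": the interviewer judged to ask better questions
    rating_a: int   # quality rating for interviewer A
    rating_b: int   # quality rating for interviewer B

    def __post_init__(self):
        # Reject anything other than the two interviewers named in the guideline.
        if self.preferred not in VALID_CHOICES:
            raise ValueError(f"preferred must be 'A' or 'B', got {self.preferred!r}")
        # Reject ratings outside the assumed 1..5 range.
        for name, value in (("rating_a", self.rating_a), ("rating_b", self.rating_b)):
            if not SCALE_MIN <= value <= SCALE_MAX:
                raise ValueError(f"{name} must be in {SCALE_MIN}..{SCALE_MAX}, got {value}")

    def is_consistent(self) -> bool:
        """True when the preferred interviewer is not rated below the other one.

        The guideline does not require this; it is only a sanity check
        an annotation tool might choose to run.
        """
        if self.preferred == "A":
            return self.rating_a >= self.rating_b
        return self.rating_b >= self.rating_a
```

For example, `InterviewLabel("A", 4, 2).is_consistent()` evaluates to `True`, while a label that prefers B but rates A higher would come back `False` and could be flagged for a second look.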