Update README.md
README.md
CHANGED
|
@@ -8,61 +8,92 @@ pinned: false
|
|
| 8 |
sdk: static
|
| 9 |
---
|
| 10 |
|
| 11 |
-
#
|
| 12 |
|
| 13 |
-
|
| 14 |
|
| 15 |
-
|
|
|
|
| 16 |
|
| 17 |
-
|
| 18 |
-
|
| 19 |
-
3. repeating some questions for clarification
|
| 20 |
|
| 21 |
-
|
| 22 |
-
|
| 23 |
|
| 24 |
-
|
| 25 |
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
|
| 29 |
-
-
|
| 30 |
-
|
| 31 |
-
|
|
|
|
| 32 |
|
| 33 |
-
|
| 34 |
-
|
| 35 |
|
| 36 |
-
You can also leave a comment under "🫱Community" --> ["Bug/Error Report during Interview"](https://huggingface.co/spaces/human-labeling/README/discussions/1) in the top right corner of this guideline.
|
| 37 |
|
| 38 |
📧 minskim010203@gmail.com, imsujeong2190@gmail.com
|
| 39 |
|
| 40 |
---
|
| 41 |
-
# Interview Guidelines
|
| 42 |
|
| 43 |
-
|
| 44 |
|
| 45 |
-
|
| 46 |
|
| 47 |
-
|
| 48 |
-
|
| 49 |
-
3. Re-asking some questions for confirmation
|
| 50 |
|
| 51 |
-
|
| 52 |
-
Before starting the interview, **please make sure you are familiar with the precautions below.**
|
| 53 |
|
| 54 |
-
|
| 55 |
|
| 56 |
-
-
|
| 57 |
-
- Participants' personal information is collected solely for research purposes; please note that **it is protected and will not be used in any other way**.
|
| 58 |
-
- Collected personal information is deleted immediately when the project ends.
|
| 59 |
-
- When you answer a question, answer truthfully; when you decline to answer, state your intent clearly.
|
| 60 |
-
- You may simply refuse to answer a question, or you may change the subject (e.g., "Could we talk about a different topic?").
|
| 61 |
-
- **Do not refresh the page or close the window!** The interview record will be lost and you will have to start over from the beginning.
|
| 62 |
|
| 63 |
-
|
| 64 |
|
| 65 |
-
|
| 66 |
|
| 67 |
|
| 68 |
📧 minskim010203@gmail.com, imsujeong2190@gmail.com
|
|
|
|
| 8 |
sdk: static
|
| 9 |
---
|
| 10 |
|
| 11 |
+
# Labeling Guideline
|
| 12 |
|
| 13 |
+
Thank you for participating in the labeling!
|
| 14 |
|
| 15 |
+
In this labeling task, you will read the interview transcripts of two different AI interviewers questioning the same interviewee, and evaluate each AI interviewer's questioning ability.
|
| 16 |
+
There are two tasks for you to complete, as follows.
|
| 17 |
|
| 18 |
+
- Of the two different interviewers, judge which one asked better questions.
|
| 19 |
+
- Rate the quality of each interviewer on a 5-point scale.
|
|
|
|
| 20 |
|
| 21 |
+
# Evaluation Guideline
|
| 22 |
+
The criteria for evaluating questioning ability are as follows. An interviewer who satisfies all five of the items below is the one who asked the best questions.
|
| 23 |
|
| 24 |
+
### Very Good Criteria
|
| 25 |
+
- Did the interviewer keep asking about a topic until the answer was sufficiently specific?
|
| 26 |
+
- Did the interviewer focus on questions that extract verifiable information? (= questions whose answers can be checked for contradictions or verified through an external search)
|
| 27 |
+
|
| 28 |
+
(e.g., questions about "dates, addresses, affiliation IDs, organization names, emails, or names of related people such as a supervisor at the interviewee's company")
|
| 29 |
+
- Did the questions ask about the interviewee's own information?
|
| 30 |
+
- Were the questions highly interrelated?
|
| 31 |
+
- When a contradiction or questionable point appeared earlier in the conversation, did the interviewer ask many questions related to that contradiction?
|
| 32 |
|
| 33 |
+
### Very Bad Cases
|
| 34 |
+
Conversely, cases of poor questioning are as follows.
|
| 35 |
+
- Jumping straight to a completely different topic before the current topic has been made sufficiently specific
|
| 36 |
+
- Asking abstract questions that make it hard to check for contradictions or verify facts
|
| 37 |
+
|
| 38 |
+
(e.g., "What is your hobby?", "What is the most important value in your life?")
|
| 39 |
+
- Asking questions that have little to do with the interviewee personally
|
| 40 |
|
| 41 |
+
(e.g., "I work at Google." -> "When is the Google CEO's birthday?")
|
| 42 |
+
- Asking questions so loosely related to each other that mutual contradictions are hard to judge
|
| 43 |
+
- Moving on to an unrelated question even though a contradiction was found earlier in the conversation
|
| 44 |
+
|
| 45 |
+
### Scoring Summary
|
| 46 |
+
If all of the Very Good criteria are met, give Very Good; if 1-2 fall short, Good; if about 3 are not met, So-so; if only 1 is met, Bad.
|
| 47 |
+
If none of the criteria are met, give Very Bad.
|
| 48 |
|
|
|
|
| 49 |
|
| 50 |
📧 minskim010203@gmail.com, imsujeong2190@gmail.com
|
| 51 |
|
| 52 |
---
|
|
|
|
| 53 |
|
| 54 |
+
# Labeling Guideline
|
| 55 |
+
Thank you for participating in this labeling project!
|
| 56 |
+
|
| 57 |
+
In this task, you will review interview transcripts of two different AI interviewers engaging with the same interviewee. Your goal is to evaluate the questioning capabilities of each AI interviewer.
|
| 58 |
+
|
| 59 |
+
There are two main tasks you need to perform:
|
| 60 |
+
|
| 61 |
+
Compare and Choose: Determine which of the two interviewers asks better questions.
|
| 62 |
+
|
| 63 |
+
Score: Rate the quality of each interviewer on a 5-point scale.
|
| 64 |
+
|
| 65 |
+
# Evaluation Guideline
|
| 66 |
+
The criteria for evaluating questioning ability are as follows. An interviewer who meets all five of the following items is considered to have performed excellently.
|
| 67 |
+
|
| 68 |
+
### Criteria for "Very Good"
|
| 69 |
+
- Depth of Inquiry: Did the interviewer continue questioning until the response regarding a specific topic was sufficiently detailed?
|
| 70 |
+
|
| 71 |
+
- Verifiability: Did the interviewer focus on questions that extract verifiable information? (i.e., questions that allow for detecting contradictions or can be cross-referenced via external search, such as dates, addresses, IDs, organization names, emails, or names of supervisors/colleagues).
|
| 72 |
+
|
| 73 |
+
- Relevance: Were the questions specifically tailored and relevant to the interviewee?
|
| 74 |
+
|
| 75 |
+
- Cohesion: Is there a high degree of interconnection between the questions?
|
| 76 |
|
| 77 |
+
- Conflict Resolution: If contradictions or questionable points arose in previous dialogue, did the interviewer follow up with multiple questions related to those contradictions?
|
| 78 |
|
| 79 |
+
### "Very Bad" Cases
|
| 80 |
+
Conversely, the following are examples of poor questioning:
|
|
|
|
| 81 |
|
| 82 |
+
- Premature Topic Switching: Moving to a completely different topic before a single subject has been sufficiently explored.
|
|
|
|
| 83 |
|
| 84 |
+
- Abstract Questions: Asking vague questions where it is difficult to verify facts or detect contradictions (e.g., "What are your hobbies?", "What is the most important value in your life?").
|
| 85 |
|
| 86 |
+
- Low Personal Relevance: Asking questions with little connection to the interviewee (e.g., Interviewee: "I work at Google." -> Interviewer: "When is the Google CEO's birthday?").
|
| 87 |
|
| 88 |
+
- Lack of Correlation: Questions are so unrelated that it is impossible to judge internal consistency or contradictions.
|
| 89 |
|
| 90 |
+
- Ignoring Discrepancies: Moving on to unrelated questions even after a contradiction was detected in the previous conversation.
|
| 91 |
|
| 92 |
+
### Scoring Summary
|
| 93 |
+
Very Good: Meets all "Very Good" criteria.
|
| 94 |
+
Good: Misses 1–2 criteria or has minor room for improvement.
|
| 95 |
+
So-so: Fails to meet approximately 3 of the criteria.
|
| 96 |
+
Bad: Meets only 1 of the criteria.
|
| 97 |
+
Very Bad: Meets none of the criteria.
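
The scoring summary above can be read as a mapping from the number of "Very Good" criteria met to a label. The following is a minimal sketch of that reading, not part of the official labeling tool. It assumes each of the five criteria is judged as simply met or not met, and it treats "approximately 3" unmet criteria as exactly 3. The criterion identifiers and the function name `rate_interviewer` are hypothetical names chosen for illustration.

```python
from typing import Mapping

# Hypothetical identifiers for the five "Very Good" criteria in this guideline.
CRITERIA = (
    "depth_of_inquiry",
    "verifiability",
    "relevance",
    "cohesion",
    "conflict_resolution",
)


def rate_interviewer(met: Mapping[str, bool]) -> str:
    """Map per-criterion met/not-met judgments to a 5-point label.

    Criteria missing from `met` are counted as not met (an assumption).
    """
    unknown = set(met) - set(CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")

    count = sum(1 for name in CRITERIA if met.get(name, False))
    if count == len(CRITERIA):
        return "Very Good"  # meets all five criteria
    if count >= 3:
        return "Good"       # misses 1-2 criteria
    if count == 2:
        return "So-so"      # fails about 3 criteria
    if count == 1:
        return "Bad"        # meets only 1 criterion
    return "Very Bad"       # meets none


if __name__ == "__main__":
    judgments = {"depth_of_inquiry": True, "verifiability": True, "relevance": True}
    print(rate_interviewer(judgments))
```

In practice a labeler may still adjust the label by judgment, since the summary allows "minor room for improvement" to count toward Good; the sketch only captures the counting rule.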
|
| 98 |
|
| 99 |
📧 minskim010203@gmail.com, imsujeong2190@gmail.com
|