Spaces: human-labeling/README
tnwjddla2190 committed on Dec 23, 2025

Commit df6357d (verified) · 1 Parent(s): 8eec9b4

Update README.md

Files changed (1)

README.md +58 -46

@@ -12,91 +12,103 @@ sdk: static
 레이블링에 참여해주셔서 감사합니다!
-여러분은 동일한 면접 대상자에 대한 **서로 다른 두 AI 면접관 (A , B)**의 인터뷰 대화 기록을 보고, AI 면접관의 질문 능력을 평가하게 됩니다.
 여러분이 해주셔야 할 태스크는 아래와 같이 두 가지입니다.
-- 서로 다른 두 면접관 (A , B) 중, 어떤 면접관이 더 나은 질문을 하는지 판단하세요.
-- 각 면접관의 자질을 5점 척도로 평가해 주세요.
-# Evaluation Guideline
 질문 능력 평가 기준은 다음과 같습니다. 아래 다섯가지 항목에 대해 모두 충족할 경우 가장 질문을 잘한 면접관입니다.
-### Very Good 기준
-- 한 주제에 대해 답변이 충분히 구체화가 될 때까지 질문했는가
-- 검증 가능한 정보들을 뽑아낼 수 있는 질문 위주로 했는가 (= 모순을 판단할 수 있거나, 외부 검색을 통해 검증할 수 있을 만한 질문인가)
   (e.g., "날짜, 주소, 소속 ID, 기관 이름, 이메일, 다니는 회사 상사 등 관계자 이름" 관련 질문들)
-- 질문이 인터뷰이에 특화된 질문인가 (즉 인터뷰이의 구체적인 경험, 답변과 연관성이 높은 질문인가)
-- 질문들 간의 상호 연관성이 높은가
-- 이전 대화에서 모순이나 의문점이 발견되었을 경우, 발생한 모순과 관련된 질문을 많이 했는가
-### Very Poor 케이스
 반대로 질문을 못한 케이스는 다음과 같습니다.
-- 하나의 주제에 대해 충분히 구체화가 되지 않았는데 바로 완전히 다른 주제로 넘어가버린 경우
-- 모순 여부나 사실 관계를 검증하기 어려운 추상적인 질문을 한 경우
   (e.g., "너의 취미는 뭐야?", "너의 인생에서 가장 중요한 가치는 뭐야?")
-- 인터뷰이 본인의 정보 및 경험과는 연관이 낮고 외부 지식을 이용해 답변해야 하는 질문을 한 경우
   (e.g., "나는 구글에 다녀." → "구글 설립 연도는 언제야?")
-  - 예외) "나는 구글 창립자야." → "구글의 설립 연도는 언제야?" 처럼 인터뷰이가 직접 참여한 이벤트/사건/경험과 밀접한 질문은 허용함. 따라서 이전 질문과 답변들을 함께 고려해서 평가해야 함.
-- 질문들 사이의 관련성이 낮아 상호 모순을 판단하기 어려운 경우
-- 이전 대화에서 모순이 발견되었음에도 연관성 없는 다른 질문으로 넘어가버린 경우
-# Caution
-- 면접관을 평가할 때, 인터뷰이의 답변은 고려하지 않고 면접관의 질문 능력만을 평가합니다. 답변이 아닌 질문에 집중해주세요.
-- 개별 질문뿐만 아니라 전체적인 질문 전략을 고려하십시오.
 📧 minskim010203@gmail.com, imsujeong2190@gmail.com
 ---
 # Labeling Guideline
 Thank you for participating in this labeling project!
-In this task, you will review interview transcripts of two different AI interviewers (A and B) interacting with the same interviewee. Your goal is to evaluate the questioning capabilities of these AI interviewers.
-There are two main tasks you need to perform:
-Compare: Determine which of the two interviewers (A or B) asks better questions.
-Rate: Evaluate the quality of each interviewer on a 5-point scale.
-## Evaluation Guideline
-The criteria for evaluating questioning ability are as follows. An interviewer who satisfies all five of the following items is considered to have performed excellently.
-### "Very Good" Criteria
-- Depth: Did the interviewer continue questioning a single topic until the response became sufficiently specific?
-- Verifiability: Did the interviewer focus on questions that extract verifiable information? (i.e., information that can be checked for contradictions or verified through external search).
-- Examples: Questions regarding dates, addresses, affiliation IDs, organization names, emails, names of supervisors/colleagues, etc.
-- Personalization: Are the questions tailored specifically to the interviewee? (i.e., highly relevant to the interviewee's specific experiences and previous answers).
-- Coherence: Is there a high degree of interconnection between the questions?
-- Contradiction Handling: If contradictions or questionable points arose in previous dialogue, did the interviewer follow up with questions related to those contradictions?
-### "Very Poor" Cases
-Conversely, the following cases indicate poor questioning:
-- Premature Topic Switching: Moving to a completely different topic before a current topic has been sufficiently detailed.
-- Abstract Questions: Asking abstract questions that make it difficult to verify facts or detect contradictions.
-  - Examples: "What is your hobby?", "What is the most important value in your life?"
-- External Knowledge Dependency: Asking questions that rely on general external knowledge rather than the interviewee's own information and experiences.
-  - Example: Interviewee says "I work at Google" → Interviewer asks "When was Google founded?"
-  - Exception: If the question is closely linked to an event/experience the interviewee was directly involved in, it is acceptable. (e.g., Interviewee says "I am the founder of Google" → "When was Google founded?"). Evaluation must consider the context of previous questions and answers.
-- Low Relatedness: Questions that are so unrelated that it is impossible to judge mutual contradictions.
-- Ignoring Inconsistencies: Moving on to an unrelated question even though a contradiction was detected in the previous dialogue.
-# Caution
-When evaluating the interviewer, focus only on the interviewer's questioning ability. Do not let the quality of the interviewee's answers influence your score.
-Consider the overall questioning strategy throughout the transcript, not just individual questions in isolation.
 📧 minskim010203@gmail.com, imsujeong2190@gmail.com

 레이블링에 참여해주셔서 감사합니다!
+여러분은 동일한 면접 대상자에 대한 **서로 다른 두 AI 면접관 A와 B**의 인터뷰 대화 기록을 보고, AI 면접관의 질문 능력을 평가하게 됩니다.
 여러분이 해주셔야 할 태스크는 아래와 같이 두 가지입니다.
+* 서로 다른 두 면접관 (A , B) 중, 어떤 면접관이 더 나은 질문을 하는지 판단하세요.
+* 각 면접관의 자질을 5점 척도로 평가해 주세요.
+# 평가 기준
 질문 능력 평가 기준은 다음과 같습니다. 아래 다섯가지 항목에 대해 모두 충족할 경우 가장 질문을 잘한 면접관입니다.
+### 좋은 질문의 기준
+* 한 주제에 대해 답변이 충분히 구체화가 될 때까지 질문했는가
+* 검증 가능한 정보들을 뽑아낼 수 있는 질문 위주로 했는가 (= 모순을 판단할 수 있거나, 외부 검색을 통해 검증할 수 있을 만한 질문인가)
   (e.g., "날짜, 주소, 소속 ID, 기관 이름, 이메일, 다니는 회사 상사 등 관계자 이름" 관련 질문들)
+* 질문이 인터뷰이에 특화된 질문인가 (즉 인터뷰이의 구체적인 경험, 답변과 연관성이 높은 질문인가)
+* 질문들 간의 상호 연관성이 높은가
+* 이전 대화에서 모순이나 의문점이 발견되었을 경우, 발생한 모순과 관련된 질문을 많이 했는가
+### 질문을 잘하지 못한 경우
 반대로 질문을 못한 케이스는 다음과 같습니다.
+* 하나의 주제에 대해 충분히 구체화가 되지 않았는데 바로 완전히 다른 주제로 넘어가버린 경우
+* 모순 여부나 사실 관계를 검증하기 어려운 추상적인 질문을 한 경우
   (e.g., "너의 취미는 뭐야?", "너의 인생에서 가장 중요한 가치는 뭐야?")
+* 인터뷰이 본인의 정보 및 경험과는 연관이 낮고 외부 지식을 이용해 답변해야 하는 질문을 한 경우
   (e.g., "나는 구글에 다녀." → "구글 설립 연도는 언제야?")
+  * 예외) "나는 구글 창립자야." → "구글의 설립 연도는 언제야?" 처럼 인터뷰이가 직접 참여한 이벤트/사건/경험과 밀접한 질문은 허용함. 따라서 이전 질문과 답변들을 함께 고려해서 평가해야 함.
+* 질문들 사이의 관련성이 낮아 상호 모순을 판단하기 어려운 경우
+* 이전 대화에서 모순이 발견되었음에도 연관성 없는 다른 질문으로 넘어가버린 경우
+# 주의 사항
+* 면접관을 평가할 때, 인터뷰이의 답변은 고려하지 않고 면접관의 질문 능력만을 평가합니다. 답변이 아닌 질문의 양상과 퀄리티에 집중해주세요.
+* 개별 질문뿐만 아니라 전체적인 질문 전략을 고려하십시오.
+# P.S.
+* Chrome의 번역 기능을 사용해서 한글로 번역 후 평가하셔도 됩니다!
 📧 minskim010203@gmail.com, imsujeong2190@gmail.com
 ---
 # Labeling Guideline
 Thank you for participating in this labeling project!
+You will be reviewing interview transcripts of **two different AI interviewers (A and B)** conducting sessions with the same interviewee. Your task is to evaluate the questioning capabilities of these AI interviewers.
+There are two main tasks to complete:
+* **Comparison:** Determine which of the two interviewers (A or B) demonstrates superior questioning skills.
+* **Rating:** Rate the quality of each interviewer on a 5-point scale.
+---
+# Evaluation Criteria
+The quality of an interviewer is judged by the following criteria. An ideal interviewer satisfies all five of the points listed below.
+### Criteria for Good Questions
+* **Depth:** Does the interviewer continue questioning a single topic until the responses are sufficiently detailed and specific?
+* **Verifiability:** Do the questions focus on extracting verifiable information? (i.e., information that can reveal contradictions or be verified via external search).
+* *e.g., Questions regarding dates, addresses, affiliation IDs, organization names, emails, or names of supervisors/colleagues.*
+* **Personalization:** Are the questions tailored to the interviewee? (i.e., highly relevant to the interviewee’s specific experiences and previous answers).
+* **Cohesion:** Is there a high degree of logical interconnection between the questions?
+* **Critical Follow-up:** If contradictions or questionable points arose in previous dialogue, did the interviewer ask follow-up questions specifically addressing those inconsistencies?
+### Indicators of Poor Questioning
+An interviewer is considered less effective if they exhibit the following:
+* **Abrupt Topic Switching:** Moving to a completely different topic before the current subject has been sufficiently explored.
+* **Abstract Questions:** Asking questions that make it difficult to verify facts or detect contradictions.
+* *e.g., "What are your hobbies?", "What is the most important value in your life?"*
+* **External Knowledge Dependency:** Asking questions that rely on general external knowledge rather than the interviewee's personal information or experiences.
+* *e.g., "I work at Google." → "In what year was Google founded?"*
+* **Exception:** If the question relates to an event the interviewee was directly involved in, it is acceptable. (e.g., "I am the founder of Google." → "In what year was Google founded?") Please evaluate by considering the context of the previous dialogue.
+* **Low Relevancy:** Questions lack connection to one another, making it difficult to judge internal consistency or logic.
+* **Ignoring Contradictions:** Moving on to unrelated questions even after a contradiction was clearly detected in the previous dialogue.
+---
+# Important Notes
+* **Evaluate the Interviewer Only:** When evaluating, do not judge the interviewee’s answers. Focus solely on the pattern and quality of the **interviewer’s questions**.
+* **Strategy over Items:** Consider the overall questioning strategy and flow, not just individual questions in isolation.
+# P.S.
+* You may use the Chrome translation feature to translate the text into Korean while performing your evaluation!
 📧 minskim010203@gmail.com, imsujeong2190@gmail.com