Niketjain2002 committed on
Commit 77fa381 · verified · 1 Parent(s): 7cbf44e

Upload src/prompts/scoring.py with huggingface_hub

Files changed (1)
  1. src/prompts/scoring.py +156 -0
src/prompts/scoring.py ADDED
@@ -0,0 +1,156 @@
+ """
+ LLM prompt templates for probability scoring.
+
+ These prompts take the structured match analysis and produce
+ calibrated probability estimates with reasoning.
+ """
+
+ PROBABILITY_SCORING_PROMPT = """You are a hiring outcome prediction system. You think like a hiring data scientist,
+ not a recruiter. You are calibrated to be CONSERVATIVE.
+
+ MATCH ANALYSIS:
+ {match_analysis}
+
+ CALIBRATION RULES (YOU MUST FOLLOW THESE):
+ 1. Average candidates score between 35% and 55% overall
+ 2. Only truly exceptional candidates exceed 75%
+ 3. One missing critical skill caps shortlist probability at 40%
+ 4. Two or more missing critical skills cap shortlist probability at 25%
+ 5. Compensation mismatch caps offer acceptance at 35%
+ 6. Average tenure under 12 months caps retention at 40%
+ 7. No candidate gets above 92% on any dimension
+ 8. No candidate gets below 5% on any dimension (unless hard disqualified)
+ 9. These are PROBABILITIES of outcomes, not quality scores
+
+ BASE RATES (anchor your estimates here):
+ - Only ~15% of applicants get shortlisted
+ - Only ~25% of interviewed candidates pass
+ - ~70% of candidates who receive offers accept
+ - ~85% of hires stay past 6 months
+ - Overall P(hire) for a random applicant is ~3%
+
+ SCORING FRAMEWORK:
+
+ For SHORTLIST probability, weight these:
+ - Skill coverage (30%): Do they have the must-have skills?
+ - Experience depth (25%): Do they have enough relevant experience?
+ - Seniority alignment (20%): Right level for the role?
+ - Impact evidence (15%): Have they demonstrated results?
+ - Domain relevance (10%): Industry/domain knowledge?
+
+ For OFFER ACCEPTANCE probability, weight these:
+ - Compensation alignment (30%): Will the comp work?
+ - Career trajectory fit (25%): Is this a logical next step?
+ - Company stage fit (20%): Are they drawn to this type of company?
+ - Location fit (15%): Does the location/remote setup work?
+ - Role scope appeal (10%): Is the scope interesting for them?
+
+ For RETENTION (6-month) probability, weight these:
+ - Tenure history (25%): Do they tend to stay?
+ - Growth room (25%): Can they grow in this role?
+ - Scope alignment (20%): Is the scope right (not too big or small)?
+ - Company stage fit (15%): Will they thrive in this environment?
+ - Overqualification risk (15%): Will they get bored?
+
+ For OVERALL HIRE probability:
+ P(hire) = P(shortlist) × P(interview_pass | shortlist) × P(offer_accept | offer)
+ Where P(interview_pass | shortlist) is estimated from skill depth and impact evidence.
+
+ OUTPUT THIS EXACT JSON:
+
+ {{
+   "shortlist_probability": {{
+     "value": number (5-92),
+     "component_scores": {{
+       "skill_coverage": number (0-100),
+       "experience_depth": number (0-100),
+       "seniority_alignment": number (0-100),
+       "impact_evidence": number (0-100),
+       "domain_relevance": number (0-100)
+     }},
+     "primary_driver": "string explaining main factor",
+     "hard_caps_applied": ["list of cap rules triggered, if any"]
+   }},
+   "interview_pass_estimate": {{
+     "value": number (10-80),
+     "reasoning": "string"
+   }},
+   "offer_acceptance_probability": {{
+     "value": number (5-92),
+     "component_scores": {{
+       "compensation_alignment": number (0-100),
+       "career_trajectory_fit": number (0-100),
+       "company_stage_fit": number (0-100),
+       "location_fit": number (0-100),
+       "role_scope_appeal": number (0-100)
+     }},
+     "primary_driver": "string",
+     "hard_caps_applied": []
+   }},
+   "retention_6m_probability": {{
+     "value": number (5-92),
+     "component_scores": {{
+       "tenure_history": number (0-100),
+       "growth_room": number (0-100),
+       "scope_alignment": number (0-100),
+       "company_stage_fit": number (0-100),
+       "overqualification_risk": number (0-100)
+     }},
+     "primary_driver": "string",
+     "hard_caps_applied": []
+   }},
+   "overall_hire_probability": {{
+     "value": number (5-92),
+     "formula_inputs": {{
+       "p_shortlist": number,
+       "p_interview_pass": number,
+       "p_offer_accept": number
+     }},
+     "explanation": "string"
+   }},
+   "confidence_level": "low | medium | high",
+   "confidence_reasoning": "string explaining confidence assessment"
+ }}
+
+ IMPORTANT:
+ - Show your work: the component_scores should be traceable to match_analysis data
+ - Apply hard caps BEFORE computing final values
+ - State which caps were triggered
+ - P(overall) must be mathematically derivable from its components
+ - Think about what ACTUALLY predicts hiring outcomes, not what looks good on paper
+ """
+
+ EXPLANATION_PROMPT = """Given the scoring results and match analysis, produce a concise
+ human-readable explanation of the hiring probability assessment.
+
+ SCORING RESULTS:
+ {scoring_results}
+
+ MATCH ANALYSIS:
+ {match_analysis}
+
+ Produce JSON:
+
+ {{
+   "reasoning_summary": "2-3 sentence summary of the overall assessment",
+   "positive_signals": [
+     "Each signal as a clear, evidence-backed statement (max 6)"
+   ],
+   "risk_signals": [
+     "Each risk as a clear, evidence-backed statement (max 6)"
+   ],
+   "missing_signals": [
+     "Important information that was unavailable for scoring (max 4)"
+   ],
+   "recommendation": "strong_pass | pass | borderline | no_pass | strong_no_pass",
+   "key_interview_questions": [
+     "3-5 specific questions to validate uncertain signals"
+   ]
+ }}
+
+ Rules:
+ - Every signal must reference specific evidence from the data
+ - Do not mention age, gender, ethnicity, university prestige, or personal characteristics
+ - Be specific, not generic (bad: "good experience" / good: "4 years leading distributed systems teams of 8+")
+ - Missing signals should be things that would materially change the assessment
+ """
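
A note on using these templates: the doubled braces (`{{` and `}}`) suggest they are meant to be filled with `str.format`, which leaves doubled braces as literal braces and substitutes only `{match_analysis}` and `{scoring_results}`. The commit does not show the calling code, so the sketch below is an assumption about how a caller might render a template and sanity-check a response. The names `MINI_TEMPLATE`, `render`, `capped_shortlist`, `overall_hire`, and `is_derivable` are hypothetical and not part of `scoring.py`. It also assumes that `formula_inputs` are percentages on a 0-100 scale and that rule 3 refers to exactly one missing critical skill.

```python
import json

# Hypothetical miniature of the templates in the diff: doubled braces come out
# of str.format as literal braces, while {match_analysis} is substituted.
MINI_TEMPLATE = 'MATCH ANALYSIS:\n{match_analysis}\n\nOUTPUT:\n{{"value": 0}}'


def render(template: str, match_analysis: dict) -> str:
    """Fill the single placeholder with pretty-printed JSON."""
    return template.format(match_analysis=json.dumps(match_analysis, indent=2))


def capped_shortlist(raw: float, missing_critical: int) -> float:
    """Apply calibration rules 3, 4, 7 and 8 to a raw shortlist percentage."""
    ceiling = 92.0                      # rule 7: nothing above 92%
    if missing_critical >= 2:
        ceiling = 25.0                  # rule 4
    elif missing_critical == 1:
        ceiling = 40.0                  # rule 3 (assumed to mean exactly one)
    return max(5.0, min(raw, ceiling))  # rule 8: floor of 5%


def overall_hire(p_shortlist: float, p_interview_pass: float,
                 p_offer_accept: float) -> float:
    """P(hire) as a percentage, from three percentages (assumed 0-100 scale)."""
    return p_shortlist * p_interview_pass * p_offer_accept / 10_000


def is_derivable(response: dict, tolerance: float = 1.0) -> bool:
    """Check that the reported overall value matches the product of its inputs."""
    overall = response["overall_hire_probability"]
    inputs = overall["formula_inputs"]
    expected = overall_hire(
        inputs["p_shortlist"],
        inputs["p_interview_pass"],
        inputs["p_offer_accept"],
    )
    return abs(overall["value"] - expected) <= tolerance


if __name__ == "__main__":
    print(render(MINI_TEMPLATE, {"skills": ["python"]}))
    print(capped_shortlist(80, missing_critical=1))
    print(overall_hire(15, 25, 70))
```

With the base rates from the prompt, `overall_hire(15, 25, 70)` gives about 2.6, in line with the ~3% quoted for a random applicant. That is below the 5 lower bound listed for `overall_hire_probability.value`, so a caller may want to decide explicitly whether that bound applies to the product or only to its inputs.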