	---
	license: mit
	library_name: sklearn
	pipeline_tag: tabular-classification
	tags:
	- tabular-classification
	- hr-analytics
	- fairness
	- logistic-regression
	- ai-governance
	- model-card
	datasets:
	- REPLACE-WITH-YOUR-ORG/retainiq-data
	---

	# Model Card — RetainIQ (v2.0, bias-mitigated)

	*Employee-attrition risk model · Hireloom, Inc. · Structured per Mitchell et al. (2019),
	"Model Cards for Model Reporting." Worked example for The AI Governance Lab.
	Educational use only — not for real employment decisions.*

	## 1. Model Details

	| Field | Value |
	|---|---|
	| Model name | RetainIQ: Employee Attrition Risk Model |
	| Owner | Hireloom, Inc. (both provider and deployer: self-oversight / dogfooding) |
	| Version / date | v2.0 · June 2026 (supersedes v1.0 biased champion; see §8) |
	| Model type | Binary classification: logistic regression (`C=0.1`, `class_weight="balanced"`) |
	| Bias mitigation | Fairlearn CorrelationRemover on Age only (the ADEA-protected attribute) |
	| Pipeline | One-Hot + CorrelationRemover(Age) + StandardScaler → LogReg; 50 features; scikit-learn 1.7.2; seed 42 |
	| Artifact | `RetainIQ_Champion_AgeOnly.pkl` (8,969 B) + readable coefficients (`_Inside.csv`) |
	| Integrity (SHA-256) | `32a1a21bf9b4984eb5466f955bbef9a729a5535ca81ea279fb48a23513d74057` |
	| Contact | [Hireloom AI Governance lead (placeholder)] |
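
The integrity digest in the table can be checked before the artifact is loaded. The following is a minimal sketch using only the Python standard library; the function names and the local file path are assumptions for illustration, not part of the released files.

```python
import hashlib
from pathlib import Path

# Digest published in the "Integrity (SHA-256)" row of the table above.
EXPECTED_SHA256 = "32a1a21bf9b4984eb5466f955bbef9a729a5535ca81ea279fb48a23513d74057"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, expected=EXPECTED_SHA256):
    """Return True only if the file's digest matches the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

A caller would run something like `verify_artifact("RetainIQ_Champion_AgeOnly.pkl")` (path assumed) and refuse to unpickle on `False`. This matters because loading a pickle file can execute arbitrary code.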

	## 2. Intended Use
	- Primary use: flag employees at elevated attrition risk so HR can prioritize retention outreach. Decision-SUPPORT only; scored in a monthly batch and delivered as a risk-ranked list cut to HR capacity.
	- Intended users: Hireloom HR / People team and people-managers.
	- Out of scope: NOT for termination, discipline, compensation denial, or any adverse action; not autonomous; not validated outside Hireloom's workforce.

	## 3. Factors
	Protected groups assessed: Age (split at the ADEA-protected threshold of 40), Gender, Marital Status. The model outputs a probability, which is converted to a flag via a chosen threshold.

	## 4. Metrics
	ROC-AUC and PR-AUC (threshold-independent), plus recall, precision, and F1 at a 0.50 threshold. Because leavers are only ~16% of employees, PR-AUC and recall are prioritized over plain accuracy. The production operating threshold is a separate, capacity-driven choice set by HR.
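
To make the thresholded metrics concrete, here is a small pure-Python sketch of precision, recall, and F1 at a given threshold. The labels and scores are hypothetical toy values, not RetainIQ outputs, and flagging scores at or above the threshold is a convention chosen for this sketch.

```python
def threshold_metrics(y_true, scores, threshold=0.50):
    """Precision, recall and F1 for the positive class at one threshold."""
    # Convention assumed here: a score at or above the threshold is flagged.
    flags = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, f in zip(y_true, flags) if y == 1 and f == 1)
    fp = sum(1 for y, f in zip(y_true, flags) if y == 0 and f == 1)
    fn = sum(1 for y, f in zip(y_true, flags) if y == 1 and f == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data: 1 = leaver, 0 = stayer.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.91, 0.62, 0.55, 0.40, 0.35, 0.30, 0.20, 0.10]
metrics = threshold_metrics(y_true, scores)
```

Here three employees are flagged, two of them true leavers, and one leaver is missed, so precision, recall, and F1 all equal 2/3. Moving `threshold` changes all three values, which is why the card reports the threshold-independent AUCs separately.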

	## 5. Evaluation Data
	Sealed test set: 294 employees (20% stratified hold-out, seed 42), of whom 47 (16.0%) are actual leavers. Quarantined from Module 5; opened once for the final grade.
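
A stratified hold-out keeps the leaver rate roughly equal in both partitions. The sketch below illustrates the idea in pure Python on hypothetical toy labels; it is not the splitting code used for RetainIQ and will not reproduce the card's exact split, even with the same seed.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.20, seed=42):
    """Return (train_idx, test_idx), splitting each class at test_fraction."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)  # seeded shuffle within each class
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical toy labels: 16 leavers (1) among 100 employees.
labels = [1] * 16 + [0] * 84
train_idx, test_idx = stratified_split(labels)
```

With these toy labels the test partition holds 20 employees, 3 of them leavers, and the training partition holds the other 80.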

	## 6. Training Data
	1,176 employees (80% stratified split), 16.2% leavers. Protected attributes were retained through preprocessing so that fairness could be measured. The deployed pipeline then applies an age-only CorrelationRemover, which removes the linear correlation between Age and the other features and drops the Age column; Gender and Marital Status are retained.
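
The idea behind the age-only correlation-removal step can be illustrated for a single sensitive column: each feature is replaced by its residual after a least-squares fit on the centered sensitive values. This is a pure-Python sketch of that projection under stated assumptions, not Fairlearn's implementation; the function name, feature names, and data are hypothetical.

```python
def remove_linear_correlation(columns, sensitive):
    """Residualize each feature column against one sensitive column.

    For each column x, subtract beta * (s - mean(s)), where beta is the
    least-squares slope of x on the centered sensitive values.
    """
    n = len(sensitive)
    s_mean = sum(sensitive) / n
    s_centered = [s - s_mean for s in sensitive]
    s_ss = sum(c * c for c in s_centered)
    if s_ss == 0:
        raise ValueError("sensitive column is constant")
    adjusted = {}
    for name, values in columns.items():
        beta = sum(v * c for v, c in zip(values, s_centered)) / s_ss
        adjusted[name] = [v - beta * c for v, c in zip(values, s_centered)]
    return adjusted


# Hypothetical toy data: Age is the sensitive column and is not returned.
age = [25, 32, 41, 47, 53, 60]
features = {
    "tenure_years": [1, 4, 9, 12, 20, 25],
    "overtime_hours": [10, 3, 8, 2, 6, 4],
}
adjusted = remove_linear_correlation(features, age)
```

After the adjustment, each returned column has zero sample covariance with `age` while keeping its original mean. Only linear dependence is removed; non-linear relationships with age can remain, which is one reason the card still audits age fairness on held-out data.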

	## 7. Quantitative Analyses
	Sealed-test performance of the deployed v2.0 (age-only) champion, with the biased v1.0 for comparison:

	| Metric | Deployed v2.0 (age-only) | Biased v1.0 (rejected) |
	|---|---|---|
	| ROC-AUC | 0.811 | 0.812 |
	| PR-AUC | 0.534 | 0.588 |
	| Recall | 0.702 | 0.681 |
	| Precision | 0.407 | 0.381 |
	| F1 | 0.516 | 0.489 |

	Held-out fairness (equalized-odds difference, EOD; lower = fairer): Age 40+ 0.045 (was 0.369, so the gap closed; 40+ recall rose from 0.44 to 0.69); Marital Status 0.281 (improved); Gender 0.026 (held fair). Caveats: PR-AUC is lower than v1.0 (0.534 vs 0.588), so the gain is clearest at the chosen operating point; per-group counts are small (40+ n=16, Divorced n=4), so the per-group estimates carry wide uncertainty.
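
For readers unfamiliar with the metric, the sketch below computes an equalized-odds difference as the larger of the between-group gaps in true-positive rate and false-positive rate. That is the definition assumed here; the card does not state which implementation produced its figures. All data are hypothetical toy values.

```python
def _rate(numerator, denominator):
    return numerator / denominator if denominator else 0.0


def group_rates(y_true, y_pred, groups):
    """Per-group true-positive rate (TPR) and false-positive rate (FPR)."""
    counts = {}
    for y, p, g in zip(y_true, y_pred, groups):
        c = counts.setdefault(g, {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
        if y == 1:
            c["tp" if p == 1 else "fn"] += 1
        else:
            c["fp" if p == 1 else "tn"] += 1
    return {
        g: {
            "tpr": _rate(c["tp"], c["tp"] + c["fn"]),
            "fpr": _rate(c["fp"], c["fp"] + c["tn"]),
        }
        for g, c in counts.items()
    }


def eod(y_true, y_pred, groups):
    """Larger of the max-min TPR gap and the max-min FPR gap across groups."""
    rates = group_rates(y_true, y_pred, groups)
    tprs = [r["tpr"] for r in rates.values()]
    fprs = [r["fpr"] for r in rates.values()]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


# Hypothetical toy data: 8 employees per age group.
groups = ["under_40"] * 8 + ["40_plus"] * 8
y_true = [1, 1, 1, 1, 0, 0, 0, 0] * 2
y_pred = [1, 1, 1, 0, 1, 0, 0, 0] + [1, 1, 0, 0, 0, 0, 0, 0]
gap = eod(y_true, y_pred, groups)
```

In this toy case the under-40 group has TPR 0.75 and FPR 0.25, while the 40+ group has TPR 0.50 and FPR 0.00, so the difference is 0.25. With only a handful of positives per group, a single changed prediction shifts the value substantially.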

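	Why age-only: a three-way bake-off (biased v1.0, all-attribute CorrelationRemover, age-only CorrelationRemover) showed that the all-attribute variant worsened gender fairness (EOD 0.033→0.137) and cost more accuracy, while age-only closed the ADEA gap with no gender regression and little accuracy cost on the sealed test (ROC-AUC 0.811 vs 0.812; PR-AUC 0.534 vs 0.588).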

	## 8. Ethical Considerations
	- Deployment decision (R-16) — RESOLVED: deploy the age-only bias-mitigated champion (v2.0), prioritizing the ADEA-protected age group (largest gap + greatest legal exposure). The biased v1.0 and the all-3-attribute variant are documented as considered-and-rejected alternatives.
	- Residual / monitoring targets: the marital-status spread persists, and gender balance must be watched to confirm the untargeted groups don't drift — both are named monitoring targets (Module 9).
	- Self-oversight / dogfooding: Hireloom is both provider and deployer; independent fairness review recommended.
	- Misuse risk: flags are decision-support, not verdicts; adverse use would be inappropriate and likely unlawful.

	## 9. Caveats & Recommendations
	- Synthetic data stand-in: revalidate on real Hireloom data before any deployment.
	- Threshold: set/document the production operating point by HR capacity; re-run the fairness audit AT that threshold per protected group.
	- Monitoring: watch drift, the gender EOD, and the marital spread; define retrain/retire triggers.
	- Legal/compliance (not legal advice): obtain review of ADEA, Title VII, NYC Local Law 144, EEOC guidance, EU AI Act (high-risk employment) and GDPR special-category obligations before real deployment.
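
The capacity-driven operating point recommended above can be sketched as a top-k cut: rank employees by score, flag as many as HR can contact, and record the lowest flagged score as the implied threshold for the per-group fairness re-audit. The function name, employee IDs, and scores below are hypothetical.

```python
def capacity_flags(scores, capacity):
    """Flag the `capacity` highest-scoring employees.

    Returns (flagged_ids, implied_threshold); the threshold is the lowest
    score among the flagged employees, or None if nobody is flagged.
    """
    if capacity <= 0 or not scores:
        return [], None
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    flagged = ranked[:capacity]
    return [emp_id for emp_id, _ in flagged], flagged[-1][1]


# Hypothetical monthly batch: employee ID -> predicted attrition probability.
batch_scores = {"E01": 0.82, "E02": 0.35, "E03": 0.67, "E04": 0.51, "E05": 0.12}
flagged_ids, implied_threshold = capacity_flags(batch_scores, capacity=2)
```

With a capacity of two, `E01` and `E03` are flagged and the implied threshold is 0.67. Because the threshold moves with each batch, the fairness audit would need to be repeated at the recorded value and not only at 0.50.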

	## Reference
	Mitchell, M., et al. (2019). Model Cards for Model Reporting. FAT* '19. arXiv:1810.03993.