---
tags:
- text-classification
- recruitment
- forensics
- security
license: mit
datasets:
- dcata004/recruiter-harvesting-dataset-v1
pipeline_tag: text-classification
---

# 🐍 V.I.P.E.R. Classification Engine (v1.0)

**Maintainer:** [Cata Risk Lab](https://huggingface.co/Cata-Risk-Lab)

## 🧠 Model Overview

This repository contains the configuration and architecture definitions for the **V.I.P.E.R.** recruitment auditing system. It defines the risk thresholds and vectorization parameters used to detect "Resume Harvesting" attacks.

## 🛠️ Configuration

The model operates on a `TfidfVectorizer` pipeline optimized for short-text classification of email subjects and bodies.

- **Risk Threshold:** 0.75 (confidence score required to flag a message as `harvesting`)
- **Labels:** `['harvesting', 'legitimate']`
- **Dataset:** Trained on forensic recruitment data (Swiss/US/UK); see `dcata004/recruiter-harvesting-dataset-v1`.
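
The threshold and labels above can be read as a simple decision rule. The following is a minimal sketch of that rule, not the engine's actual code: the function name `classify` is hypothetical, and treating a score exactly equal to 0.75 as a flag (`>=` rather than `>`) is an assumption, since the card does not say which comparison is used.

```python
from typing import Mapping

RISK_THRESHOLD = 0.75                  # value stated in this card
LABELS = ("harvesting", "legitimate")  # labels stated in this card


def classify(probabilities: Mapping[str, float],
             threshold: float = RISK_THRESHOLD) -> str:
    """Return 'harvesting' only when its probability reaches the threshold.

    Assumption: the comparison is inclusive (>=). Anything below the
    threshold falls back to 'legitimate'.
    """
    missing = [label for label in LABELS if label not in probabilities]
    if missing:
        raise ValueError(f"missing probabilities for: {missing}")
    if probabilities["harvesting"] >= threshold:
        return "harvesting"
    return "legitimate"


if __name__ == "__main__":
    print(classify({"harvesting": 0.91, "legitimate": 0.09}))
    print(classify({"harvesting": 0.60, "legitimate": 0.40}))
```

A threshold above 0.5 trades recall for precision: a message scored at 0.60 is more likely harvesting than not, yet it is still passed as `legitimate` under this rule.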

## ⚖️ Sovereign AI

V.I.P.E.R. is designed for local inference to protect user data privacy.
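
To illustrate what local inference can look like, here is a hedged sketch of a TF-IDF text classifier that is fitted and queried entirely in-process with scikit-learn, with no network calls. Several parts are assumptions, because this card does not specify them: the `LogisticRegression` classifier, the `ngram_range`, the helper name `harvesting_probability`, and the four toy messages, which were invented for this example and are not taken from the real dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy messages invented for this sketch; NOT from the real dataset.
texts = [
    "Urgent role, send your full CV and passport copy today",
    "Confidential vacancy, reply with resume and date of birth",
    "Interview confirmed for Tuesday at 10:00 with the hiring manager",
    "Thanks for applying, here is the job description and salary band",
]
labels = ["harvesting", "harvesting", "legitimate", "legitimate"]

# Assumption: unigram + bigram TF-IDF features feeding logistic regression.
pipeline = make_pipeline(
    TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
    LogisticRegression(),
)
pipeline.fit(texts, labels)


def harvesting_probability(message: str) -> float:
    """Return the predicted probability of the 'harvesting' class."""
    classes = list(pipeline.classes_)
    row = pipeline.predict_proba([message])[0]
    return float(row[classes.index("harvesting")])


if __name__ == "__main__":
    # Everything above runs on the local machine; no text leaves the process.
    print(harvesting_probability("Please send your CV and passport copy"))
```

Looking up the column by class name rather than by position avoids depending on the order in which scikit-learn stores `classes_`.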