Fake Job Prediction Model (RecruitGuardian)
This model detects fraudulent job postings from their title, description, company profile, and requirements. It is a scikit-learn pipeline that combines TF-IDF vectorization with a Random Forest classifier.
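For readers who want to see what such a pipeline looks like, here is a minimal sketch that uses the TF-IDF settings listed under Model Details. It is not the original training script: the step names (`tfidf`, `clf`), the `random_state`, and the four toy postings are assumptions made purely for illustration, and the Random Forest is left at scikit-learn's defaults because this card does not state its hyperparameters.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

# Toy data for illustration only (1 = fraudulent, 0 = legitimate).
texts = [
    "urgent hiring data entry high salary no experience required",
    "earn money fast from home send bank details to apply",
    "senior backend engineer with five years of python experience",
    "registered nurse for city hospital night shift license required",
]
labels = [1, 1, 0, 0]

# Step names and random_state are assumptions, not taken from the model card.
sketch_pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), max_features=5000)),
    ("clf", RandomForestClassifier(random_state=0)),
])
sketch_pipeline.fit(texts, labels)

# Column 1 of predict_proba is the probability of the fraud class (label 1).
fraud_probability = sketch_pipeline.predict_proba(["urgent hiring high salary"])[0][1]
print(f"Fraud probability: {fraud_probability:.2f}")
```

Keeping the vectorizer and the classifier in one `Pipeline` means raw text goes in and a prediction comes out, which is why the released `.pkl` file can be used without a separate vectorizer.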
Model Details
- Model Type: Random Forest Classifier
- Feature Extraction: TF-IDF Vectorization (ngram_range=(1, 2), max_features=5000)
- Dataset: Real or Fake: Fake Job Postings Prediction
- Accuracy: 97.87%
- Precision (Fraud): 100%
- Recall (Fraud): 56%
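The three figures above measure different things. A precision of 100% means every posting flagged as fraudulent really was fraudulent, while a recall of 56% means roughly 44% of fraudulent postings were not flagged. When fraud is rare, accuracy can remain high in that situation. The sketch below shows the arithmetic. The confusion-matrix counts are hypothetical, picked only so the results line up with the reported figures; they are not the model's actual evaluation counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) for the positive (fraud) class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts for illustration, NOT the real confusion matrix:
# 97 frauds caught, 76 frauds missed, 0 false alarms, 3403 legitimate postings.
accuracy, precision, recall = classification_metrics(tp=97, fp=0, fn=76, tn=3403)
print(f"Accuracy: {accuracy:.2%}  Precision: {precision:.0%}  Recall: {recall:.0%}")
```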
Usage
import re

import joblib

# Load the serialized pipeline (TF-IDF vectorizer + Random Forest)
pipeline = joblib.load('fake_job_model_pipeline.pkl')


def clean_text(text):
    """Lowercase the text and drop everything except letters, digits, and whitespace."""
    text = text.lower()
    text = re.sub(r'[^a-zA-Z0-9\s]', '', text)
    return text.strip()


# Predict
sample_text = "Urgent Hiring for Data Entry. High salary, no experience required."
cleaned = clean_text(sample_text)
prediction = pipeline.predict([cleaned])[0]
# Column 1 of predict_proba is the probability of the fraud class
probability = pipeline.predict_proba([cleaned])[0][1]
print(f"Is Fraudulent: {bool(prediction)} (Fraud probability: {probability:.4f})")
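For a binary Random Forest, `predict` picks the class with the higher averaged probability, which amounts to a 0.5 cutoff. Given the reported recall of 56%, a job seeker may prefer to flag postings at a lower fraud probability and accept more false alarms. The helper below is one possible sketch; the function name `is_suspicious` and the default threshold of 0.3 are illustrative assumptions, not values validated for this model.

```python
def is_suspicious(pipeline, text, threshold=0.3):
    """Flag a posting when its fraud-class probability reaches `threshold`.

    `threshold=0.3` is an arbitrary illustrative default, not a tuned value.
    Returns (flagged, fraud_probability).
    """
    fraud_probability = float(pipeline.predict_proba([text])[0][1])
    return bool(fraud_probability >= threshold), fraud_probability


# Example, reusing `pipeline` and `cleaned` from the Usage snippet:
# flagged, fraud_probability = is_suspicious(pipeline, cleaned)
# print(f"Suspicious: {flagged} (Fraud probability: {fraud_probability:.4f})")
```

Lowering the threshold trades precision for recall, so any chosen value is best checked against held-out labelled postings before relying on it.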
Intended Use
This model is intended for educational purposes and as a tool to help job seekers identify potentially suspicious job postings. It should not be used as the sole basis for recruitment or legal decisions.