Fake Job Prediction Model (RecruitGuardian)

This model detects fraudulent job postings from their title, description, company profile, and requirements. It uses a scikit-learn pipeline that combines TF-IDF vectorization with a Random Forest classifier.
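
As a rough illustration of what such a pipeline could look like, the sketch below wires a TF-IDF vectorizer into a Random Forest using the two vectorizer settings listed under Model Details. The step names, `n_estimators`, `random_state`, the variable names, and the toy training texts are all assumptions made for this example; the actual training code and hyperparameters are not given in this card.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

# Hypothetical reconstruction: only ngram_range and max_features come from
# this model card; step names and classifier settings are assumptions.
sketch_pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), max_features=5000)),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=42)),
])

# Made-up toy examples, used only to show the fit/predict flow (1 = fraudulent).
texts = [
    "earn money fast from home no experience required",
    "senior backend engineer with five years of python experience",
    "urgent hiring wire a registration fee to start today",
    "registered nurse for a hospital night shift license required",
]
labels = [1, 0, 1, 0]

sketch_pipeline.fit(texts, labels)

# Column 1 of predict_proba corresponds to class label 1 (fraudulent here).
fraud_probability = sketch_pipeline.predict_proba(
    ["urgent hiring no experience required"]
)[0][1]
print(f"Fraud probability: {fraud_probability:.2f}")
```

Bundling the vectorizer and classifier in one `Pipeline` means a single saved object can take raw (cleaned) text at prediction time, which matches how the Usage snippet calls `pipeline.predict` directly on a string.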

Model Details

Model Type: Random Forest Classifier
Feature Extraction: TF-IDF Vectorization (ngram_range=(1, 2), max_features=5000)
Dataset: Real or Fake: Fake Job Postings Prediction
Accuracy: 97.87%
Precision (Fraud): 100%
Recall (Fraud): 56%
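
To make the relationship between these three figures concrete, the sketch below computes them from a confusion matrix. The counts are illustrative values chosen to be consistent with the reported percentages; they are not the model's actual test-set counts, which this card does not publish.

```python
# Illustrative confusion-matrix counts, NOT the model's real test results.
true_positives = 56     # fraudulent postings flagged as fraudulent
false_negatives = 44    # fraudulent postings that were missed
false_positives = 0     # legitimate postings wrongly flagged
true_negatives = 1966   # legitimate postings correctly passed

total = true_positives + false_negatives + false_positives + true_negatives

accuracy = (true_positives + true_negatives) / total
precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)
fraud_share = (true_positives + false_negatives) / total

print(f"Accuracy:    {accuracy:.2%}")
print(f"Precision:   {precision:.2%}")
print(f"Recall:      {recall:.2%}")
print(f"Fraud share: {fraud_share:.2%}")
```

With counts like these, accuracy stays high even though 44 of 100 fraudulent postings are missed, because the legitimate class dominates the total. Recall on the fraud class is therefore the more informative number for this task.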

Usage

import joblib
import re

# Load the model (joblib uses pickle, so only load files from a source you trust)
pipeline = joblib.load('fake_job_model_pipeline.pkl')

def clean_text(text):
    # Lowercase, then drop everything except letters, digits, and whitespace
    text = text.lower()
    text = re.sub(r'[^a-zA-Z0-9\s]', '', text)
    return text.strip()

# Predict
sample_text = "Urgent Hiring for Data Entry. High salary, no experience required."
cleaned = clean_text(sample_text)
prediction = pipeline.predict([cleaned])[0]
# Probability of the fraudulent class (assumes class label 1 means fraudulent)
probability = pipeline.predict_proba([cleaned])[0][1]

print(f"Is Fraudulent: {bool(prediction)} (Fraud probability: {probability:.4f})")
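
The snippet above scores a single string, while the model is described as working on a posting's title, description, company profile, and requirements. One possible way to turn a structured posting into that single string is sketched below. The helper `build_input`, the dictionary keys, and the choice to join fields with a space are assumptions for illustration; the card does not state how the fields were combined during training, so results may differ if the original preprocessing was different.

```python
import re

def clean_text(text):
    # Same cleaning as in the usage snippet above
    text = text.lower()
    text = re.sub(r'[^a-zA-Z0-9\s]', '', text)
    return text.strip()

def build_input(posting):
    """Join a posting's text fields into one cleaned string.

    Hypothetical helper: field names and join order are assumptions.
    Missing or empty fields are skipped.
    """
    fields = ("title", "description", "company_profile", "requirements")
    parts = [str(posting.get(name) or "") for name in fields]
    return clean_text(" ".join(part for part in parts if part))

example_posting = {
    "title": "Data Entry Clerk!",
    "description": "Work from home, $500/day.",
    "company_profile": None,
    "requirements": "No experience required.",
}

model_input = build_input(example_posting)
print(model_input)
```

The resulting string can then be passed to `pipeline.predict([model_input])` in the same way as `cleaned` in the usage snippet.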

Intended Use

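This model is intended for educational purposes and as a tool to help job seekers identify potentially suspicious job postings. Because its recall on fraudulent postings is 56%, it misses roughly 44% of them, so a "not fraudulent" result is not evidence that a posting is legitimate. It should not be used as the sole basis for recruitment or legal decisions.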
