Model Card for Random Forest Job Candidate Classifier

Model Summary

A Random Forest-based model for multi-label classification of job candidates, focusing on hiring decisions and salary predictions. It reports 0.92 accuracy and a 0.89 macro F1-score, and it handles class imbalance with SMOTE oversampling and automatically computed class weights.

Model Details

Model Description

This model uses a Random Forest classifier for multi-label classification in the hiring domain. It takes structured and textual candidate data as input and predicts hiring outcomes and salary ranges.

  • Developed by: Ryan Smith
  • Model type: Random Forest Classifier
  • Language(s): English
  • License: Apache 2.0
  • Training Framework: Scikit-learn, Imbalanced-learn

Uses

Direct Use

  • Screening job candidates
  • Automated initial assessment of job applications
  • Salary range prediction
  • Interview recommendation systems

Out-of-Scope Use

  • Final hiring decisions without human oversight
  • Automated rejection of candidates
  • Demographic analysis or profiling
  • Use in regions whose labor laws or hiring practices differ from those represented in the training data

Bias, Risks, and Limitations

  • Performance may reflect historical biases present in the hiring data.
  • Predictions are influenced by the quality and balance of the training data.
  • Should not be used as the sole decision-maker in hiring processes.
  • Limited interpretability compared to simpler models like logistic regression.

Training Details

Training Data

  • Source: Proprietary dataset containing structured and textual candidate data.
  • Features: Skills, experience, academic performance, projects, extra activities, prior offers, etc.
  • Data balancing: Applied SMOTE to handle class imbalance.

Training Procedure

Training Hyperparameters:

  • Number of Trees: 100
  • Max Depth: None (trees grow until every leaf is pure or holds fewer than 2 samples, the default split threshold)
  • Class Weights: Computed automatically to balance classes
  • Evaluation Metrics: Accuracy, Precision, Recall, F1-Score
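The hyperparameters above map onto scikit-learn's RandomForestClassifier roughly as in the sketch below. The synthetic dataset, the fixed random seed, and the choice of class_weight="balanced" as the way weights are "computed automatically" are assumptions made for illustration; the card does not state the exact settings used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the proprietary data: three imbalanced classes.
X, y = make_classification(
    n_samples=300,
    n_features=10,
    n_informative=5,
    n_classes=3,
    weights=[0.7, 0.2, 0.1],
    random_state=0,
)

clf = RandomForestClassifier(
    n_estimators=100,         # Number of Trees: 100
    max_depth=None,           # no depth limit
    class_weight="balanced",  # assumption: weights inversely proportional to class frequency
    random_state=0,           # assumption: fixed seed so the sketch is reproducible
)
clf.fit(X, y)

print(len(clf.estimators_))  # number of fitted trees
```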

Preprocessing Steps:

  1. Textual data vectorized using CountVectorizer.
  2. Synthetic samples generated using SMOTE to balance minority classes.
  3. Dataset split into training and validation sets.

Evaluation

Metrics

Overall Performance:

  • Accuracy: 0.92
  • Precision: 0.90
  • Recall: 0.88
  • Macro F1-Score: 0.89

Per-Class F1-Scores:

  • No: 0.95
  • Interview: 0.85
  • Yes: 0.87
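The reported Macro F1-Score is consistent with the per-class scores above, since macro averaging is the unweighted mean of the per-class F1 values. A quick check, using only the numbers from this card:

```python
from statistics import mean

# Per-class F1-scores as reported in this card.
per_class_f1 = {"No": 0.95, "Interview": 0.85, "Yes": 0.87}

# Macro F1 gives every class equal weight, regardless of class size.
macro_f1 = mean(list(per_class_f1.values()))
print(round(macro_f1, 2))  # 0.89
```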

Results Summary

The model performs well across all three prediction classes. It is strongest on the No class (F1 0.95), which corresponds to clear rejections. The Interview class scores lowest (F1 0.85), indicating room for improvement in interview recommendations.

Technical Specifications

Hardware Requirements

  • Suitable for running on standard CPU infrastructure
  • Minimum 8GB RAM recommended

Software

  • Scikit-learn
  • Imbalanced-learn
  • Python 3.6+

Model Card Authors

Ryan Smith

Model Card Contact

Ryan Smith (ryan721s-code@yahoo.com)

