Model Card for Random Forest Job Candidate Classifier

Model Summary

A Random Forest-based model for multi-label classification of job candidates, focusing on hiring decisions and salary predictions. It reaches 0.92 accuracy and a macro F1-score of 0.89 (see Evaluation), and it uses SMOTE oversampling and class weights to handle imbalanced classes.

Model Details

Model Description

This model leverages a Random Forest Classifier for multi-label classification in the hiring domain. It analyzes structured and textual data about candidates to make predictions about hiring potential and salary considerations.

Developed by: Ryan Smith
Model type: Random Forest Classifier
Language(s): English
License: Apache 2.0
Training Framework: Scikit-learn, Imbalanced-learn

Uses

Direct Use

Screening job candidates
Automated initial assessment of job applications
Salary range prediction
Interview recommendation systems

Out-of-Scope Use

Final hiring decisions without human oversight
Automated rejection of candidates
Demographic analysis or profiling
Use in regions whose labor laws or hiring practices differ from those reflected in the training data

Bias, Risks, and Limitations

Predictions may reflect historical biases present in the hiring data.
Predictions are influenced by the quality and balance of the training data.
Should not be used as the sole decision-maker in hiring processes.
Limited interpretability compared to simpler models like logistic regression.

Training Details

Training Data

Source: Proprietary dataset containing structured and textual candidate data.
Features: Skills, experience, academic performance, projects, extra activities, prior offers, etc.
Data balancing: Applied SMOTE to handle class imbalance.

Training Procedure

Training Hyperparameters:

Number of Trees: 100
Max Depth: None (nodes are expanded until all leaves are pure or contain fewer than min_samples_split samples; the scikit-learn default is 2)
Class Weights: Computed automatically to balance classes
Evaluation Metrics: Accuracy, Precision, Recall, F1-Score
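
The hyperparameters listed above map onto scikit-learn roughly as in the following minimal sketch. It is not the author's training script: class_weight="balanced" is an assumption about what "computed automatically" means, and the synthetic data from make_classification and the random_state values are hypothetical stand-ins for the proprietary dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical imbalanced 3-class data standing in for the real features.
X, y = make_classification(
    n_samples=200,
    n_features=10,
    n_informative=5,
    n_classes=3,
    weights=[0.7, 0.15, 0.15],
    random_state=0,
)

clf = RandomForestClassifier(
    n_estimators=100,         # "Number of Trees: 100"
    max_depth=None,           # grow until leaves are pure or too small to split
    class_weight="balanced",  # assumption: weights inversely proportional to class frequency
    random_state=0,           # hypothetical seed, for reproducibility only
)
clf.fit(X, y)
```

With max_depth=None the trees can grow deep, so memory use scales with the size of the training set.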

Preprocessing Steps:

Textual data vectorized using CountVectorizer.
Synthetic samples generated using SMOTE to balance minority classes.
Dataset split into training and validation sets.

Evaluation

Metrics

Overall Performance:

Accuracy: 0.92
Precision: 0.90
Recall: 0.88
Macro F1-Score: 0.89
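
For readers who want to reproduce this kind of summary, the sketch below computes the same four metrics with scikit-learn on a handful of made-up labels. The y_true and y_pred values are hypothetical, and macro averaging for precision and recall is an assumption, since the card only states the averaging used for F1.

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Hypothetical labels for illustration only; not the model's real predictions.
y_true = ["No", "No", "No", "Interview", "Interview", "Yes"]
y_pred = ["No", "No", "Interview", "Interview", "Yes", "Yes"]

accuracy = accuracy_score(y_true, y_pred)
# Assumption: macro averaging, i.e. an unweighted mean over the three classes.
precision = precision_score(y_true, y_pred, average="macro")
recall = recall_score(y_true, y_pred, average="macro")
macro_f1 = f1_score(y_true, y_pred, average="macro")

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} macro_f1={macro_f1:.2f}")
```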

Per-Class F1-Scores:

No: 0.95
Interview: 0.85
Yes: 0.87
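
The macro F1-score reported above is the unweighted mean of the per-class F1-scores, which can be checked directly from the numbers in this card. The variable names in this short sketch are illustrative.

```python
# Per-class F1-scores as reported in this model card.
per_class_f1 = {"No": 0.95, "Interview": 0.85, "Yes": 0.87}

# Macro averaging weights every class equally, regardless of class size.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(round(macro_f1, 2))  # 0.89, matching the reported Macro F1-Score
```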

Results Summary

The model performs well across all three prediction classes, with its strongest results on clear rejections (F1 of 0.95 for "No"). The "Interview" class has the lowest score (F1 of 0.85, against 0.87 for "Yes"), which indicates room for improvement on interview recommendations.

Technical Specifications

Hardware Requirements

Suitable for running on standard CPU infrastructure
Minimum 8GB RAM recommended

Software

Scikit-learn
Imbalanced-learn
Python 3.6+

Model Card Authors

Ryan Smith

Model Card Contact

Ryan Smith (ryan721s-code@yahoo.com)

