Model Card for Random Forest Job Candidate Classifier
Model Summary
A Random Forest-based model for multi-label classification of job candidates, focusing on hiring decisions and salary predictions. In the evaluation reported below it reaches 0.92 accuracy and a macro F1-score of 0.89. Class imbalance is handled with SMOTE oversampling and balanced class weights.
Model Details
Model Description
This model leverages a Random Forest Classifier for multi-label classification in the hiring domain. It analyzes structured and textual data about candidates to make predictions about hiring potential and salary considerations.
- Developed by: Ryan Smith
- Model type: Random Forest Classifier
- Language(s): English
- License: Apache 2.0
- Training Framework: Scikit-learn, Imbalanced-learn
Uses
Direct Use
- Screening job candidates
- Automated initial assessment of job applications
- Salary range prediction
- Interview recommendation systems
Out-of-Scope Use
- Final hiring decisions without human oversight
- Automated rejection of candidates
- Demographic analysis or profiling
- Use in regions whose labor laws or hiring practices differ from those represented in the training data
Bias, Risks, and Limitations
- Performance may reflect historical biases present in the hiring data.
- Predictions are influenced by the quality and balance of the training data.
- Should not be used as the sole decision-maker in hiring processes.
- Limited interpretability compared to simpler models like logistic regression.
Training Details
Training Data
- Source: Proprietary dataset containing structured and textual candidate data.
- Features: Skills, experience, academic performance, projects, extra activities, prior offers, etc.
- Data balancing: Applied SMOTE to handle class imbalance.
Training Procedure
Training Hyperparameters:
- Number of Trees: 100
- Max Depth: None (nodes are expanded until all leaves are pure or contain fewer than 2 samples, the scikit-learn default for min_samples_split)
- Class Weights: Computed automatically to balance classes
- Evaluation Metrics: Accuracy, Precision, Recall, F1-Score
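The hyperparameters above map onto scikit-learn's `RandomForestClassifier` roughly as follows. This is a minimal sketch, not the author's training script: `class_weight="balanced"` is an assumed reading of "computed automatically", and `random_state=42` is added here only for reproducibility.

```python
from sklearn.ensemble import RandomForestClassifier

# Hyperparameters as listed in this card; everything else is left at
# scikit-learn defaults.
clf = RandomForestClassifier(
    n_estimators=100,         # Number of Trees: 100
    max_depth=None,           # grow until leaves are pure or too small to split
    class_weight="balanced",  # assumption: "computed automatically" means this option
    random_state=42,          # assumption: fixed seed, not stated in the card
)
```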
Preprocessing Steps:
- Textual data vectorized using CountVectorizer.
- Synthetic samples generated using SMOTE to balance minority classes.
- Dataset split into training and validation sets.
Evaluation
Metrics
Overall Performance:
- Accuracy: 0.92
- Precision: 0.90
- Recall: 0.88
- Macro F1-Score: 0.89
Per-Class F1-Scores:
- No: 0.95
- Interview: 0.85
- Yes: 0.87
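As a quick consistency check, the macro F1-score is the unweighted mean of the per-class F1-scores, so the figures above can be verified with a few lines of arithmetic. The dictionary name `per_class_f1` is illustrative only.

```python
from statistics import mean

# Per-class F1-scores as reported in this card.
per_class_f1 = {"No": 0.95, "Interview": 0.85, "Yes": 0.87}

# Macro averaging weights every class equally, regardless of class size.
macro_f1 = mean(list(per_class_f1.values()))
print(f"Macro F1: {macro_f1:.2f}")
```

The result rounds to 0.89, which matches the reported Macro F1-Score.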
Results Summary
The model performs well across all three prediction classes. It is strongest on clear rejections (F1-score of 0.95 for "No") and weakest on interview recommendations (F1-score of 0.85 for "Interview"), which leaves the most room for improvement in that class.
Technical Specifications
Hardware Requirements
- Suitable for running on standard CPU infrastructure
- Minimum 8GB RAM recommended
Software
- Scikit-learn
- Imbalanced-learn
- Python 3.6+
Model Card Authors
Ryan Smith
Model Card Contact
Ryan Smith ryan721s-code@yahoo.com