Resume Job Fit Classifier

A cross-encoder model for predicting whether a resume is a fit for a job description.

Model Description

Fine-tuned BAAI/bge-m3 as a cross-encoder classifier on resume and job description pairs.

The model takes a resume and a job description as input and predicts one of three classes: Good Fit, No Fit, or Potential Fit.

Dataset

Trained on cnamuangtoun/resume-job-description-fit.

  • Train: 5,059 pairs (90% of original train split, stratified)
  • Eval: 625 pairs (10% of original train split, stratified)
  • Test: 1,759 pairs (original test split)

Label distribution: No Fit = 50%, Good Fit = 25%, Potential Fit = 25%
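The 90/10 stratified split described above can be sketched with scikit-learn's train_test_split. The counts and seed here are illustrative, not the actual ones used for training; stratification simply preserves the class ratios in both partitions.

```python
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the dataset labels:
# 50% "No Fit", 25% "Good Fit", 25% "Potential Fit" (as in the card).
labels = ["No Fit"] * 500 + ["Good Fit"] * 250 + ["Potential Fit"] * 250
indices = list(range(len(labels)))

# 90/10 stratified split; random_state is an arbitrary illustrative seed.
train_idx, eval_idx = train_test_split(
    indices, test_size=0.10, stratify=labels, random_state=42
)

eval_labels = [labels[i] for i in eval_idx]
print(len(train_idx), len(eval_idx))   # 900 100
print(eval_labels.count("No Fit"))     # 50 — class ratio preserved in eval
```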

Training Details

  • Base model: BAAI/bge-m3 (570M params, supports up to 8192 tokens)
  • Max sequence length: 8192 tokens (resume truncated to 4096 tokens, job description to 4000, leaving headroom for special tokens)
  • Optimizer: AdamW with layer-wise learning rates — bottom layers get LR/10, top layers get full LR, classifier head gets full LR
  • Learning rate: 8e-6 with cosine scheduler and 15% warmup
  • Batch size: 1 per device, gradient accumulation steps: 32 (effective batch: 32)
  • Epochs: 40 max, early stopping with patience 6 (best checkpoint at step 3200, epoch ~21)
  • Loss: weighted CrossEntropyLoss to handle class imbalance
  • WeightedRandomSampler to oversample minority classes during training
  • Label smoothing: 0.1
  • Dropout: 0.3 classifier, 0.15 hidden layers
  • Hardware: NVIDIA RTX 4090 (24GB VRAM)
  • Training time: ~3 hours
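The layer-wise learning rates above can be sketched with AdamW parameter groups. The toy model and the exact grouping rule (bottom half of the encoder at LR/10, top half and classifier at the full LR) are illustrative assumptions; the real model is BAAI/bge-m3 with many more layers.

```python
import torch
from torch import nn

LR = 8e-6  # base learning rate from the training config

# Toy stand-in for the encoder: a few "layers" plus a classifier head.
class ToyCrossEncoder(nn.Module):
    def __init__(self, n_layers=4, dim=32, n_classes=3):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_layers))
        self.classifier = nn.Linear(dim, n_classes)

model = ToyCrossEncoder()
n = len(model.layers)

param_groups = [
    # Bottom layers: reduced LR to preserve pretrained features.
    {"params": [p for l in model.layers[: n // 2] for p in l.parameters()], "lr": LR / 10},
    # Top layers: full LR.
    {"params": [p for l in model.layers[n // 2 :] for p in l.parameters()], "lr": LR},
    # Classifier head (randomly initialized): full LR.
    {"params": model.classifier.parameters(), "lr": LR},
]
optimizer = torch.optim.AdamW(param_groups, weight_decay=0.01)
print([g["lr"] for g in optimizer.param_groups])
```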
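The weighted loss and oversampling setup can be sketched as below. The per-class counts are derived from the stated 50/25/25 distribution over 5,059 training pairs and are illustrative; inverse-frequency weighting is one common choice, and the card does not specify the exact weighting formula used.

```python
import torch
from torch import nn
from torch.utils.data import WeightedRandomSampler

# Class order matches the model's id2label: 0=Good Fit, 1=No Fit, 2=Potential Fit.
# Counts are illustrative, from the stated 25/50/25 split of 5,059 pairs.
class_counts = torch.tensor([1265.0, 2530.0, 1264.0])

# Inverse-frequency class weights for the weighted CrossEntropyLoss.
class_weights = class_counts.sum() / (len(class_counts) * class_counts)
loss_fn = nn.CrossEntropyLoss(weight=class_weights, label_smoothing=0.1)

# Per-sample weights so minority classes are drawn more often each epoch.
labels = torch.cat([
    torch.zeros(1265, dtype=torch.long),
    torch.ones(2530, dtype=torch.long),
    torch.full((1264,), 2, dtype=torch.long),
])
sample_weights = class_weights[labels]
sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)

# The weighted loss penalizes mistakes on minority classes more heavily.
logits = torch.randn(4, 3)
loss = loss_fn(logits, torch.tensor([0, 1, 2, 1]))
print(loss.item())
```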

Results

Metric              Eval     Test
Accuracy            78.08%   54.58%
Macro F1            76.54%   51.91%
F1 (Good Fit)       76.31%   44.32%
F1 (No Fit)         82.35%   66.05%
F1 (Potential Fit)  70.95%   45.35%

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import numpy as np

# Load the fine-tuned cross-encoder and its tokenizer from the Hub.
model = AutoModelForSequenceClassification.from_pretrained("med2425/bge-resume-fit")
tokenizer = AutoTokenizer.from_pretrained("med2425/bge-resume-fit")

model.eval()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

resume = """
John Smith
Senior ML Engineer with 6 years experience.
Skills: Python, PyTorch, TensorFlow, NLP, AWS, Docker.
Built NLP pipelines processing 10M documents/day at TechCorp (2020-Present).
Fine-tuned BERT models achieving 94% accuracy on document classification.
B.Sc. Computer Science, State University 2018.
"""

jd = """
Senior Machine Learning Engineer
Requirements: 5+ years ML experience, strong Python,
PyTorch or TensorFlow, NLP experience, production deployment on AWS/GCP/Azure,
Bachelor in Computer Science or related field.
"""

# Encode the (resume, job description) pair as a single cross-encoder input.
inputs = tokenizer(resume, jd, return_tensors="pt", truncation=True, max_length=8192).to(device)

# Single forward pass; softmax over the three classes.
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1).squeeze().tolist()

id2label = {0: "Good Fit", 1: "No Fit", 2: "Potential Fit"}
for i, p in enumerate(probs):
    print(f"{id2label[i]}: {p:.2%}")
print(f"Prediction: {id2label[np.argmax(probs)]}")

Note: Use full-length realistic resumes and job descriptions for best results. The model was trained on resumes averaging 700 words and JDs averaging 400 words. Very short inputs may produce unreliable predictions.
