# Talent-Match-AI: Resume and Job Description Matching
## 📌 Overview
This repository hosts the quantized version of the **BERT-base-uncased** model for **Resume and Job Description Matching**. The model is designed to determine whether a resume aligns well with a given job description. If they are a strong match, the model outputs "Good Fit" with a confidence score; otherwise, it categorizes them as "Potential Fit" or "Not a Good Fit." The model has been optimized for efficient deployment while maintaining reasonable accuracy, making it suitable for real-time applications.
## 🏰 Model Details
- **Model Architecture:** BERT-base-uncased
- **Task:** Resume and Job Description Matching
- **Dataset:** `facehuggerapoorv/resume-jd-match`
- **Quantization:** Float16 (FP16) for optimized inference
- **Fine-tuning Framework:** Hugging Face Transformers
## 🚀 Usage
### Installation
```bash
pip install transformers torch
```
### Loading the Model
```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
model_name = "AventIQ-AI/bert-talentmatchai"
model = BertForSequenceClassification.from_pretrained(model_name).to(device)
tokenizer = BertTokenizer.from_pretrained(model_name)
```
### Resume Matching Inference
```python
import torch
# Set device (use GPU if available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
# Define label mapping (the three categories described in the Overview)
label_mapping = {0: "Not a Good Fit", 1: "Potential Fit", 2: "Good Fit"}
# Sample resume text for testing
test_resume = ["I have worked in different industries and have a lot of experience. I am a hard worker and can learn anything."]
# Tokenize test data and move it to the same device as the model
test_tokens = tokenizer(test_resume, padding="max_length", truncation=True, return_tensors="pt").to(device)
# Make predictions
with torch.no_grad():  # Disable gradient computation for inference
    output = model(**test_tokens)
# Get predicted label (index of the highest logit)
predicted_label = output.logits.argmax(dim=1).item()
# Print result
print(f"Predicted Category: {predicted_label} ({label_mapping[predicted_label]})")
```
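The Overview mentions a confidence score, but the inference example prints only the predicted label. A minimal sketch of one way to derive a confidence value is to apply a softmax to the logits and report the probability of the top class. The helper name `label_with_confidence` and the sample logits are hypothetical, and the three-class mapping is assumed from the inference example; the sketch uses plain Python so it can be read without loading the model.

```python
import math

# Three-class mapping assumed from the inference example in this README.
LABELS = {0: "Not a Good Fit", 1: "Potential Fit", 2: "Good Fit"}

def label_with_confidence(logits):
    """Return (label, confidence) for one example's raw logits.

    Hypothetical helper: confidence is the softmax probability of the top class.
    """
    # Numerically stable softmax: subtract the largest logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the most probable class.
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Made-up logits for illustration only.
label, confidence = label_with_confidence([0.2, 1.1, 3.0])
print(f"{label} ({confidence:.1%})")
```

With the real model, the input would be something like `output.logits[0].tolist()`. Softmax probabilities from a fine-tuned classifier are not necessarily well calibrated, so the value is best read as a relative indicator.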
## 📊 Quantized Model Evaluation Results
### 🔥 Evaluation Metrics 🔥
- ✅ **Accuracy:** 0.9224
- ✅ **Precision:** 0.9212
- ✅ **Recall:** 0.8450
- ✅ **F1-score:** 0.7718
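The README does not say how precision, recall, and F1 were averaged across the classes. As an illustration only, the sketch below computes macro-averaged scores on made-up labels; the function name `macro_scores` and the toy data are assumptions, not the evaluation code behind the numbers above. It also shows that a macro-averaged F1 is the mean of per-class F1 scores, so it can fall below both the macro precision and the macro recall, which may be one explanation for the pattern in the reported metrics.

```python
def macro_scores(y_true, y_pred, num_classes):
    """Return macro-averaged (precision, recall, f1) for integer class labels."""
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in pairs)  # true positives for class c
        fp = sum(t != c and p == c for t, p in pairs)  # false positives for class c
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives for class c
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = num_classes
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Toy labels for illustration only (0, 1, 2 as in the label mapping).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = macro_scores(y_true, y_pred, num_classes=3)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```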
## ⚡ Quantization Details
Post-training quantization was applied using PyTorch's built-in quantization framework. The model weights were converted to Float16 (FP16), which reduces model size and improves inference efficiency at the cost of a possible slight drop in accuracy.
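To make the size and precision trade-off concrete, here is a small sketch that stores a few made-up weight values as 32-bit and 16-bit floats using Python's standard `struct` module. It is not the conversion procedure used for this model, which the README does not spell out; in PyTorch, a common way to cast a model's weights to FP16 is `model.half()`. The helper names and sample values are hypothetical.

```python
import struct

def to_fp32_bytes(values):
    """Pack floats as IEEE 754 single precision (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)

def to_fp16_bytes(values):
    """Pack floats as IEEE 754 half precision (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)

def from_fp16_bytes(blob):
    """Unpack half-precision bytes back into Python floats."""
    return list(struct.unpack(f"<{len(blob) // 2}e", blob))

# Made-up weight values for illustration only.
weights = [0.12345678, -1.5, 3.14159265, 0.0001]

fp32_blob = to_fp32_bytes(weights)
fp16_blob = to_fp16_bytes(weights)
restored = from_fp16_bytes(fp16_blob)

# FP16 storage takes half the bytes of FP32 storage.
print(f"FP32 bytes: {len(fp32_blob)}, FP16 bytes: {len(fp16_blob)}")

# Rounding error introduced by the narrower format.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"Largest rounding error: {max_error:.6f}")
```

Values that FP16 can represent exactly, such as -1.5, survive the round trip unchanged; the others are rounded to the nearest representable half-precision value.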
## 💽 Repository Structure
```
.
├── model/              # Contains the quantized model files
├── tokenizer_config/   # Tokenizer configuration and vocabulary files
├── model.safetensors   # Quantized model weights
└── README.md           # Model documentation
```
## ⚠️ Limitations
- The model may struggle with resumes and job descriptions that use non-standard terminology.
- Quantization may lead to slight degradation in accuracy compared to full-precision models.
- Performance may vary across different industries and job levels.
## 🤝 Contributing
Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.