# Talent-Match-AI: Resume and Job Description Matching
## 📌 Overview
This repository hosts a quantized version of the **BERT-base-uncased** model fine-tuned for **Resume and Job Description Matching**. The model predicts how well a resume aligns with a given job description, classifying each pair as "Good Fit", "Potential Fit", or "Not a Good Fit" together with a confidence score. It has been optimized for efficient deployment while maintaining reasonable accuracy, making it suitable for real-time applications.
## 📰 Model Details
- **Model Architecture:** BERT-base-uncased
- **Task:** Resume and Job Description Matching
- **Dataset:** `facehuggerapoorv/resume-jd-match`
- **Quantization:** Float16 (FP16) for optimized inference
- **Fine-tuning Framework:** Hugging Face Transformers
## 🚀 Usage
### Installation
```bash
pip install transformers torch
```
### Loading the Model
```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "AventIQ-AI/bert-talentmatchai"
model = BertForSequenceClassification.from_pretrained(model_name).to(device)
tokenizer = BertTokenizer.from_pretrained(model_name)
```
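After loading, it is worth switching the model to evaluation mode. The CPU cast below is a precaution, not part of the official loading code: it assumes the published checkpoint stores FP16 weights, which a few CPU kernels do not support.

```python
model.eval()  # disable dropout for deterministic inference

# Assumption: the checkpoint stores FP16 weights. Some FP16 ops are
# unsupported or slow on CPU, so fall back to full precision there.
if device == "cpu":
    model = model.float()
```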
### Resume Matching Inference
```python
import torch

# Set device (use GPU if available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Define label mapping
label_mapping = {0: "Not a Good Fit", 1: "Potential Fit", 2: "Good Fit"}

# Sample resume text for testing
test_resume = ["I have worked in different industries and have a lot of experience. I am a hard worker and can learn anything."]

# Tokenize test data and move it to the same device as the model
test_tokens = tokenizer(test_resume, padding="max_length", truncation=True, return_tensors="pt").to(device)

# Make predictions
with torch.no_grad():  # Disable gradient computation for inference
    output = model(**test_tokens)

# Get predicted label
predicted_label = output.logits.argmax(dim=1).item()

# Print result
| print(f"Predicted Category: {predicted_label} ({label_mapping[predicted_label]})") | |
| label_mapping = {0: "No Fit", 1: "Low Fit", 2: "Potential Fit", 3: "Good Fit"} | |
| print(f"Predicted Category: {label_mapping[predictions]}") | |
| ``` | |
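The example above scores a resume on its own. Since the task is resume/JD matching, you may want to encode the resume and the job description together. The sketch below is illustrative, not part of the original example: it assumes the model was fine-tuned on resume/JD pairs encoded as a sentence pair, and the `sample_jd` text and softmax-based confidence score are our own additions.

```python
import torch.nn.functional as F

# Hypothetical job description for illustration
sample_jd = ["We are hiring a backend engineer with 3+ years of Python and REST API experience."]

# Encode resume and JD as a sentence pair: [CLS] resume [SEP] jd [SEP]
pair_tokens = tokenizer(
    test_resume,
    sample_jd,
    padding="max_length",
    truncation=True,
    return_tensors="pt",
).to(device)

with torch.no_grad():
    logits = model(**pair_tokens).logits

# Softmax turns logits into class probabilities; the max is the confidence score
probs = F.softmax(logits, dim=1)
confidence, predicted = probs.max(dim=1)
print(f"{label_mapping[predicted.item()]} (confidence: {confidence.item():.2%})")
```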
## 📊 Quantized Model Evaluation Results
### 🔥 Evaluation Metrics 🔥
- ✅ **Accuracy:** 0.9224
- ✅ **Precision:** 0.9212
- ✅ **Recall:** 0.8450
- ✅ **F1-score:** 0.7718
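For reference, metrics like these are typically computed with `scikit-learn` (install it separately; it is not in the installation step above). A minimal sketch, where `y_true`/`y_pred` are placeholders for gold labels and model predictions on a held-out split, and weighted averaging across the three classes is our assumption:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Placeholder labels: replace with real gold labels and model predictions
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted"
)
print(f"Accuracy:  {accuracy_score(y_true, y_pred):.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall:    {recall:.4f}")
print(f"F1-score:  {f1:.4f}")
```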
## ⚡ Quantization Details
Post-training quantization was applied by casting the fine-tuned model to Float16 (FP16) using PyTorch's native half-precision support, reducing model size and improving inference efficiency while preserving most of the full-precision accuracy.
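A minimal sketch of what such a conversion can look like: casting a fine-tuned checkpoint to half precision with `model.half()` and saving the smaller checkpoint (paths are illustrative, not the actual ones used for this repository).

```python
from transformers import BertForSequenceClassification

# Load the fine-tuned full-precision model (path is illustrative)
model_fp32 = BertForSequenceClassification.from_pretrained("path/to/finetuned-bert")

# Cast all floating-point weights to FP16 and save the smaller checkpoint
model_fp16 = model_fp32.half()
model_fp16.save_pretrained("path/to/quantized-bert-fp16")
```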
## 💽 Repository Structure
```
.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Quantized model weights
└── README.md            # Model documentation
```
## ⚠️ Limitations
- The model may struggle with resumes and job descriptions that use non-standard terminology.
- Quantization may lead to slight degradation in accuracy compared to the full-precision model.
- Performance may vary across different industries and job levels.
## 🤝 Contributing
Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.