AventIQ-AI
/

Resume-Parsing-NER-AI-Model

Safetensors

bert

AmanSengar committed on Jun 16, 2025

Commit

1695f42

verified ·

1 Parent(s): 1934c23

Create README.md

README.md +130 -0

README.md ADDED

	@@ -0,0 +1,130 @@

+# 🧠 Resume-Parsing-NER-AI-Model
+A custom Named Entity Recognition (NER) model fine-tuned on annotated resume data using a pre-trained BERT architecture. This model extracts structured information such as names, emails, phone numbers, skills, job titles, education, and companies from raw resume text.
+---
+## ✨ Model Highlights
+- 📌 Base Model: bert-base-cased
+- 📚 Datasets: Custom annotated resume dataset (BIO format)
+- 🏷️ Entity Labels: Name, Email, Phone, Education, Skills, Company, Job Title
+- 🔧 Framework: Hugging Face Transformers + PyTorch
+- 💾 Format: transformers model directory (with tokenizer and config)
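
To make the BIO format mentioned above concrete, here is a minimal sketch of how per-token tags can be grouped back into entities. The tag names (`B-NAME`, `I-JOB`, and so on) follow the label list given in the usage section of this card; the helper `bio_to_spans` and the sample sentence are illustrative assumptions, not code from the model's training pipeline.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (label, text) entity spans."""
    spans = []
    current_label, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity, closing any open one.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag continues the open entity of the same type.
            current_tokens.append(token)
        else:
            # "O" (or a stray I- tag) closes any open entity.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Hypothetical, hand-tagged example sentence.
tokens = ["John", "Smith", "worked", "at", "Infosys", "as", "a", "Data", "Analyst"]
tags = ["B-NAME", "I-NAME", "O", "O", "B-COMPANY", "O", "O", "B-JOB", "I-JOB"]
print(bio_to_spans(tokens, tags))
```

In this sketch a stray `I-` tag with no preceding `B-` tag is dropped; other decoders treat it as the start of a new entity.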
+---
+## 🧠 Intended Uses
+- ✅ Resume parsing and candidate data extraction
+- ✅ Applicant Tracking Systems (ATS)
+- ✅ Automated HR screening tools
+- ✅ Resume data analytics and visualization
+- ✅ Chatbots and document understanding applications
+---
+## 🚫 Limitations
+- ❌ Performance may degrade on resumes with non-standard formatting
+- ❌ Works on text only: handwritten or image-based resumes must first be converted to text (e.g., with OCR), and extraction errors may cause entities to be missed
+- ❌ May not generalize to other document types without re-training
+---
+## 🏋️‍♂️ Training Details
+| Attribute          | Value                            |
+|--------------------|----------------------------------|
+| Base Model         | bert-base-cased                  |
+| Dataset            | Custom annotated resume dataset (BIO format) |
+| Task Type          | Token Classification (NER)       |
+| Epochs             | 3                                |
+| Batch Size         | 16                               |
+| Optimizer          | AdamW                            |
+| Loss Function      | CrossEntropyLoss                 |
+| Framework          | PyTorch + Transformers           |
+| Hardware           | CUDA-enabled GPU                 |
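
For token classification, the cross-entropy loss listed in the table is averaged over individual tokens. The following pure-Python sketch shows that arithmetic. It assumes the common convention that positions labelled `-100` (the default `ignore_index` of PyTorch's `CrossEntropyLoss`) are skipped; the card does not state which positions, if any, were masked during training, and `token_cross_entropy` is a hypothetical helper.

```python
import math

IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def token_cross_entropy(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean cross-entropy over tokens whose label is not ignore_index."""
    total, count = 0.0, 0
    for token_logits, label in zip(logits, labels):
        if label == ignore_index:
            continue  # masked positions do not contribute to the loss
        # Numerically stable log-sum-exp: subtract the max logit first.
        peak = max(token_logits)
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in token_logits))
        total += log_norm - token_logits[label]
        count += 1
    # Returning 0.0 when every token is masked is a convenience of this sketch.
    return total / count if count else 0.0


# Three tokens, three classes; the last token is masked out.
logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [5.0, 1.0, 1.0]]
labels = [0, 2, IGNORE_INDEX]
print(round(token_cross_entropy(logits, labels), 4))
```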
+---
+## 📊 Evaluation Metrics
+| Metric                                          | Score |
+| ----------------------------------------------- | ----- |
+| Accuracy                                        | 0.98  |
+| F1-Score                                        | 0.98  |
+| Precision                                       | 0.97  |
+| Recall                                          | 0.98  |
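
The card does not say whether these scores were computed per token or per entity. As one possible reading, the sketch below computes token-level accuracy over all tags and micro-averaged precision, recall, and F1 over the non-`O` tags. The function `token_metrics` and the toy tag sequences are assumptions for illustration only.

```python
def token_metrics(gold, pred, outside="O"):
    """Token-level accuracy plus precision/recall/F1 over non-outside tags."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    correct = sum(g == p for g, p in zip(gold, pred))
    # A true positive is a non-"O" prediction that matches the gold tag exactly.
    tp = sum(g == p and p != outside for g, p in zip(gold, pred))
    predicted = sum(p != outside for p in pred)
    actual = sum(g != outside for g in gold)
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": correct / len(gold),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy example: one missed I-NAME tag and one spurious B-JOB tag.
gold = ["B-NAME", "I-NAME", "O", "B-COMPANY", "O"]
pred = ["B-NAME", "O", "O", "B-COMPANY", "B-JOB"]
print(token_metrics(gold, pred))
```

Entity-level scoring, where a whole span must match, is stricter and typically yields lower numbers than token-level scoring on the same predictions.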
+---
+## 🚀 Usage
+```python
+from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline
+
+# Load the fine-tuned model and tokenizer from the Hub
+# (a local directory such as "./resume-ner-model" also works)
+model_name = "AventIQ-AI/Resume-Parsing-NER-AI-Model"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForTokenClassification.from_pretrained(model_name)
+
+# aggregation_strategy="simple" merges word pieces into whole entities
+ner_pipe = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
+
+text = "John worked at Infosys as an Analyst. Email: aman@email.com"
+ner_results = ner_pipe(text)
+for entity in ner_results:
+    print(f"{entity['word']} → {entity['entity_group']} ({entity['score']:.2f})")
+
+# BIO label set; the list index is the label id
+label_list = [
+    "O",           # 0
+    "B-NAME",      # 1
+    "I-NAME",      # 2
+    "B-EMAIL",     # 3
+    "I-EMAIL",     # 4
+    "B-PHONE",     # 5
+    "I-PHONE",     # 6
+    "B-EDUCATION", # 7
+    "I-EDUCATION", # 8
+    "B-SKILL",     # 9
+    "I-SKILL",     # 10
+    "B-COMPANY",   # 11
+    "I-COMPANY",   # 12
+    "B-JOB",       # 13
+    "I-JOB"        # 14
+]
+```
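
For downstream use in an ATS or analytics tool, the flat list returned by the pipeline usually needs to be turned into a structured record. The sketch below groups entities by type and drops low-confidence ones. The keys `entity_group`, `score`, `word`, `start`, and `end` are what a `transformers` NER pipeline returns with `aggregation_strategy="simple"`; the helper `group_entities`, the `min_score` threshold, and the sample data are hypothetical, and the group names assume the model's config uses the label set listed above.

```python
from collections import defaultdict


def group_entities(ner_results, min_score=0.5):
    """Collect pipeline entities into {entity_group: [text, ...]}."""
    record = defaultdict(list)
    for entity in ner_results:
        # Skip predictions below the (assumed) confidence threshold.
        if entity["score"] >= min_score:
            record[entity["entity_group"]].append(entity["word"])
    return dict(record)


# Hand-written stand-in for pipeline output; real scores and spans will differ.
sample_results = [
    {"entity_group": "NAME", "score": 0.99, "word": "John", "start": 0, "end": 4},
    {"entity_group": "COMPANY", "score": 0.97, "word": "Infosys", "start": 15, "end": 22},
    {"entity_group": "JOB", "score": 0.41, "word": "Analyst", "start": 29, "end": 36},
]
print(group_entities(sample_results))
```

A list per entity type is used because a resume normally contains several skills, companies, and job titles.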
+---
+## 🧩 Quantization
+Post-training static quantization was applied using PyTorch to reduce model size and speed up inference on edge devices.
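
The card does not include the quantization code. As background, the sketch below shows the affine int8 mapping that quantization schemes of this kind are built on: a float range is mapped onto integers through a scale and a zero point. It is plain Python arithmetic, not the PyTorch procedure used for this model; the function names and the signed int8 range are assumptions.

```python
def quantize_params(x_min, x_max, qmin=-128, qmax=127):
    """Scale and zero point that map [x_min, x_max] onto [qmin, qmax]."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate all-zero range
    zero_point = round(qmin - x_min / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to an integer, clamping to the representable range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Recover an approximate float from its quantized integer."""
    return (q - zero_point) * scale


# Hypothetical observed activation range of [-1.0, 3.0].
scale, zero_point = quantize_params(-1.0, 3.0)
for x in (0.0, 0.5, 2.0):
    q = quantize(x, scale, zero_point)
    print(x, q, dequantize(q, scale, zero_point))
```

In static quantization the ranges come from a calibration pass over representative inputs, so accuracy should be re-checked on resume data after quantizing.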
+---
+## 🗂 Repository Structure
+```
+.
+resume-ner-model/
+├── config.json               ✅ Model configuration
+├── pytorch_model.bin         ✅ Fine-tuned model weights
+├── tokenizer_config.json     ✅ Tokenizer configuration
+├── vocab.txt                 ✅ BERT vocabulary
+├── training_args.bin         ✅ Training parameters
+├── preprocessor_config.json  ✅ Optional tokenizer pre-processing info
+├── README.md                 ✅ Model card
+```
+---
+## 🤝 Contributing
+Open to improvements and feedback! Feel free to submit a pull request or open an issue if you find any bugs or want to enhance the model.