---
language: en
license: mit
pipeline_tag: text-classification
tags:
  - resume
  - ats
  - pii
  - nlp
  - huggingface
---

# 🤖 Resume PII Masking & ATS Optimizer

An NLP pipeline that automatically **detects and masks Personally Identifiable Information (PII)** in resumes and **scores resume quality against Applicant Tracking System (ATS) criteria**. It is built on the Hugging Face Transformers ecosystem with models fine-tuned on custom data, and it simulates how Natural Language Processing is applied in HR tech and recruitment automation.

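To make the masking step concrete, here is a minimal sketch of what "detect and mask" produces on a line of resume text. It is not the project's method: the project uses a fine-tuned NER model, while this sketch swaps in two hypothetical regular expressions (`EMAIL_RE`, `PHONE_RE`) and an illustrative `mask_pii` helper so that it runs without any model download. Regexes can only catch pattern-shaped PII such as emails and phone numbers; names and addresses are where the NER model is needed.

```python
import re

# Hypothetical patterns for illustration only; the project itself relies on
# a fine-tuned NER model rather than regular expressions.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"(?<!\w)\+?\d[\d\s().-]{7,}\d(?!\w)")


def mask_pii(text: str) -> str:
    """Replace emails and phone-like numbers with placeholder tags."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


sample = "Contact Jane Doe at jane.doe@example.com or +1 (555) 010-9999."
masked = mask_pii(sample)
print(masked)  # the name is left untouched: regexes cannot detect it
```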
---

## Key Features

| Feature                       |                               Description                                  |
|-------------------------------|----------------------------------------------------------------------------|
| PII Masking                   | Detects and masks names, emails, phone numbers, and addresses using NER.   |
| Resume Parsing                | Handles long resumes (2,000+ words) through tokenization and segmentation. |
| ATS Resume Optimization       | Scores resumes based on keyword density, formatting, and clarity.          |
| Job Description Matching      | Optional feature to match resumes with specific job descriptions.          |
| Hugging Face Integration      | Fine-tune and deploy models directly on Hugging Face Hub.                  |
| Modular Architecture          | Well-organized, scalable, and production-ready codebase.                   |

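The keyword-based part of the scoring in the table above can be sketched as follows. This is an assumption about how such a score might be computed, not the logic in `optimizer.py`: the `tokenize` and `keyword_score` helpers, the returned field names, and the sample keywords are all hypothetical. Coverage is the share of target keywords that appear at least once; density is the share of resume tokens that are target keywords. Passing keywords taken from a job description gives a simple form of job description matching.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def keyword_score(resume: str, keywords: list[str]) -> dict:
    """Hypothetical scoring helper: keyword coverage and density."""
    tokens = tokenize(resume)
    counts = {kw: tokens.count(kw.lower()) for kw in keywords}
    missing = [kw for kw in keywords if counts[kw] == 0]
    # Coverage: fraction of target keywords found at least once.
    coverage = (len(keywords) - len(missing)) / len(keywords) if keywords else 0.0
    # Density: fraction of all resume tokens that are target keywords.
    density = sum(counts.values()) / len(tokens) if tokens else 0.0
    return {"coverage": coverage, "density": density, "missing": missing}


resume_text = "Python developer with NLP experience. Built Python APIs."
result = keyword_score(resume_text, ["python", "nlp", "docker"])
print(result["missing"])  # keywords from the target list not found in the resume
```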
---

## πŸ“ Folder Structure

```bash
resume_ats_project/
├── data/                  # Contains resume samples and PII-labeled training data
│   ├── resumes.json
│   └── pii_train.json
├── models/                # Directory to save fine-tuned models
│   └── ats_model/
├── resume_parser.py       # Tokenization, segmentation, and formatting
├── pii_trainer.py         # Script to fine-tune NER model
├── optimizer.py           # ATS scoring logic
├── infer.py               # Combines parsing, masking, and optimization
├── app.py                 # (Optional) Flask or Gradio interface
├── requirements.txt
└── README.md
```

---

## Installation

```bash
git clone https://github.com/your-username/resume-ats-optimizer.git
cd resume-ats-optimizer
pip install -r requirements.txt
```

---

## Real-World Applications

This project mimics systems used by:

- LinkedIn Talent Solutions (resume scoring and redaction)
- Amazon HR Automation (internal resume screening tools)
- Google Cloud AutoML NER for internal document pipelines
- Infosys and TCS resume filtering portals

You can adapt it to:

- Job matching portals
- Candidate anonymization systems
- Large-scale recruitment automation tools

---

## License

Licensed under the MIT License.

---

## Author

Karthikeyan M C

karthikeyanmc1925@example.com