Upload 3 files
- README.md +93 -13
- app.py +110 -0
- requirements.txt +7 -0
README.md
CHANGED
@@ -1,13 +1,93 @@
-
-
-
-
-
-
-
-
-
-
--
-
-
+# ATS Resume Scorer
+
+An AI-powered tool to analyze resume-job description compatibility using ATS (Applicant Tracking System) scoring and skill matching.
+
+## 🚀 Features
+
+- **PDF Resume Upload**: Extract text from PDF resumes using pdfplumber
+- **Job Description Input**: Paste job descriptions for comparison
+- **ATS Match Scoring**: Calculate similarity score using TF-IDF vectorization and cosine similarity
+- **Skill Matching**: Identify matched and missing skills from a predefined list
+- **Interactive UI**: Clean Streamlit interface with progress bars and color-coded tags
+- **Error Handling**: Graceful handling of invalid PDFs, empty inputs, and extraction failures
+
+## 🛠️ Tech Stack
+
+- **Frontend/UI**: Streamlit
+- **PDF Processing**: pdfplumber
+- **NLP Processing**: NLTK (tokenization, stopwords, lemmatization)
+- **Machine Learning**: scikit-learn (TF-IDF, Cosine Similarity)
+- **Python**: Core language
+
+## 📁 Project Structure
+
+```
+ATS-Resume-Scorer/
+├── app.py                      # Main Streamlit application
+├── utils/
+│   ├── text_extraction.py      # PDF text extraction utilities
+│   ├── preprocessing.py        # Text preprocessing (lowercase, punctuation, stopwords, lemmatization)
+│   ├── scoring.py              # ATS score calculation using TF-IDF and cosine similarity
+│   └── skill_matcher.py        # Skill matching against predefined list
+├── data/
+│   └── skills.txt              # Predefined list of skills for matching
+├── requirements.txt            # Python dependencies
+└── README.md                   # Project documentation
+```
+
+## 🏃‍♂️ How to Run Locally
+
+1. **Clone or download the project**:
+   ```bash
+   cd /path/to/your/workspace
+   # Place the ATS-Resume-Scorer folder here
+   ```
+
+2. **Install dependencies**:
+   ```bash
+   cd ATS-Resume-Scorer
+   pip install -r requirements.txt
+   ```
+
+3. **Run the application**:
+   ```bash
+   streamlit run app.py
+   ```
+
+4. **Open your browser** and go to `http://localhost:8501`
+
+## ☁️ Deploy on Streamlit Cloud
+
+1. **Fork or upload to GitHub**: Ensure all files are in a GitHub repository.
+
+2. **Go to Streamlit Cloud**: Visit [share.streamlit.io](https://share.streamlit.io)
+
+3. **Connect your GitHub repo**: Select the repository containing this project.
+
+4. **Deploy**: Choose `app.py` as the main file and click deploy.
+
+5. **Access your app**: Once deployed, you'll get a public URL to access the application.
+
+## 📝 Usage
+
+1. Upload a PDF resume using the file uploader.
+2. Paste the job description text in the text area.
+3. Click "Analyze" to get:
+   - ATS match score (0-100%)
+   - Visual progress bar
+   - Matched skills (green tags)
+   - Missing skills (red tags)
+
+## 🔧 Customization
+
+- **Skills List**: Edit `data/skills.txt` to add or modify the predefined skills for matching.
+- **Preprocessing**: Modify `utils/preprocessing.py` to adjust text cleaning steps.
+- **Scoring Algorithm**: Enhance `utils/scoring.py` for more advanced similarity measures.
+
+## 🤝 Contributing
+
+Feel free to fork this project and submit pull requests for improvements!
+
+## 📄 License
+
+This project is open-source and available under the MIT License.
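The README describes the ATS match score as TF-IDF vectorization followed by cosine similarity, but `utils/scoring.py` is not part of this commit. The sketch below illustrates that idea with the standard library only, so it runs without scikit-learn; it is not the project's implementation. The names `tokenize`, `tfidf_vectors`, `cosine`, and `ats_match_score` are hypothetical, the tokenizer is a crude stand-in for the NLTK preprocessing the README mentions, and the smoothed IDF formula is chosen to resemble scikit-learn's default.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep runs of letters, digits, '+' and '#'.

    Assumption: a simple stand-in for the NLTK-based preprocessing.
    """
    return re.findall(r"[a-z0-9+#]+", text.lower())


def tfidf_vectors(docs: list[list[str]]) -> list[dict[str, float]]:
    """Build one sparse TF-IDF vector (term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1, so shared terms keep weight 1.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: count * idf[t] for t, count in Counter(doc).items()} for doc in docs]


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def ats_match_score(resume_text: str, job_text: str) -> float:
    """Return the resume/job-description similarity as a percentage (0-100)."""
    resume_vec, job_vec = tfidf_vectors([tokenize(resume_text), tokenize(job_text)])
    return round(100 * cosine(resume_vec, job_vec), 2)
```

Calling `ats_match_score(resume_text, job_description)` gives a value that could feed the progress bar listed under Usage. With only two documents the IDF term has little discriminating power, so the result behaves much like a plain term-overlap score.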
app.py
ADDED
@@ -0,0 +1,110 @@
+import streamlit as st
+import tempfile, os
+import pandas as pd
+
+os.chdir(os.path.dirname(os.path.abspath(__file__)))  # run from the app directory so relative paths such as data/skills.txt resolve
+
+from utils.text_extraction import extract_text_from_pdf
+from utils.preprocessing import preprocess_text
+from utils.experience_extractor import extract_experience_years
+from utils.skill_matcher import load_skills, skill_percentage
+from utils.bert_matcher import bert_similarity
+from utils.scoring import final_ats_score
+
+# ---------------- PAGE CONFIG ----------------
+st.set_page_config(
+    page_title="Smart ATS Shortlisting System",
+    layout="wide"
+)
+
+st.title("🤖 Smart BERT-Based ATS Resume Shortlister")
+
+# ---------------- FILE UPLOADERS ----------------
+jd_file = st.file_uploader(
+    "Upload Job Description (PDF)",
+    type=["pdf"],
+    accept_multiple_files=False
+)
+
+resumes = st.file_uploader(
+    "Upload Resumes (Multiple PDFs)",
+    type=["pdf"],
+    accept_multiple_files=True
+)
+
+SHORTLIST_THRESHOLD = 60
+
+# ---------------- MAIN LOGIC ----------------
+if st.button("Analyze Resumes"):
+
+    if not jd_file or not resumes:
+        st.error("Please upload Job Description and at least one Resume")
+        st.stop()
+
+    # -------- JD PROCESSING --------
+    with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as f:
+        f.write(jd_file.read())
+        jd_path = f.name
+
+    jd_text = extract_text_from_pdf(jd_path)
+    jd_clean = preprocess_text(jd_text)
+    os.unlink(jd_path)
+
+    skills = load_skills("data/skills.txt")
+    results = []
+
+    # -------- PROCESS EACH RESUME --------
+    for resume_file in resumes:
+        with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as rf:
+            rf.write(resume_file.read())
+            resume_path = rf.name
+
+        resume_text = extract_text_from_pdf(resume_path)
+        os.unlink(resume_path)
+
+        if not resume_text.strip():
+            continue
+
+        resume_clean = preprocess_text(resume_text)
+
+        # -------- SCORES --------
+        bert_score = bert_similarity(jd_clean, resume_clean)
+        skill_pct, matched_skills = skill_percentage(resume_text, skills)
+        experience = extract_experience_years(resume_text)
+
+        final_score = final_ats_score(
+            bert_score,
+            skill_pct,
+            experience
+        )
+
+        status = "✅ Selected" if final_score >= SHORTLIST_THRESHOLD else "❌ Rejected"
+
+        results.append({
+            "Resume Name": resume_file.name,
+            "ATS %": final_score,
+            "BERT Match %": bert_score,
+            "Skill Match %": skill_pct,
+            "Experience (Years)": experience,
+            "Status": status,
+            "Matched Skills": ", ".join(matched_skills) if matched_skills else "None"
+        })
+
+    # -------- FINAL OUTPUT --------
+    if not results:
+        st.warning("No resumes could be processed")
+        st.stop()
+
+    df = pd.DataFrame(results).sort_values("ATS %", ascending=False)
+
+    st.subheader("📊 Resume Shortlisting Result")
+    st.dataframe(df, use_container_width=True)
+
+    st.success(f"✅ Shortlist Threshold: {SHORTLIST_THRESHOLD}%")
+
+    st.download_button(
+        "⬇️ Download Result CSV",
+        df.to_csv(index=False),
+        "ATS_Shortlisting_Result.csv",
+        "text/csv"
+    )
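`app.py` calls `skill_percentage`, `extract_experience_years`, and `final_ats_score`, but the `utils/` modules that define them are not among the three uploaded files. The sketch below shows one way helpers with those call shapes might look. Only the function names and the way `app.py` calls them come from the diff; the bodies, the regular expressions, the `weights` and `experience_cap` parameters, and the assumption that all scores are on a 0-100 scale are illustrative guesses.

```python
import re


def skill_percentage(resume_text: str, skills: list[str]) -> tuple[float, list[str]]:
    """Return (percentage of listed skills found, matched skills).

    Assumption: case-insensitive matching on token boundaries, so "java"
    does not match inside "javascript".
    """
    text = resume_text.lower()
    matched = [
        skill
        for skill in skills
        if re.search(r"(?<![a-z0-9])" + re.escape(skill.lower()) + r"(?![a-z0-9])", text)
    ]
    pct = round(100 * len(matched) / len(skills), 2) if skills else 0.0
    return pct, matched


def extract_experience_years(resume_text: str) -> float:
    """Return the largest "N years" / "N+ yrs" figure mentioned, or 0.0 if none."""
    hits = re.findall(r"(\d+(?:\.\d+)?)\s*\+?\s*(?:years?|yrs?)", resume_text.lower())
    return max((float(h) for h in hits), default=0.0)


def final_ats_score(
    bert_score: float,
    skill_pct: float,
    experience: float,
    weights: tuple[float, float, float] = (0.5, 0.3, 0.2),  # hypothetical weights
    experience_cap: float = 10.0,  # hypothetical: 10+ years counts as 100%
) -> float:
    """Blend the three signals into one 0-100 score using a weighted sum."""
    experience_pct = 100 * min(experience, experience_cap) / experience_cap
    w_bert, w_skill, w_exp = weights
    return round(w_bert * bert_score + w_skill * skill_pct + w_exp * experience_pct, 2)
```

Under these assumed weights, a resume with a BERT match of 80, a skill match of 60, and 5 years of experience scores 68.0, which clears the `SHORTLIST_THRESHOLD` of 60 used in `app.py`.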
requirements.txt
ADDED
@@ -0,0 +1,7 @@
+streamlit
+pdfplumber
+nltk
+scikit-learn
+sentence-transformers
+pandas
+torch
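Two of the packages listed above are imported under a different name than they are installed with (`scikit-learn` as `sklearn`, `sentence-transformers` as `sentence_transformers`). A small check like the following can confirm the environment is complete before running `streamlit run app.py`. The helper `missing_packages` and the `IMPORT_NAMES` mapping are hypothetical conveniences, not part of this commit.

```python
from importlib.util import find_spec

# pip name -> top-level module name used in import statements
IMPORT_NAMES = {
    "streamlit": "streamlit",
    "pdfplumber": "pdfplumber",
    "nltk": "nltk",
    "scikit-learn": "sklearn",
    "sentence-transformers": "sentence_transformers",
    "pandas": "pandas",
    "torch": "torch",
}


def missing_packages(import_names: dict[str, str] = IMPORT_NAMES) -> list[str]:
    """Return the pip names whose top-level module cannot be found."""
    return [pip_name for pip_name, module in import_names.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("All dependencies found" if not missing else f"Missing: {', '.join(missing)}")
```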