mkshari committed
Commit 6e2abae · verified · 1 Parent(s): 35f2d97

Upload 4 files

Files changed (4)
  1. README.md +24 -6
  2. app.py +217 -0
  3. requirements.txt +8 -0
  4. sample_jd.txt +14 -0
README.md CHANGED
@@ -1,12 +1,30 @@
  ---
- title: Srcdaksh
- emoji: 🔥
- colorFrom: purple
- colorTo: purple
+ title: SETHU AI - Resume Gap Analyzer
+ emoji: 🎓
+ colorFrom: indigo
+ colorTo: blue
  sdk: gradio
- sdk_version: 6.9.0
+ sdk_version: 4.21.0
  app_file: app.py
  pinned: false
+ license: mit
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # SETHU AI - Resume Gap Analyzer
+ **From Resume to Career Readiness**
+ *In collaboration with SASTRA DEEMED UNIVERSITY*
+
+ An intelligent tool to analyze the gap between your resume and a specific job description.
+
+ ## Features
+ - **PDF/DOCX Upload**: Extract text from common resume formats.
+ - **Skill Extraction**: Automatically identify skills using spaCy NLP.
+ - **Similarity Scoring**: Semantic comparison using Sentence Transformers (`all-MiniLM-L6-v2`).
+ - **Gap Analysis**: Detailed list of missing skills and present skills.
+ - **Guidance & Roadmap**: Get learning paths for missing skills.
+
+ ## Local Installation
+ ```bash
+ pip install -r requirements.txt
+ python app.py
+ ```
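Note on the Similarity Scoring feature listed above: the score reduces to a cosine similarity between two embedding vectors, scaled to a percentage. The following is a minimal sketch of that arithmetic in plain Python, assuming toy 3-dimensional vectors in place of real `all-MiniLM-L6-v2` embeddings (which have 384 dimensions). The names `cosine_similarity` and `match_percentage` are hypothetical helpers for illustration, not functions from this commit.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)

def match_percentage(a, b):
    """Scale cosine similarity to a percentage rounded to two decimals."""
    return round(cosine_similarity(a, b) * 100, 2)

# Toy vectors standing in for resume and job-description embeddings (assumed values)
resume_vec = [1.0, 2.0, 0.0]
jd_vec = [2.0, 4.0, 0.0]      # same direction as resume_vec
other_vec = [0.0, 0.0, 3.0]   # orthogonal to resume_vec

print(match_percentage(resume_vec, jd_vec))
print(match_percentage(resume_vec, other_vec))
```

Cosine similarity ranges over [-1, 1], so a percentage derived this way is not guaranteed to be non-negative; a UI that expects 0 to 100 may want to clamp it.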
app.py ADDED
@@ -0,0 +1,217 @@
+ import gradio as gr
+ import spacy
+ import pdfplumber
+ from docx import Document
+ from sentence_transformers import SentenceTransformer, util
+ import pandas as pd
+ import re
+
+ # Load models
+ try:
+     nlp = spacy.load("en_core_web_sm")
+ except OSError:
+     # Fallback if model installation via requirements.txt fails in local env
+     import os
+     os.system("python -m spacy download en_core_web_sm")
+     nlp = spacy.load("en_core_web_sm")
+
+ model = SentenceTransformer('all-MiniLM-L6-v2')
+
+ # Common Skill Dictionary (Simplified for the demo)
+ SKILLS_DB = [
+     "python", "javascript", "react", "fastapi", "aws", "docker", "kubernetes", "sql",
+     "git", "machine learning", "nlp", "tensorflow", "pytorch", "java", "c++", "golang",
+     "postgresql", "mongodb", "redis", "cloud computing", "devops", "rest api", "graphql",
+     "scikit-learn", "pandas", "numpy", "django", "flask", "typescript", "angular", "vue"
+ ]
+
+ ROADMAP_DB = {
+     "python": "Master Python: [Real Python](https://realpython.com/) | [Programming with Mosh](https://www.youtube.com/user/programmingwithmosh)",
+     "react": "Build UI with React: [Official Docs](https://react.dev/) | [FreeCodeCamp React Course](https://www.freecodecamp.org/news/free-react-course-2024/)",
+     "aws": "Cloud Mastery: [AWS Skill Builder](https://explore.skillbuilder.aws/) | [Cloud Guru](https://www.pluralsight.com/cloud-computing/aws)",
+     "docker": "Containerization: [Docker Get Started](https://docs.docker.com/get-started/) | [Docker Tutorial for Beginners](https://www.youtube.com/watch?v=pg19Z8LL06w)",
+     "kubernetes": "Orchestration: [K8s Basics](https://kubernetes.io/docs/tutorials/kubernetes-basics/) | [Nana's K8s Course](https://www.youtube.com/c/TechWorldwithNana)",
+     "fastapi": "Modern APIs: [FastAPI Docs](https://fastapi.tiangolo.com/) | [TestDriven.io FastAPI](https://testdriven.io/blog/fastapi-crud/)",
+     "nlp": "Language Processing: [Hugging Face NLP Course](https://huggingface.co/learn/nlp-course/) | [Stanford CS224N](https://web.stanford.edu/class/cs224n/)",
+     "machine learning": "AI Fundamentals: [ML Specialization by Andrew Ng](https://www.coursera.org/specializations/machine-learning-introduction)",
+     "sql": "Database Management: [SQLZoo](https://sqlzoo.net/) | [Mode SQL Tutorial](https://mode.com/sql-tutorial/)",
+     "git": "Version Control: [Git Immersion](https://gitimmersion.com/) | [GitHub Learning Path](https://skills.github.com/)",
+     "javascript": "JS Deep Dive: [MDN Web Docs](https://developer.mozilla.org/en-US/docs/Web/JavaScript) | [JavaScript.info](https://javascript.info/)",
+     "typescript": "Strict Typing: [TypeScript Handbook](https://www.typescriptlang.org/docs/handbook/intro.html)",
+     "postgresql": "Advanced Data: [Postgres Tutorial](https://www.postgresqltutorial.com/)",
+     "rest api": "API Design: [RESTful API Guide](https://restfulapi.net/)"
+ }
+
+ def extract_text_from_pdf(pdf_file):
+     with pdfplumber.open(pdf_file) as pdf:
+         text = ""
+         for page in pdf.pages:
+             text += page.extract_text() or ""
+     return text
+
+ def extract_text_from_docx(docx_file):
+     doc = Document(docx_file)
+     text = ""
+     for para in doc.paragraphs:
+         text += para.text + "\n"
+     return text
+
+ def clean_text(text):
+     text = re.sub(r'[^a-zA-Z0-9\s+\-]', ' ', text)  # keep "+" and "-" so "c++" and "scikit-learn" survive
+     return text.lower().strip()
+
+ def get_skills(text):
+     text = clean_text(text)
+     found_skills = set()
+     for skill in SKILLS_DB:
+         if re.search(r'(?<![a-z0-9])' + re.escape(skill) + r'(?![a-z0-9])', text):  # \b cannot match after a trailing "+"
+             found_skills.add(skill)
+     return found_skills
+
+ def analyze_resume(resume_file, jd_text):
+     if resume_file is None or not jd_text.strip():
+         return "Please upload a resume and provide a job description.", "", "", 0, None
+
+     # Step 1: Extract text
+     if resume_file.name.endswith('.pdf'):
+         resume_text = extract_text_from_pdf(resume_file)
+     elif resume_file.name.endswith('.docx'):
+         resume_text = extract_text_from_docx(resume_file)
+     else:
+         return "Unsupported file format. Please upload PDF or DOCX.", "", "", 0, None
+
+     # Step 2: NLP Analysis (Skills)
+     resume_skills = get_skills(resume_text)
+     jd_skills = get_skills(jd_text)
+
+     present_skills = list(resume_skills.intersection(jd_skills))
+     missing_skills = list(jd_skills - resume_skills)
+
+     # Step 3: Similarity Score (Sentence Transformers)
+     embeddings1 = model.encode(resume_text, convert_to_tensor=True)
+     embeddings2 = model.encode(jd_text, convert_to_tensor=True)
+     cosine_score = util.pytorch_cos_sim(embeddings1, embeddings2)
+     match_percentage = round(cosine_score.item() * 100, 2)
+
+     # Format output
+     present_str = ", ".join([s.capitalize() for s in present_skills]) if present_skills else "None found."
+     missing_str = ", ".join([s.capitalize() for s in missing_skills]) if missing_skills else "None! You are a great match."
+
+     return f"{match_percentage}%", present_str, missing_str, match_percentage, missing_skills
+
+ def get_roadmap(missing_skills):
+     if not missing_skills:
+         return "🎉 Great job! You have all the key skills mentioned. Keep highlighting them explicitly in your experience section."
+
+     roadmap_items = []
+     for skill in missing_skills:
+         resource = ROADMAP_DB.get(skill.lower(), f"Search for {skill} tutorials on YouTube or Coursera.")
+         roadmap_items.append(f"### {skill.capitalize()}\n{resource}")
+
+     return "\n\n".join(roadmap_items)
+
+ # Custom CSS for Premium Look
+ custom_css = """
+ #logo-img {
+     margin: auto;
+     display: block;
+ }
+ .gradio-container {
+     background-color: #f8f9fa;
+ }
+ .main-header {
+     text-align: center;
+     color: #003366; /* Navy Blue from Logo */
+     margin-bottom: 20px;
+ }
+ .sub-header {
+     text-align: center;
+     color: #b8860b; /* Gold from Logo */
+     font-weight: bold;
+ }
+ .sastra-text {
+     text-align: center;
+     font-size: 0.9em;
+     color: #555;
+     letter-spacing: 1px;
+ }
+ #analyze-btn {
+     background: linear-gradient(90deg, #003366 0%, #004080 100%) !important;
+     color: white !important;
+     border: none;
+     border-radius: 8px;
+     padding: 10px 20px;
+     font-weight: bold;
+ }
+ #roadmap-btn {
+     background: linear-gradient(90deg, #b8860b 0%, #daa520 100%) !important;
+     color: white !important;
+     border: none;
+ }
+ """
+
+ # Gradio Interface
+ with gr.Blocks(theme=gr.themes.Soft(primary_hue="indigo"), css=custom_css) as demo:
+     with gr.Row(variant="compact"):
+         with gr.Column(scale=1):
+             gr.Image("logo.png", show_label=False, height=120, container=False, elem_id="logo-img")
+         with gr.Column(scale=4):
+             gr.Markdown("# SETHU AI", elem_classes=["main-header"])
+             gr.Markdown("### From Resume to Career Readiness", elem_classes=["sub-header"])
+             gr.Markdown("SASTRA DEEMED UNIVERSITY", elem_classes=["sastra-text"])
+
+     gr.Markdown("---")
+
+     with gr.Row():
+         with gr.Column():
+             gr.Markdown("### 📄 Input Details")
+             resume_input = gr.File(label="Upload Resume (PDF or DOCX)", file_types=[".pdf", ".docx"])
+             jd_input = gr.Textbox(label="Job Description", placeholder="Paste the job requirements here...", lines=8)
+             analyze_btn = gr.Button("Analyze Resume", variant="primary", elem_id="analyze-btn")
+
+         with gr.Column():
+             gr.Markdown("### 📊 Analysis Dashboard")
+             match_score_output = gr.Label(label="Match Percentage")
+
+             with gr.Tabs():
+                 with gr.TabItem("Skills Found"):
+                     present_skills_output = gr.Textbox(label="Available in Resume", interactive=False)
+                 with gr.TabItem("Gap Analysis"):
+                     missing_skills_output = gr.Textbox(label="Skills to Acquire", interactive=False)
+
+     gr.Markdown("---")
+     roadmap_btn = gr.Button("Get Guidance & Roadmap", interactive=True, elem_id="roadmap-btn")
+     roadmap_output = gr.Markdown(visible=False)
+
+     # State for hidden analysis results
+     missing_skills_state = gr.State([])
+
+     def on_analyze(resume, jd):
+         score_str, present, missing, score_val, missing_list = analyze_resume(resume, jd)
+         return {
+             match_score_output: score_str,
+             present_skills_output: present,
+             missing_skills_output: missing,
+             roadmap_btn: gr.update(interactive=True),
+             missing_skills_state: missing_list,
+             roadmap_output: gr.update(visible=False)
+         }
+
+     def on_roadmap(missing_list):
+         roadmap_content = get_roadmap(missing_list)
+         return gr.update(value=roadmap_content, visible=True)
+
+     analyze_btn.click(
+         on_analyze,
+         inputs=[resume_input, jd_input],
+         outputs=[match_score_output, present_skills_output, missing_skills_output, roadmap_btn, missing_skills_state, roadmap_output]
+     )
+
+     roadmap_btn.click(
+         on_roadmap,
+         inputs=[missing_skills_state],
+         outputs=[roadmap_output]
+     )
+
+ if __name__ == "__main__":
+     demo.launch()
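Note on the roadmap step in `app.py`: it is a dictionary lookup with a generic fallback for skills that have no curated entry. The following is a minimal standalone sketch of that pattern, assuming a two-entry table abbreviated from `ROADMAP_DB`. The names `RESOURCES` and `build_roadmap` are hypothetical and used only for illustration, and the empty-input message is a placeholder rather than the app's own text.

```python
# Abbreviated, illustrative table; ROADMAP_DB in app.py has more entries
RESOURCES = {
    "sql": "Database Management: [SQLZoo](https://sqlzoo.net/)",
    "git": "Version Control: [GitHub Learning Path](https://skills.github.com/)",
}

def build_roadmap(missing_skills):
    """Return a Markdown roadmap with one section per missing skill."""
    if not missing_skills:
        return "No missing skills."  # placeholder message for this sketch
    sections = []
    for skill in sorted(missing_skills):  # sorted so the output order is stable
        resource = RESOURCES.get(
            skill.lower(),
            f"Search for {skill} tutorials on YouTube or Coursera.",
        )
        sections.append(f"### {skill.capitalize()}\n{resource}")
    return "\n\n".join(sections)

roadmap = build_roadmap({"sql", "redis"})
print(roadmap)
```

One design point the sketch makes visible: `str.capitalize()` renders "sql" as "Sql" (and would render "aws" as "Aws"), so a separate display-name mapping may read better for acronyms.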
requirements.txt ADDED
@@ -0,0 +1,8 @@
+ gradio
+ spacy
+ sentence-transformers
+ pdfplumber
+ python-docx
+ scikit-learn
+ reportlab
+ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl
sample_jd.txt ADDED
@@ -0,0 +1,14 @@
+ Job Title: Senior Python Developer
+
+ Responsibilities:
+ - Design and implement scalable backend services using Python and FastAPI.
+ - Work with SQL databases like PostgreSQL and NoSQL like Redis.
+ - Deploy applications using Docker and Kubernetes.
+ - Collaborate with frontend teams to integrate React components with REST APIs.
+ - Experience with AWS (EC2, S3, Lambda) is required.
+
+ Required Skills:
+ - Python, Javascript, React, FastAPI
+ - AWS, Docker, Kubernetes, SQL, Git
+ - Machine Learning basics, NLP
+ - Problem-solving and teamwork
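Note on how a dictionary matcher would treat text like this sample: the sketch below runs a small skill list over one sentence from `sample_jd.txt` and over a made-up resume line, then derives the present and missing sets. It is a sketch under assumptions, not the app's code: `SKILLS` is abbreviated from `SKILLS_DB`, `find_skills` is a hypothetical name, the resume line is invented for illustration, and lookarounds are used in place of `\b` because `\b` cannot match between a trailing "+" and a space.

```python
import re

# Abbreviated, illustrative skill list; SKILLS_DB in app.py is longer
SKILLS = ["python", "fastapi", "react", "docker", "kubernetes", "sql", "c++"]

def find_skills(text, skills=SKILLS):
    """Return the set of listed skills that occur in text as whole tokens."""
    lowered = text.lower()
    found = set()
    for skill in skills:
        # Lookarounds instead of \b so that skills ending in "+" still match
        pattern = r"(?<![a-z0-9])" + re.escape(skill) + r"(?![a-z0-9])"
        if re.search(pattern, lowered):
            found.add(skill)
    return found

# First responsibility line from sample_jd.txt
jd_excerpt = "Design and implement scalable backend services using Python and FastAPI."
# Hypothetical resume line, invented for this example
resume_excerpt = "Built C++ and Python services; deployed with Docker."

jd_skills = find_skills(jd_excerpt)
resume_skills = find_skills(resume_excerpt)
present = jd_skills & resume_skills   # skills the resume already covers
missing = jd_skills - resume_skills   # skills still to acquire

print(sorted(present), sorted(missing))
```

Because matching is purely lexical, a plural such as "REST APIs" will not match a dictionary entry like "rest api"; handling that would need stemming or extra dictionary variants.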