You said:
AI resume classifier
ChatGPT said:
Got it 👍 You’re asking about an AI résumé classifier — basically a system that automatically classifies or ranks résumés (CVs) based on given criteria like job role, skills, or relevance.
Here’s a breakdown for you:
🔹 What an AI Résumé Classifier Does
Parses résumé text (PDF, Word, etc.) into structured data.
Extracts features like name, education, experience, skills, certifications.
Classifies or ranks candidates based on job requirements (e.g., Python developer, ML engineer).
Can be rule-based (keywords, regex, scoring) or ML-based (NLP + ML model).
🔹 Approaches
1. Rule-Based Classifier
Simple keyword matching & scoring.
Example: If job requires "Python", add +1 score if found in résumé.
Pros: Easy to implement.
Cons: Ignores context (for example, it counts "Python" even in "no Python experience").
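The keyword-scoring idea above can be sketched in a few lines. This is only an illustration: the `JOB_KEYWORDS` table, the function names, and the sample résumé are assumptions made up for the example, not part of any library.

```python
import re

# Hypothetical job profiles: category -> required keywords (illustrative only).
JOB_KEYWORDS = {
    "Data Scientist": ["python", "tensorflow", "machine learning"],
    "Web Developer": ["html", "css", "javascript", "react"],
}

def keyword_score(resume_text, keywords):
    """Add 1 to the score for each required keyword found as a whole term."""
    text = resume_text.lower()
    score = 0
    for kw in keywords:
        # Guard against partial matches such as "java" inside "javascript".
        pattern = r"(?<![a-z0-9])" + re.escape(kw) + r"(?![a-z0-9])"
        if re.search(pattern, text):
            score += 1
    return score

def classify_by_rules(resume_text):
    """Return the highest-scoring category (first one wins ties) and all scores."""
    scores = {role: keyword_score(resume_text, kws) for role, kws in JOB_KEYWORDS.items()}
    return max(scores, key=scores.get), scores

best, scores = classify_by_rules("Built React and JavaScript front ends; some Python scripting.")
print(best, scores)
```

Because the score is just a count of matched terms, a résumé that mentions a skill only in passing scores the same as one built around it, which is the context problem noted in the cons.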
2. Machine Learning (ML)-Based
Use classification models (Logistic Regression, SVM, Random Forest).
Input: résumé text → feature vectors (TF-IDF, BERT embeddings).
Output: Job category / Relevance score.
3. Deep Learning / NLP
Use transformers (BERT, RoBERTa, DistilBERT) for semantic understanding.
Train a classifier to predict if résumé matches a job description.
Usually more accurate than keyword matching because it captures meaning, but it needs labeled data and more compute.
🔹 Example Workflow (Python)
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# Sample data: resumes with labels (job roles).
# Three rows only illustrate the API; a real model needs many resumes per label.
data = {
    "resume_text": [
        "Experienced in Python, ML, TensorFlow, Data Science projects",
        "Expert in HTML, CSS, JavaScript, React, Node.js",
        "Worked on cloud platforms AWS, Azure, DevOps pipelines"
    ],
    "label": ["Data Scientist", "Web Developer", "DevOps Engineer"]
}
df = pd.DataFrame(data)

# Convert text to TF-IDF vectors
vectorizer = TfidfVectorizer(stop_words='english')
X = vectorizer.fit_transform(df['resume_text'])
y = df['label']

# Train-test split (with 3 rows: 2 for training, 1 for testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Classifier
clf = LogisticRegression()
clf.fit(X_train, y_train)

# Prediction; zero_division=0 avoids warnings for labels with no predictions
y_pred = clf.predict(X_test)
print(classification_report(y_test, y_pred, zero_division=0))
🔹 Enhancements
Use spaCy / NLTK to extract entities (skills, education, experience).
Apply BERT embeddings for contextual understanding.
Integrate with ATS (Applicant Tracking System) to automate filtering.
👉 Do you want me to build you a full working Python project (resume classifier with dataset + preprocessing + model training), or just a basic template code you can expand?
You said:
I want to use user input in this model
ChatGPT said:
Perfect 👍 You want the AI résumé classifier to take user input (like pasting or uploading a résumé text) and then classify it into a job category.
Let’s extend the earlier code so it works with user input:
🔹 Python Example: User Input Classifier
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# ====== Step 1: Sample Training Data ======
data = {
    "resume_text": [
        "Experienced in Python, ML, TensorFlow, Data Science projects",
        "Expert in HTML, CSS, JavaScript, React, Node.js",
        "Worked on cloud platforms AWS, Azure, DevOps pipelines",
        "Skilled in Java, Spring Boot, Microservices, REST APIs",
        "Good knowledge of SQL, MySQL, Database Administration"
    ],
    "label": ["Data Scientist", "Web Developer", "DevOps Engineer", "Backend Developer", "Database Admin"]
}
df = pd.DataFrame(data)

# ====== Step 2: Vectorize ======
vectorizer = TfidfVectorizer(stop_words='english')
X = vectorizer.fit_transform(df['resume_text'])
y = df['label']

# ====== Step 3: Train Classifier ======
clf = LogisticRegression()
clf.fit(X, y)

# ====== Step 4: User Input ======
while True:
    user_resume = input("\nPaste your resume text (or type 'exit' to quit):\n")
    if user_resume.strip().lower() == "exit":
        break
    # Convert input into a vector using the vocabulary learned in Step 2
    user_vec = vectorizer.transform([user_resume])
    # Predict job role
    prediction = clf.predict(user_vec)[0]
    print(f"\n🔍 Predicted Job Category: {prediction}\n")
🔹 Example Run
Paste your resume text (or type 'exit' to quit):
I have worked on CNN models, Python, TensorFlow, and machine learning.
🔍 Predicted Job Category: Data Scientist
🔹 Next Steps You Can Add
File Upload Support – read from .txt or .pdf résumé.
Web App – wrap this in Flask/Django/Streamlit for a GUI.
BERT Embeddings – replace TF-IDF with Hugging Face models, which often improves accuracy when enough labeled résumés are available.
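For the file-upload step, a minimal sketch of a loader that handles both formats could look like this. The function name `read_resume` is an assumption for illustration; the PDF branch relies on pdfplumber, which is imported only when a PDF is actually read.

```python
from pathlib import Path

def read_resume(path):
    """Return the text of a .txt or .pdf résumé; raise ValueError otherwise."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".txt":
        return path.read_text(encoding="utf-8", errors="replace")
    if suffix == ".pdf":
        # Imported lazily so plain-text use does not require pdfplumber.
        import pdfplumber
        with pdfplumber.open(str(path)) as pdf:
            # extract_text() may return None for pages without a text layer.
            return "\n".join(page.extract_text() or "" for page in pdf.pages)
    raise ValueError(f"Unsupported résumé format: {suffix or 'no extension'}")
```

The returned string can be passed to `vectorizer.transform([...])` in place of the pasted text.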
👉 Do you want me to show you how to extend this so users can upload a PDF/Word résumé file instead of just typing text?
You said:
Uploaded image
to build an application for this model
ChatGPT said:
Great 👍 Thanks for sharing the assignment statement.
You need to build an AI Resume Classifier application that:
Accepts resumes in Text or PDF format.
Extracts skills from the resume.
Classifies the resume into categories:
Web Developer
Data Analyst
AI/ML
Core Engineer
Highlights Top 3 Skills for each student.
Uses ML/NLP for automation.
🔹 Tech Stack Suggestion
Backend: Python (Flask / FastAPI / Streamlit for simplicity).
Libraries:
PyPDF2 or pdfplumber → extract text from PDF.
NLTK / spaCy → extract skills.
scikit-learn → classification model (TF-IDF + Logistic Regression).
Frontend:
Streamlit (easiest – instant web UI).
Or Flask + HTML (custom web app).
🔹 Step-by-Step Implementation Plan
1. Data Preprocessing
Collect sample resumes (or text snippets).
Label them into categories (Web Developer, Data Analyst, AI/ML, Core Engineer).
2. Skill Extraction
Use keyword matching from a skills dictionary.
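Or use spaCy (en_core_web_sm) with custom patterns (PhraseMatcher or EntityRuler) to tag terms like "Python", "SQL", "TensorFlow"; the stock NER labels (PERSON, ORG, GPE, and so on) do not include skills.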
3. Classification
Convert text to TF-IDF vectors.
Train a classifier (LogisticRegression or RandomForest).
Predict job category.
4. Top 3 Skills
Count frequency of extracted skills → return top 3.
5. Application
User uploads PDF.
Extract text → clean → run through classifier.
Display:
Predicted Category
Top 3 Skills
🔹 Example Code (Streamlit App)
Here’s a working mini-app:
import streamlit as st
import pdfplumber
import spacy
from collections import Counter
from spacy.matcher import PhraseMatcher
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Sample training data
data = {
    "resume_text": [
        "Skilled in Python, TensorFlow, Machine Learning, AI research",
        "Experienced with HTML, CSS, JavaScript, React, Node.js",
        "Strong in SQL, Excel, Data Analysis, PowerBI",
        "Knowledge of Mechanical Design, CAD, Thermodynamics"
    ],
    "label": ["AI/ML", "Web Developer", "Data Analyst", "Core Engineer"]
}

# Train simple classifier
vectorizer = TfidfVectorizer(stop_words='english')
X = vectorizer.fit_transform(data['resume_text'])
clf = LogisticRegression()
clf.fit(X, data['label'])

# Load spaCy and build a case-insensitive matcher over the skills dictionary
nlp = spacy.load("en_core_web_sm")
skills_dict = ["Python", "Java", "C++", "TensorFlow", "SQL", "Excel", "React", "Node.js", "CAD"]
matcher = PhraseMatcher(nlp.vocab, attr="LOWER")
for skill in skills_dict:
    # Each skill is its own match key, so a match maps back to its canonical name
    matcher.add(skill, [nlp.make_doc(skill)])

def extract_text_from_pdf(uploaded_file):
    text = ""
    with pdfplumber.open(uploaded_file) as pdf:
        for page in pdf.pages:
            # extract_text() returns None for pages without a text layer
            text += (page.extract_text() or "") + " "
    return text

def extract_skills(text):
    doc = nlp(text)
    # Token-level matching keeps "C++" and "Node.js" (dropped by an is_alpha
    # filter) and does not count "Java" inside "JavaScript"
    found_skills = [nlp.vocab.strings[match_id] for match_id, start, end in matcher(doc)]
    # Most frequent first; ties keep the order of first appearance
    return [skill for skill, _ in Counter(found_skills).most_common(3)]

# Streamlit UI
st.title("AI Resume Classifier")
uploaded_file = st.file_uploader("Upload Resume (PDF only)", type=["pdf"])
if uploaded_file is not None:
    resume_text = extract_text_from_pdf(uploaded_file)
    # Predict category
    vector_input = vectorizer.transform([resume_text])
    category = clf.predict(vector_input)[0]
    # Extract skills
    skills = extract_skills(resume_text)
    # Display results
    st.subheader("Classification Result")
    st.write(f"**Predicted Category:** {category}")
    st.write(f"**Top 3 Skills:** {', '.join(skills) if skills else 'No major skills found'}")
🔹 Example Output
If you upload a résumé with:
"I am proficient in Python, TensorFlow, and Machine Learning projects"
👉 The app will show:
Predicted Category: AI/ML
Top 3 Skills: Python, TensorFlow ("Machine Learning" is not in the skills list, so it is not reported)
⚡ Question for you:
Do you want me to build this as a Streamlit web app (easy to run in the browser) or a Flask app (better if you want a full deployment for a college project)?
You said:
Use a Decision Tree algorithm in this model
ChatGPT said:
Perfect 👍 You want the AI Resume Classifier to use a Decision Tree Algorithm instead of Logistic Regression. That’s totally possible!
Here’s how we can adapt the earlier approach:
🔹 Steps
Data Preparation
Collect sample labeled resumes (job categories: Web Developer, Data Analyst, AI/ML, Core Engineer).
Convert resume text → numeric features using TF-IDF.
Decision Tree Classifier
Use DecisionTreeClassifier from sklearn.tree.
Train on the TF-IDF vectors.
Prediction for New Resumes
When a user uploads a resume, extract text → transform → predict category.
Top 3 Skills Extraction
Same as before (skills dictionary or spaCy-based extraction).
🔹 Example Code (Decision Tree)
import pdfplumber
import pandas as pd
import spacy
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.tree import DecisionTreeClassifier
# ====== Step 1: Sample Training Data ======
data = {
"resume_text": [
"Skilled in Python, TensorFlow, Machine Learning, AI research",
"Experienced with HTML, CSS, JavaScript, React, Node.js",
"Strong in SQL, Excel, Data Analysis, PowerBI",
"Knowledge of Mechanical Design, CAD, Thermodynamics"
],
"label": ["AI/ML", "Web Developer", "Data Analyst", "Core Engineer"]
}
df = pd.DataFrame(data)
# ====== Step 2: TF-IDF Vectorization ======
vectorizer = TfidfVectorizer(stop_words='english')
X = v
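The example above is cut off at the vectorization step. As a rough sketch of how the remaining steps described earlier (train a `DecisionTreeClassifier` on the TF-IDF vectors, then predict) might look, here is a self-contained version using the same four toy samples. The `random_state=0` setting and the `new_resume` text are assumptions added for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.tree import DecisionTreeClassifier

# Same toy training data as in the example above.
resume_texts = [
    "Skilled in Python, TensorFlow, Machine Learning, AI research",
    "Experienced with HTML, CSS, JavaScript, React, Node.js",
    "Strong in SQL, Excel, Data Analysis, PowerBI",
    "Knowledge of Mechanical Design, CAD, Thermodynamics",
]
labels = ["AI/ML", "Web Developer", "Data Analyst", "Core Engineer"]

# TF-IDF vectorization of the training résumés.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(resume_texts)

# Train the decision tree; random_state makes feature tie-breaking repeatable.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, labels)

# Classify a new résumé with the vocabulary learned above.
new_resume = "Built dashboards with SQL and Excel for data analysis"
predicted = clf.predict(vectorizer.transform([new_resume]))[0]
print("Predicted category:", predicted)
```

With only four training rows the tree splits on a handful of individual words, so predictions for unseen résumés are fragile; a usable model needs many labeled résumés per category.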
- README.md +8 -4
- index.html +139 -19
@@ -1,10 +1,14 @@
 ---
-title:
-
-colorFrom: indigo
+title: ResumeRanker Pro 🌟
+colorFrom: blue
 colorTo: red
+emoji: 🐳
 sdk: static
 pinned: false
+tags:
+  - deepsite-v3
 ---
 
-
+# Welcome to your new DeepSite project!
+This project was created with [DeepSite](https://deepsite.hf.co).
+
@@ -1,19 +1,139 @@
-<!
-<html>
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>ResumeRanker Pro 🌟</title>
+    <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css">
+    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css">
+    <style>
+        .hero-section {
+            background: linear-gradient(135deg, #6e8efb, #a777e3);
+            color: white;
+            padding: 5rem 0;
+            border-radius: 0 0 20px 20px;
+        }
+        .upload-container {
+            background: white;
+            border-radius: 15px;
+            padding: 2rem;
+            box-shadow: 0 10px 30px rgba(0,0,0,0.1);
+            margin-top: -3rem;
+        }
+        .feature-icon {
+            font-size: 2.5rem;
+            color: #6e8efb;
+            margin-bottom: 1rem;
+        }
+    </style>
+</head>
+<body>
+    <div class="hero-section text-center">
+        <div class="container">
+            <h1 class="display-4 fw-bold mb-3">ResumeRanker Pro 🌟</h1>
+            <p class="lead mb-4">AI-powered resume classification that helps you find the perfect candidate match</p>
+        </div>
+    </div>
+
+    <div class="container">
+        <div class="upload-container">
+            <h2 class="text-center mb-4">Upload Resume</h2>
+            <div class="d-flex justify-content-center">
+                <div class="text-center">
+                    <div class="mb-3">
+                        <input type="file" class="form-control" id="resumeUpload" accept=".pdf,.doc,.docx,.txt">
+                    </div>
+                    <button class="btn btn-primary btn-lg" id="analyzeBtn">
+                        <i class="fas fa-magic me-2"></i> Analyze Resume
+                    </button>
+                </div>
+            </div>
+        </div>
+
+        <div class="row mt-5" id="resultsSection" style="display: none;">
+            <div class="col-md-6">
+                <div class="card h-100">
+                    <div class="card-body">
+                        <h3 class="card-title"><i class="fas fa-tag me-2"></i>Predicted Category</h3>
+                        <div class="mt-3" id="categoryResult"></div>
+                    </div>
+                </div>
+            </div>
+            <div class="col-md-6">
+                <div class="card h-100">
+                    <div class="card-body">
+                        <h3 class="card-title"><i class="fas fa-star me-2"></i>Top Skills</h3>
+                        <div class="mt-3" id="skillsResult"></div>
+                    </div>
+                </div>
+            </div>
+        </div>
+
+        <div class="row mt-5 mb-5">
+            <div class="col-md-4 mb-4">
+                <div class="card h-100 border-0 shadow-sm">
+                    <div class="card-body text-center">
+                        <div class="feature-icon">
+                            <i class="fas fa-brain"></i>
+                        </div>
+                        <h4>AI Classification</h4>
+                        <p>Uses Decision Tree algorithm to accurately categorize resumes</p>
+                    </div>
+                </div>
+            </div>
+            <div class="col-md-4 mb-4">
+                <div class="card h-100 border-0 shadow-sm">
+                    <div class="card-body text-center">
+                        <div class="feature-icon">
+                            <i class="fas fa-file-alt"></i>
+                        </div>
+                        <h4>Multiple Formats</h4>
+                        <p>Supports PDF, DOCX, and plain text resume formats</p>
+                    </div>
+                </div>
+            </div>
+            <div class="col-md-4 mb-4">
+                <div class="card h-100 border-0 shadow-sm">
+                    <div class="card-body text-center">
+                        <div class="feature-icon">
+                            <i class="fas fa-chart-bar"></i>
+                        </div>
+                        <h4>Skill Analysis</h4>
+                        <p>Identifies and ranks top skills from each resume</p>
+                    </div>
+                </div>
+            </div>
+        </div>
+    </div>
+
+    <footer class="bg-light py-4 mt-5">
+        <div class="container text-center">
+            <p class="mb-0">© 2023 ResumeRanker Pro 🌟 - AI Resume Classification System</p>
+        </div>
+    </footer>
+
+    <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script>
+    <script>
+        document.getElementById('analyzeBtn').addEventListener('click', function() {
+            // Simulate analysis
+            document.getElementById('resultsSection').style.display = 'block';
+            document.getElementById('categoryResult').innerHTML = `
+                <div class="alert alert-success">
+                    <h4 class="alert-heading">AI/ML Engineer</h4>
+                    <p>This resume shows strong machine learning and data science skills.</p>
+                </div>
+            `;
+            document.getElementById('skillsResult').innerHTML = `
+                <div class="d-flex flex-wrap gap-2">
+                    <span class="badge bg-primary p-2">Python</span>
+                    <span class="badge bg-primary p-2">TensorFlow</span>
+                    <span class="badge bg-primary p-2">Machine Learning</span>
+                </div>
+            `;
+
+            // Scroll to results
+            document.getElementById('resultsSection').scrollIntoView({ behavior: 'smooth' });
+        });
+    </script>
+</body>
+</html>