Spaces: husseinelsaadi/Codingo

husseinelsaadi committed on Jul 21, 2025

Commit

028190e

1 Parent(s): 277abd1

read me updated

Files changed (1)

readme.md +9 -275

@@ -1,275 +1,9 @@
-# Codingo - AI Powered Smart Recruitment System
-This repository contains the implementation of Codingo, an AI-powered online recruitment platform designed to automate and enhance the hiring process through a virtual HR assistant named LUNA.
-## Project Overview
-Codingo addresses the challenges of traditional recruitment processes by offering:
-- Automated CV screening and skill-based shortlisting
-- AI-led interviews through the virtual assistant LUNA
-- Real-time cheating detection during assessments
-- Gamified practice tools for candidates
-- Secure administration interface for hiring managers
-## Getting Started
-This guide outlines the development process, starting with local model training before moving to AWS deployment.
-### Prerequisites
-- Python 3.8+
-- pip (Python package manager)
-- Git
-### Development Process
-We'll implement the project in phases:
-#### Phase 1: Local Training and Feature Extraction (Current Phase)
-This initial phase focuses on building and training the model locally before AWS deployment.
-### Project Structure
-```
-Codingo/
-├── backend/                     # Flask API backend
-│   ├── app.py                   # Flask server
-│   ├── predict.py               # Predict using trained model
-│   ├── train_model.py           # Model training script
-│   ├── model/                   # Trained model artifacts
-│   │   └── cv_classifier.pkl
-│   ├── utils/
-│   │   ├── text_extractor.py    # PDF/DOCX to text
-│   │   └── preprocessor.py      # Cleaning, tokenizing
-│
-├── data/
-│   ├── training.csv             # Your training dataset
-│   └── raw_cvs/                 # CV files (PDF/DOCX/txt)
-│
-├── notebooks/
-│   └── eda.ipynb                # Data exploration & feature work
-│
-├── requirements.txt             # Python dependencies
-└── README.md                    # Project overview
-```
-## Step-by-Step Implementation Guide
-### Step 1: Create Training Dataset
-Start by manually collecting ~50-100 CV-like text samples with position labels.
-**File:** `data/training.csv`
-Example format:
-```
-text,position
-"Experienced in Python, Flask, AWS",Backend Developer
-"Built dashboards with React and TypeScript",Frontend Developer
-"ML projects using pandas, scikit-learn",Data Scientist
-```
-### Step 2: Train Model
-Implement a classifier using scikit-learn to predict job roles from CV text.
-**File:** `backend/train_model.py`
-```python
-import pandas as pd
-from sklearn.feature_extraction.text import TfidfVectorizer
-from sklearn.pipeline import Pipeline
-from sklearn.linear_model import LogisticRegression
-import joblib
-# Load training data
-df = pd.read_csv('data/training.csv')
-# Define model pipeline
-model = Pipeline([
-    ('tfidf', TfidfVectorizer(max_features=5000, ngram_range=(1, 2))),
-    ('classifier', LogisticRegression(max_iter=1000))
-])
-# Train model
-model.fit(df['text'], df['position'])
-# Save model
-joblib.dump(model, 'backend/models/cv_classifier.pkl')
-print("Model trained and saved successfully!")
-```
-### Step 3: Test Prediction Locally
-Create a script to verify your model works correctly.
-**File:** `backend/predict.py`
-```python
-import joblib
-import sys
-def predict_role(cv_text):
-    # Load the trained model
-    model = joblib.load('backend/models/cv_classifier.pkl')
-    # Make prediction
-    prediction = model.predict([cv_text])[0]
-    confidence = max(model.predict_proba([cv_text])[0]) * 100
-    return {
-        'predicted_position': prediction,
-        'confidence': f"{confidence:.2f}%"
-    }
-if __name__ == "__main__":
-    if len(sys.argv) > 1:
-        # Get CV text from command line argument
-        cv_text = sys.argv[1]
-    else:
-        # Example CV text
-        cv_text = "Experienced Python developer with 5 years of experience in Flask and AWS."
-    result = predict_role(cv_text)
-    print(f"Predicted Position: {result['predicted_position']}")
-    print(f"Confidence: {result['confidence']}")
-```
-### Step 4: Add Text Extraction Utility
-Create utilities to extract text from PDF and DOCX files.
-**File:** `backend/utils/text_extractor.py`
-```python
-import fitz  # PyMuPDF
-import docx
-import os
-def extract_text_from_pdf(path):
-    """Extract text from PDF file."""
-    doc = fitz.open(path)
-    text = ""
-    for page in doc:
-        text += page.get_text()
-    return text.strip()
-def extract_text_from_docx(path):
-    """Extract text from DOCX file."""
-    doc = docx.Document(path)
-    text = "\n".join([paragraph.text for paragraph in doc.paragraphs])
-    return text.strip()
-def extract_text(file_path):
-    """Extract text from either PDF or DOCX."""
-    extension = os.path.splitext(file_path)[1].lower()
-    if extension == '.pdf':
-        return extract_text_from_pdf(file_path)
-    elif extension in ['.docx', '.doc']:
-        return extract_text_from_docx(file_path)
-    elif extension == '.txt':
-        with open(file_path, 'r', encoding='utf-8') as f:
-            return f.read().strip()
-    else:
-        raise ValueError(f"Unsupported file extension: {extension}")
-```
-### Step 5: Add Flask API (Simple)
-Create a basic Flask API to accept CV uploads and return predictions.
-**File:** `backend/app.py`
-```python
-from flask import Flask, request, jsonify
-from utils.text_extractor import extract_text
-import joblib
-import os
-app = Flask(__name__)
-model = joblib.load("model/cv_classifier.pkl")
-# Ensure directories exist
-os.makedirs("data/raw_cvs", exist_ok=True)
-os.makedirs("model", exist_ok=True)
-@app.route("/predict", methods=["POST"])
-def predict():
-    if 'file' not in request.files:
-        return jsonify({"error": "No file provided"}), 400
-    file = request.files["file"]
-    file_path = f"data/raw_cvs/{file.filename}"
-    file.save(file_path)
-    try:
-        text = extract_text(file_path)
-        prediction = model.predict([text])[0]
-        confidence = max(model.predict_proba([text])[0]) * 100
-        return jsonify({
-            "predicted_position": prediction,
-            "confidence": f"{confidence:.2f}%"
-        })
-    except Exception as e:
-        return jsonify({"error": str(e)}), 500
-if __name__ == "__main__":
-    app.run(debug=True)
-```
-### Step 6: Install Dependencies
-**File:** `requirements.txt`
-```
-flask
-scikit-learn
-pandas
-joblib
-PyMuPDF
-python-docx
-```
-Run: `pip install -r requirements.txt`
-## Next Steps
-After completing Phase 1, we'll move to:
-1. **Phase 2: Enhanced Model & NLP Features**
-   - Implement BERT or DistilBERT for improved semantic understanding
-   - Add skill extraction from CVs
-   - Develop job-CV matching scoring
-2. **Phase 3: Web Interface & Chatbot**
-   - Develop user interface for admin and candidates
-   - Implement LUNA virtual assistant using LangChain
-   - Add interview scheduling functionality
-3. **Phase 4: Video Interview & Proctoring**
-   - Add video interview capabilities
-   - Implement cheating detection using computer vision
-   - Develop automated scoring system
-4. **Phase 5: AWS Deployment**
-   - Set up AWS infrastructure using Terraform
-   - Deploy application to EC2/Lambda
-   - Configure S3 for file storage
-## Authors
-- Hussein El Saadi
-- Nour Ali Shaito
-## Supervisor
-- Dr. Ali Ezzedine
-## License
-This project is licensed under the MIT License - see the LICENSE file for details.

+---
+title: Codingo
+emoji: 🤖
+colorFrom: indigo
+colorTo: pink
+sdk: docker
+app_file: app.py
+pinned: false
+---
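
Review note on the removed Step 5 example (`backend/app.py`): it saves each upload to `data/raw_cvs/{file.filename}`, so the client-supplied name decides where the file lands. If that example is reused elsewhere, the following is a minimal sketch of one way to constrain the path using only the standard library. The helper name `safe_upload_path` and the `ALLOWED_EXTENSIONS` set are assumptions for illustration and do not appear in the repository.

```python
import os

# Hypothetical allow-list, mirroring the extensions handled in the removed
# text_extractor.py example (.doc is left out: python-docx reads .docx only).
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt"}


def safe_upload_path(filename, upload_dir="data/raw_cvs"):
    """Return a path inside upload_dir for a client-supplied filename.

    Raises ValueError if the name is empty or has an unsupported extension.
    """
    # Normalise Windows separators, then keep only the final path component,
    # so names such as "../../app.py" cannot escape upload_dir.
    name = os.path.basename(filename.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError("Empty or invalid filename")
    extension = os.path.splitext(name)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported file extension: {extension}")
    return os.path.join(upload_dir, name)
```

In the removed `predict()` handler this would stand in for the f-string, for example `file_path = safe_upload_path(file.filename)`, with the `ValueError` mapped to a 400 response. Werkzeug's `secure_filename` covers similar ground and may be the simpler choice inside a Flask app.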