---
title: Graduation Project-v1.2
emoji: 🎓
colorFrom: indigo
colorTo: blue
sdk: docker
app_port: 7860
pinned: false
---
# 🤖 AI-Powered Graduation Project Recommendation System
## 📌 Overview
This project implements an AI-powered recommendation and semantic-similarity platform for graduation projects, built on:
* Natural Language Processing (NLP)
* Semantic Search
* Vector Embeddings
* Hybrid Ranking Systems
* Large Language Models (LLMs)
The system helps students:
* discover unique graduation project ideas
* avoid duplicate projects
* analyze originality
* generate intelligent project features
* receive context-aware recommendations through an AI chatbot
---
# ⚙️ System Pipeline
## 1️⃣ Data Preprocessing
* Text normalization
* Duplicate removal
* Smart content merging
* Technical keyword extraction
* Feature engineering
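
The first three steps can be sketched with the standard library alone. This is a minimal illustration, not the project's actual pipeline: the `title` and `description` field names and all function names are assumptions.

```python
import re
import unicodedata


def normalize_text(text: str) -> str:
    """Apply Unicode NFKC normalization, lowercase, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    return re.sub(r"\s+", " ", text).strip()


def remove_duplicates(records: list[dict]) -> list[dict]:
    """Keep the first record seen for each normalized title."""
    seen: set[str] = set()
    unique = []
    for record in records:
        key = normalize_text(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def merge_content(record: dict) -> str:
    """Join the non-empty text fields into one string ready for embedding."""
    parts = [record.get("title", ""), record.get("description", "")]
    return normalize_text(" ".join(p for p in parts if p))


# Hypothetical sample records.
projects = [
    {"title": "Smart  Library System", "description": "Book recommendations."},
    {"title": "smart library system", "description": "Duplicate entry."},
    {"title": "Traffic Sign Detection", "description": ""},
]
cleaned = remove_duplicates(projects)
merged = [merge_content(p) for p in cleaned]
```

Deduplicating on the normalized title rather than the raw string catches entries that differ only in case or spacing.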
## 2️⃣ Feature Extraction
* KeyBERT-based keyword extraction
* Automatic technical term detection
* Semantic feature generation
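
KeyBERT needs a downloaded embedding model, so the sketch below illustrates only the technical term detection step, using a plain vocabulary match as a stand-in. The vocabulary and the function name are hypothetical; the project's real term list is not documented here.

```python
import re

# Hypothetical vocabulary of technical terms.
TECH_TERMS = {"faiss", "fastapi", "nlp", "machine learning", "computer vision"}


def detect_technical_terms(text: str, vocabulary: set[str] = TECH_TERMS) -> list[str]:
    """Return the vocabulary terms that occur in the text as whole words or phrases."""
    lowered = text.lower()
    found = []
    for term in sorted(vocabulary):
        # \b anchors stop "nlp" from matching inside a longer token.
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.append(term)
    return found


terms = detect_technical_terms("An NLP chatbot served with FastAPI and machine learning.")
```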
## 3️⃣ Embedding Generation
* SentenceTransformer embeddings
* Normalized vector representations
* Semantic encoding of projects
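
Normalizing embeddings to unit length matters because the inner product of two unit vectors equals their cosine similarity. The sketch below shows this with two toy vectors; real embeddings would come from a SentenceTransformer model, and the helper names are assumptions.

```python
import math


def l2_normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def dot(a: list[float], b: list[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])
cosine = dot(a, b)  # for unit vectors, the inner product is the cosine similarity
```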
## 4️⃣ Semantic Retrieval
* FAISS vector indexing
* Nearest-neighbor semantic search
* Fast project similarity lookup
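
Conceptually, nearest-neighbor retrieval scores the query against every stored vector and keeps the top `k`. The brute-force sketch below is a pure-Python stand-in for that idea; FAISS's exact inner-product index (`IndexFlatIP`) performs the same comparison far more efficiently. Which FAISS index type this project uses is not stated here, and the project IDs and vectors are made up.

```python
import heapq


def search(index: dict[str, list[float]], query: list[float], k: int = 2):
    """Return the k (score, project_id) pairs with the highest inner product."""
    scored = (
        (sum(q * v for q, v in zip(query, vector)), project_id)
        for project_id, vector in index.items()
    )
    return heapq.nlargest(k, scored)


# Hypothetical unit-length embeddings keyed by project ID.
index = {
    "library-recommender": [0.6, 0.8, 0.0],
    "traffic-sign-detector": [0.0, 0.6, 0.8],
    "book-search-engine": [0.8, 0.6, 0.0],
}
results = search(index, query=[0.6, 0.8, 0.0], k=2)
top_ids = [project_id for _, project_id in results]
```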
## 5️⃣ Hybrid Ranking
The final ranking combines:
* Semantic similarity
* Feature similarity
* Coverage ratio
* Confidence estimation
* Originality analysis
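
One common way to combine such signals is a weighted sum. The sketch below assumes every signal is already scaled to [0, 1]. The weights, the `Candidate` fields, and the definition of originality as one minus the best hybrid score are illustrative assumptions, not the project's actual formula; confidence estimation is left out.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    project_id: str
    semantic: float  # embedding cosine similarity, in [0, 1]
    feature: float   # similarity between extracted features, in [0, 1]
    coverage: float  # share of query features found in the candidate, in [0, 1]


# Hypothetical weights; they sum to 1 so the hybrid score stays in [0, 1].
WEIGHTS = {"semantic": 0.6, "feature": 0.3, "coverage": 0.1}


def hybrid_score(c: Candidate) -> float:
    """Weighted sum of the three similarity signals."""
    return (
        WEIGHTS["semantic"] * c.semantic
        + WEIGHTS["feature"] * c.feature
        + WEIGHTS["coverage"] * c.coverage
    )


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates from most to least similar to the query."""
    return sorted(candidates, key=hybrid_score, reverse=True)


candidates = [
    Candidate("a", semantic=0.9, feature=0.2, coverage=0.5),
    Candidate("b", semantic=0.7, feature=0.9, coverage=1.0),
]
ranked = rank(candidates)
# Assumed definition: the closer the best match, the less original the idea.
originality = 1.0 - hybrid_score(ranked[0])
```

Here candidate `b` outranks `a` despite lower semantic similarity, because its feature overlap and coverage are much higher.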
## 6️⃣ AI Recommendation Engine
* Context-aware project generation
* Feature recommendation
* Novelty checking
* Conversational chatbot assistance
---
# 🧠 AI & NLP Technologies Used
## πŸ”Ή Machine Learning & NLP
* SentenceTransformers
* KeyBERT
* Scikit-learn
* SciPy
* FAISS
## 🔹 LLM Integration
* Google Gemini API
* Ollama
* Mistral
## 🔹 Backend & Infrastructure
* FastAPI
* Pandas
* NumPy
* Python
---
# 🏗️ Project Architecture
```text
User Query
↓
Intent Classification
↓
Context Builder
↓
Feature Extraction
↓
Embedding Generation
↓
FAISS Semantic Search
↓
Hybrid Ranking Engine
↓
Originality & Duplicate Analysis
↓
AI Recommendation Response
```
---
# 🔍 Similarity Engine Workflow
```text
Raw Dataset
↓
Preprocessing
↓
Feature Extraction
↓
Sentence Embeddings
↓
FAISS Indexing
↓
Semantic Retrieval
↓
Feature Similarity Matching
↓
Hybrid Re-ranking
↓
Final Recommendation
```
---
# 🚀 Features
## ✅ AI Chatbot
* Context-aware conversations
* Intent classification
* Domain-specific recommendations
* Memory-aware responses
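
Memory-aware responses usually rely on keeping recent turns and feeding them back into the prompt. The sketch below is one simple way to bound that history; the `ConversationMemory` class and its methods are hypothetical and not taken from this repository.

```python
from collections import deque


class ConversationMemory:
    """Keeps only the most recent turns so the prompt stays bounded in size."""

    def __init__(self, max_turns: int = 3):
        # A deque with maxlen silently drops the oldest turn when full.
        self._turns = deque(maxlen=max_turns)

    def add(self, role: str, text: str) -> None:
        self._turns.append((role, text))

    def as_context(self) -> str:
        """Render the stored turns as plain text for inclusion in a prompt."""
        return "\n".join(f"{role}: {text}" for role, text in self._turns)


memory = ConversationMemory(max_turns=2)
memory.add("user", "I want a healthcare project.")
memory.add("assistant", "Consider a symptom triage chatbot.")
memory.add("user", "Make it use computer vision.")
context = memory.as_context()
```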
## ✅ Semantic Similarity Search
* Embedding-based retrieval
* Semantic duplicate detection
* Vector search with FAISS
## ✅ Hybrid Recommendation System
* Multi-stage ranking pipeline
* Feature-level semantic comparison
* Adaptive scoring strategy
## ✅ Originality Detection
* Duplicate risk analysis
* Originality scoring
* Similarity confidence estimation
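
Duplicate risk can be expressed as a thresholded label on the highest similarity found. The thresholds below (0.85 and 0.60) are placeholder assumptions for illustration and would need tuning against real data.

```python
def duplicate_risk(max_similarity: float, high: float = 0.85, medium: float = 0.60) -> str:
    """Map the best similarity score in [0, 1] to a coarse risk label."""
    if not 0.0 <= max_similarity <= 1.0:
        raise ValueError("max_similarity must be between 0 and 1")
    if max_similarity >= high:
        return "high"
    if max_similarity >= medium:
        return "medium"
    return "low"


labels = [duplicate_risk(score) for score in (0.92, 0.70, 0.30)]
```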
## ✅ Intelligent Feature Generation
* AI-generated project features
* Novelty-aware generation
* Domain-aware recommendations
---
# 📊 Evaluation
The system includes:
* Self-retrieval evaluation
* Real-query testing
* Hybrid ranking validation
* Confidence scoring
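
Self-retrieval evaluation queries the index with each stored vector and checks that the top hit is that same item. The sketch below uses a brute-force search over made-up unit vectors; the function names are assumptions, and the project's own evaluation code may differ.

```python
def top1(index: dict[str, list[float]], query: list[float]) -> str:
    """Return the ID of the stored vector with the highest inner product."""
    return max(index, key=lambda pid: sum(q * v for q, v in zip(query, index[pid])))


def self_retrieval_accuracy(index: dict[str, list[float]]) -> float:
    """Fraction of items whose own vector retrieves them as the top hit."""
    hits = sum(1 for pid, vector in index.items() if top1(index, vector) == pid)
    return hits / len(index)


# Hypothetical embeddings; "library-copy" duplicates an earlier entry.
index = {
    "library-recommender": [0.6, 0.8, 0.0],
    "traffic-sign-detector": [0.0, 0.6, 0.8],
    "book-search-engine": [0.8, 0.6, 0.0],
    "library-copy": [0.6, 0.8, 0.0],
}
accuracy = self_retrieval_accuracy(index)
```

A score below 1.0 is a useful signal: here the exact duplicate retrieves the original entry instead of itself, which points at a record the deduplication step missed.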
### Evaluation Metrics
* Semantic Similarity Score
* Hybrid Score
* Originality Score
* Confidence Score
* Duplicate Risk Classification
---
# 📁 Project Structure
```text
GRADUATION_PROJECT/
│
├── api/                          # FastAPI backend
│
├── Data/
│   ├── raw/                      # Original dataset
│   └── processed/                # Cleaned dataset
│
├── models/                       # FAISS index & metadata
│
├── Notebooks/
│   └── TEST.ipynb                # Training & evaluation notebook
│
├── src/
│   ├── recommendation_engine/    # Chatbot & recommendation logic
│   └── similarity_model/         # Semantic search engine
│
├── requirements.txt
├── README.md
└── .gitignore
```
---
# 🧩 Recommendation Engine Modules
## recommendation_engine/
Contains:
* Chatbot engine
* Intent classification
* Prompt building
* Idea generation
* Feature generation
* Memory management
* Novelty checking
* Response formatting
---
# 🔬 Similarity Model Modules
## similarity_model/
Contains:
* Semantic search
* Embedding engine
* Hybrid ranker
* Feature similarity engine
* Preprocessing pipeline
* Evaluation framework
---
# ⚑ Installation
## 1️⃣ Clone Repository
```bash
git clone https://github.com/YOUR_USERNAME/YOUR_REPOSITORY.git
cd YOUR_REPOSITORY
```
---
## 2️⃣ Create Virtual Environment
### Windows
```bash
python -m venv .venv
.venv\Scripts\activate
```
### Linux / Mac
```bash
python3 -m venv .venv
source .venv/bin/activate
```
---
## 3️⃣ Install Dependencies
```bash
pip install -r requirements.txt
```
---
# 🔑 Environment Variables
Create a `.env` file:
```env
GEMINI_API_KEY=your_api_key_here
```
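
Python does not read `.env` files on its own, so the key must reach the process environment first (for example via a loader such as `python-dotenv`, or a Space secret). Assuming that has happened, a startup check along these lines fails early with a clear message; the function name is hypothetical.

```python
import os


def load_gemini_api_key(env=os.environ) -> str:
    """Return the Gemini API key, or raise if it is missing or still the placeholder."""
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError("GEMINI_API_KEY is not set; add it to .env or the environment.")
    return key


# Demonstration with an explicit mapping instead of the real environment.
demo_key = load_gemini_api_key({"GEMINI_API_KEY": " abc123 "})
```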
---
# ▢️ Running The Project
## Run FastAPI Server
```bash
uvicorn api.main:app --reload
```
---
## Run Notebook
```bash
jupyter notebook
```
Open:
```text
Notebooks/TEST.ipynb
```
---
# 💡 Example Query
## Input
```text
AI-based smart library recommendation platform
```
## Output
* Similar graduation projects
* Semantic similarity scores
* Originality analysis
* Duplicate risk estimation
* Recommended features
---
# 🎯 Future Improvements
* Full RAG integration
* Multi-agent orchestration
* GPU acceleration
* Advanced evaluation metrics
* Real-time deployment
* Database persistence
* Frontend dashboard
---
# 📚 Research Areas Covered
* Natural Language Processing (NLP)
* Semantic Search
* Recommendation Systems
* Vector Databases
* Conversational AI
* Information Retrieval
* Hybrid Ranking Systems
* Large Language Models (LLMs)
---
# 👨‍💻 Author
Yossef Assem
---
# 📄 License
This project is for educational and research purposes.