# 📋 Project Summary - Vietnamese Product Rating Prediction System
## ✅ What Has Been Built
### 🏗️ Complete Project Structure
```
PredictRating/
├── main.py                      # FastAPI application entry
├── requirements.txt             # All dependencies
├── README.md                    # Full documentation
├── QUICKSTART.md                # Quick setup guide
├── sample_comments.csv          # Test data
├── .gitignore                   # Git ignore rules
│
└── app/
    ├── config.py                # Configuration settings
    ├── database.py              # Database connection
    ├── models.py                # SQLAlchemy models (User, PredictionHistory)
    ├── schemas.py               # Pydantic validation schemas
    │
    ├── routers/                 # API endpoints
    │   ├── auth.py              # Login/Register endpoints
    │   ├── prediction.py        # Single/Batch prediction
    │   └── dashboard.py         # Frontend routes
    │
    ├── services/                # Business logic
    │   ├── auth_service.py      # JWT authentication & password hashing
    │   ├── ml_service.py        # ML prediction (DUMMY - replace with your model)
    │   └── visualization_service.py  # WordCloud & chart data
    │
    ├── templates/               # Jinja2 HTML templates
    │   ├── base.html            # Base layout with TailwindCSS
    │   ├── login.html           # Login page
    │   ├── register.html        # Registration page
    │   └── dashboard.html       # Main prediction interface
    │
    ├── static/                  # Static files
    │   ├── css/
    │   ├── js/
    │   └── uploads/
    │       └── wordclouds/      # Generated word cloud images
    │
    └── database/                # SQLite database location
```
---
## 🎯 Features Implemented
### 1. Authentication System βœ…
- **User Registration** with email validation
- **JWT-based Login** (secure token authentication)
- **Password Hashing** using bcrypt
- **Protected Routes** requiring authentication
### 2. Single Comment Prediction ✅
- Select target product
- Input Vietnamese comment
- Get predicted rating (1-5 stars)
- Display confidence score
- Save to prediction history
### 3. Batch CSV Prediction ✅
- Upload CSV file with comments
- Bulk prediction processing
- **Visualizations:**
  - Bar chart showing rating distribution
  - Word cloud of frequent words
  - Results table with all predictions
- **Export:** Download CSV with predicted ratings
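
The batch flow above can be sketched with the standard library alone. The `comment` column name, the `predicted_rating` output column, and the toy `predict_rating` function are assumptions; in the real application the prediction would come from `app/services/ml_service.py`.

```python
import csv
import io
from collections import Counter


def predict_rating(comment: str) -> int:
    """Hypothetical stand-in for the model: longer comments score higher."""
    return min(5, 1 + len(comment.split()) // 2)


def predict_batch(csv_text: str, column: str = "comment"):
    """Predict a rating per row; return (rows, distribution, output CSV text)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if column not in (reader.fieldnames or []):
        raise ValueError(f"CSV must contain a '{column}' column")

    rows = []
    for row in reader:
        row["predicted_rating"] = predict_rating(row[column] or "")
        rows.append(row)

    # Count how many comments landed on each star value (1-5) for the bar chart.
    counts = Counter(row["predicted_rating"] for row in rows)
    distribution = {stars: counts.get(stars, 0) for stars in range(1, 6)}

    # Serialize the original columns plus the prediction for download.
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=[*reader.fieldnames, "predicted_rating"])
    writer.writeheader()
    writer.writerows(rows)
    return rows, distribution, out.getvalue()
```

The returned `distribution` dict maps directly onto the bar chart, and the CSV text is what the export button would offer for download.
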
### 4. Data Visualization ✅
- **Chart.js** for interactive bar charts
- **WordCloud** library for generating word cloud images
- Responsive charts that update dynamically
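
As a rough illustration of the data behind these visuals, the sketch below computes word frequencies (the input a word cloud needs) and a Chart.js-style bar chart payload. The stopword set, the function names, and the payload layout are assumptions, not the contents of `visualization_service.py`.

```python
import re
import unicodedata
from collections import Counter

# Assumption: a tiny illustrative stopword list; a real one would be much longer.
VIETNAMESE_STOPWORDS = {"và", "là", "của", "rất", "thì", "có", "cho"}


def word_frequencies(comments, stopwords=VIETNAMESE_STOPWORDS, top_n=50):
    """Count lowercase word occurrences across comments, skipping stopwords."""
    counts = Counter()
    for comment in comments:
        # NFC keeps each accented Vietnamese letter as one code point,
        # so \w+ does not split words at combining marks.
        text = unicodedata.normalize("NFC", comment.lower())
        for word in re.findall(r"\w+", text):
            if word not in stopwords:
                counts[word] += 1
    return dict(counts.most_common(top_n))


def rating_chart_data(ratings):
    """Build a Chart.js-style bar chart payload for ratings 1-5."""
    counts = Counter(ratings)
    return {
        "labels": [str(stars) for stars in range(1, 6)],
        "datasets": [
            {
                "label": "Number of comments",
                "data": [counts.get(stars, 0) for stars in range(1, 6)],
            }
        ],
    }
```

A frequency dict of this shape can be passed to `WordCloud.generate_from_frequencies` from the `wordcloud` package, and the chart payload can be serialized to JSON for the frontend.
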
### 5. API Documentation ✅
- **Swagger UI** at `/docs` (automatic generation)
- **ReDoc** at `/redoc` (alternative documentation)
- Interactive API testing interface
- Complete request/response schemas
### 6. Database Integration ✅
- **SQLite** database
- **User table** (username, email, hashed password)
- **PredictionHistory table** (tracks all predictions)
- Automatic table creation on startup
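
For orientation, the two tables could look roughly like this when written directly against the standard-library `sqlite3` module. The project defines them as SQLAlchemy models in `app/models.py`; the table names and every column beyond username, email, and hashed password are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    email TEXT NOT NULL UNIQUE,
    hashed_password TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS prediction_history (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    comment TEXT NOT NULL,
    predicted_rating INTEGER NOT NULL CHECK (predicted_rating BETWEEN 1 AND 5),
    confidence REAL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create both tables if they do not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def save_prediction(conn, user_id, comment, rating, confidence):
    """Insert one prediction row and return its id."""
    cur = conn.execute(
        "INSERT INTO prediction_history (user_id, comment, predicted_rating, confidence)"
        " VALUES (?, ?, ?, ?)",
        (user_id, comment, rating, confidence),
    )
    conn.commit()
    return cur.lastrowid
```

Running `CREATE TABLE IF NOT EXISTS` at startup mirrors the "automatic table creation" behavior: the first launch creates the tables and later launches leave them untouched.
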
### 7. Frontend UI ✅
- **TailwindCSS** for modern, responsive design
- **Jinja2** server-side rendering
- Tab-based interface (Single/Batch)
- Real-time form validation
- Loading states and error handling
---
## 🚀 How to Run
### Step 1: Install Dependencies
```bash
pip install -r requirements.txt
```
### Step 2: Start Server
```bash
python main.py
```
### Step 3: Access Application
- **Dashboard:** http://localhost:8000/dashboard
- **Swagger API Docs:** http://localhost:8000/docs ⭐
---
## 📊 API Endpoints
### Authentication
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/auth/register` | Register new user |
| POST | `/api/auth/login` | Login (returns JWT token) |
| GET | `/api/auth/me` | Get current user info |
### Predictions
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/predict/single` | Predict single comment |
| POST | `/api/predict/batch` | Predict batch from CSV |
| GET | `/api/predict/history` | Get prediction history |
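
A client call to the single-prediction endpoint might be assembled as follows. The JSON field names (`product`, `comment`) are assumptions, since the request schema lives in `app/schemas.py` and is not reproduced here; check `/docs` for the actual shape. The request is only built, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_single_prediction_request(token: str, product: str, comment: str):
    """Build (but do not send) a POST to /api/predict/single with a Bearer token."""
    # Assumed payload shape; adjust the keys to match the real schema.
    body = json.dumps({"product": product, "comment": comment}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/predict/single",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_single_prediction_request("<token from /api/auth/login>", "Phone", "ok")
    # With the server running, urllib.request.urlopen(req) would send it.
    print(req.get_method(), req.full_url)
```

The token placeholder stands for the JWT returned by `/api/auth/login`; the same `Authorization` header applies to the batch and history endpoints if they are protected.
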
### Frontend
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/login` | Login page |
| GET | `/register` | Registration page |
| GET | `/dashboard` | Main dashboard |
---
## 🔧 Replace Dummy ML Model
The file `app/services/ml_service.py` contains a **DUMMY prediction function** that returns random ratings.
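
For reference, a placeholder of this kind typically looks something like the sketch below. The class name `MLService`, the `seed` parameter, and the confidence range are assumptions; only the `rating` and `confidence` keys are taken from the `predict_single` snippet in this section, and the actual file may differ.

```python
import random
from typing import Any, Dict


class MLService:
    """Placeholder service: returns a random rating instead of a model output."""

    def __init__(self, seed=None):
        # A dedicated RNG keeps results reproducible when a seed is given.
        self._rng = random.Random(seed)

    def predict_single(self, text: str) -> Dict[str, Any]:
        # The input text is ignored on purpose; this is not a real model.
        return {
            "rating": self._rng.randint(1, 5),  # 1-5 stars
            "confidence": round(self._rng.uniform(0.5, 1.0), 2),  # fake score
        }
```

Keeping the same return shape when the real model is swapped in means the routers and templates should not need to change.
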
### To integrate your real model:
1. **Load your model in `__init__`:**
```python
def __init__(self):
    # load_model / load_tokenizer are placeholders for your framework's loaders
    self.model = load_model('path/to/your/model.h5')
    self.tokenizer = load_tokenizer('path/to/tokenizer.pkl')
```
2. **Update `predict_single` method:**
```python
from typing import Any, Dict  # place at the top of ml_service.py

def predict_single(self, text: str) -> Dict[str, Any]:
    # Preprocess Vietnamese text
    preprocessed = self.preprocess(text)
    # Tokenize
    tokens = self.tokenizer.encode(preprocessed)
    # Predict; assumes the model returns an array of 5 class probabilities
    prediction = self.model.predict([tokens])
    rating = int(prediction.argmax()) + 1  # class index 0-4 -> 1-5 scale
    confidence = float(prediction.max())
    return {
        'rating': rating,
        'confidence': confidence
    }
```
3. **Implement preprocessing:**
```python
def preprocess(self, text: str) -> str:
    # Your Vietnamese text preprocessing
    # (remove_special_characters and normalize_vietnamese are helpers you supply)
    text = text.lower()
    text = remove_special_characters(text)
    text = normalize_vietnamese(text)
    return text
```
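
The helpers `remove_special_characters` and `normalize_vietnamese` are not defined in the snippet above. One possible standard-library implementation is sketched here as an assumption, not as the project's actual preprocessing. Note that it applies Unicode normalization before stripping punctuation, so that decomposed diacritics are composed into letters and not removed as symbols.

```python
import re
import unicodedata


def normalize_vietnamese(text: str) -> str:
    """Compose diacritics into single code points (NFC) so equal words compare equal."""
    return unicodedata.normalize("NFC", text)


def remove_special_characters(text: str) -> str:
    """Replace punctuation and symbols with spaces, then collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def preprocess(text: str) -> str:
    """Lowercase, normalize to NFC, then strip punctuation and extra spaces."""
    text = normalize_vietnamese(text.lower())
    return remove_special_characters(text)
```

Whatever preprocessing is used at inference time should match what the model saw during training; if the model was fine-tuned on raw text, a lighter version of this step may be more appropriate.
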
---
## 🎓 Demo for Teacher
### Show Swagger UI (Bonus Points!)
1. Open http://localhost:8000/docs
2. Demonstrate:
   - All API endpoints organized by tags
   - Request/response schemas
   - "Try it out" functionality
   - Authentication with JWT Bearer token
### User Flow Demo
1. **Register** a new account
2. **Login** and show JWT token storage
3. **Single Prediction:**
   - Select product
   - Enter Vietnamese comment
   - Show predicted rating + confidence
4. **Batch Prediction:**
   - Upload `sample_comments.csv`
   - Show bar chart of rating distribution
   - Show word cloud visualization
   - Download CSV with predictions
### Technical Highlights
- ✅ FastAPI automatic Swagger generation
- ✅ JWT authentication security
- ✅ RESTful API design
- ✅ Separation of concerns (routers, services, models)
- ✅ Database relationships (User ↔ PredictionHistory)
- ✅ Responsive frontend with TailwindCSS
- ✅ Data visualization with Chart.js + WordCloud
---
## 📦 Dependencies Installed
```
fastapi # Web framework
uvicorn # ASGI server
sqlalchemy # ORM for database
python-jose # JWT tokens
passlib # Password hashing
pydantic # Data validation
jinja2 # Template engine
wordcloud # Word cloud generation
matplotlib # Image rendering
python-multipart # File uploads
```
---
## 🎯 What You Need to Do Next
1. **Test the application:**
   - Register an account
   - Try single prediction
   - Upload the `sample_comments.csv` file
   - Test batch prediction
2. **Replace the dummy ML model:**
   - Edit `app/services/ml_service.py`
   - Load your fine-tuned model
   - Implement proper preprocessing
   - Update prediction logic
3. **Customize (optional):**
   - Add more products in `app/config.py`
   - Adjust styling in templates
   - Add more Vietnamese stopwords in visualization service
4. **Prepare for demo:**
   - Practice showing Swagger UI
   - Prepare sample comments in Vietnamese
   - Explain the architecture and tech stack
---
## 📞 Quick Reference
| What | Where |
|------|-------|
| Start server | `python main.py` |
| Swagger UI | http://localhost:8000/docs |
| Dashboard | http://localhost:8000/dashboard |
| Replace model | `app/services/ml_service.py` |
| Add products | `app/config.py` → `PRODUCTS` list |
| Database file | `app/database/rating_prediction.db` |
| Uploads folder | `app/static/uploads/` |
| Test CSV | `sample_comments.csv` |
---
## ✨ Success Criteria Met
βœ… FastAPI backend with Swagger UI
βœ… Jinja2 templates + TailwindCSS
βœ… SQLite database (Users + History)
βœ… JWT authentication
βœ… Single comment prediction
βœ… Batch CSV prediction
βœ… Data visualization (charts + word cloud)
βœ… CSV export with predictions
βœ… Professional project structure
βœ… Complete documentation
**Your ML prediction web app is ready! πŸŽ‰**
Good luck with your presentation! πŸŽ“