# Project Summary - Vietnamese Product Rating Prediction System
## ✅ What Has Been Built
### 🏗️ Complete Project Structure
```
PredictRating/
├── main.py                          # FastAPI application entry
├── requirements.txt                 # All dependencies
├── README.md                        # Full documentation
├── QUICKSTART.md                    # Quick setup guide
├── sample_comments.csv              # Test data
├── .gitignore                       # Git ignore rules
│
└── app/
    ├── config.py                    # Configuration settings
    ├── database.py                  # Database connection
    ├── models.py                    # SQLAlchemy models (User, PredictionHistory)
    ├── schemas.py                   # Pydantic validation schemas
    │
    ├── routers/                     # API endpoints
    │   ├── auth.py                  # Login/Register endpoints
    │   ├── prediction.py            # Single/Batch prediction
    │   └── dashboard.py             # Frontend routes
    │
    ├── services/                    # Business logic
    │   ├── auth_service.py          # JWT authentication & password hashing
    │   ├── ml_service.py            # ML prediction (DUMMY - replace with your model)
    │   └── visualization_service.py # WordCloud & chart data
    │
    ├── templates/                   # Jinja2 HTML templates
    │   ├── base.html                # Base layout with TailwindCSS
    │   ├── login.html               # Login page
    │   ├── register.html            # Registration page
    │   └── dashboard.html           # Main prediction interface
    │
    ├── static/                      # Static files
    │   ├── css/
    │   ├── js/
    │   ├── uploads/
    │   └── wordclouds/              # Generated word cloud images
    │
    └── database/                    # SQLite database location
```
---
## 🎯 Features Implemented
### 1. Authentication System ✅
- **User Registration** with email validation
- **JWT-based Login** (secure token authentication)
- **Password Hashing** using bcrypt
- **Protected Routes** requiring authentication
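
To make the JWT login flow concrete, here is a minimal sketch of what issuing and checking an HS256 token involves. The project itself relies on `python-jose` for this; the code below deliberately uses only the Python standard library to show the token's structure, and the names `create_token` and `verify_token`, the `sub`/`exp` payload fields, and the 30-minute lifetime are assumptions for illustration, not the project's actual code.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"),
                    hashlib.sha256).digest()


def create_token(username: str, secret: str, ttl_seconds: int = 1800) -> str:
    """Build header.payload.signature for the given user (hypothetical helper)."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": username, "exp": int(time.time()) + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: str) -> Optional[str]:
    """Return the username if the signature is valid and the token has not expired."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = _sign(f"{header_b64}.{payload_b64}", secret)
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:  # wrong segment count, bad base64, or bad JSON
        return None
    if payload.get("exp", 0) < time.time():
        return None
    return payload.get("sub")
```

A protected route would call something like `verify_token` on the value of the `Authorization: Bearer ...` header and reject the request when it returns `None`. In the real application, prefer the library's own encode/decode functions over hand-rolled signing.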
### 2. Single Comment Prediction ✅
- Select target product
- Input Vietnamese comment
- Get predicted rating (1-5 stars)
- Display confidence score
- Save to prediction history
### 3. Batch CSV Prediction ✅
- Upload CSV file with comments
- Bulk prediction processing
- **Visualizations:**
  - Bar chart showing rating distribution
  - Word cloud of frequent words
  - Results table with all predictions
- **Export:** Download CSV with predicted ratings
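
The batch flow above (read CSV, predict each row, export a CSV with ratings) can be sketched with the standard library alone. This is an illustration under assumptions: the column name `comment`, the output column `predicted_rating`, and the function names are hypothetical, and `predict` stands in for whatever callable the ML service exposes.

```python
import csv
import io
from typing import Callable, Dict, List


def predict_batch(csv_text: str,
                  predict: Callable[[str], int],
                  column: str = "comment") -> List[Dict[str, str]]:
    """Run `predict` on every non-empty comment in the uploaded CSV text."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        comment = (row.get(column) or "").strip()
        if not comment:
            continue  # skip rows with a missing or blank comment
        rows.append({column: comment, "predicted_rating": str(predict(comment))})
    return rows


def to_csv(rows: List[Dict[str, str]], column: str = "comment") -> str:
    """Serialize the prediction rows back to CSV text for download."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[column, "predicted_rating"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Working on text rather than file paths keeps the sketch independent of how the upload is stored; a FastAPI handler could decode the uploaded bytes as UTF-8 and pass the result straight in.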
### 4. Data Visualization ✅
- **Chart.js** for interactive bar charts
- **WordCloud** library for generating word cloud images
- Responsive charts that update dynamically
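
Both visualizations reduce to simple counting before any rendering happens. The sketch below shows one way to prepare the data; the `labels`/`data` payload shape, the tiny stopword set, and the function names are assumptions rather than the contents of `visualization_service.py`.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Set

# Hypothetical starter set; a real list of Vietnamese stopwords would be much longer.
STOPWORDS: Set[str] = {"và", "là", "của", "thì"}


def rating_distribution(ratings: Iterable[int]) -> Dict[str, List[int]]:
    """Count predictions per star so a bar chart always gets all five bars."""
    counts = Counter(ratings)
    stars = [1, 2, 3, 4, 5]
    return {"labels": stars, "data": [counts.get(star, 0) for star in stars]}


def word_frequencies(comments: Iterable[str],
                     stopwords: Set[str] = STOPWORDS,
                     top_n: int = 50) -> Dict[str, int]:
    """Count lowercase tokens across all comments, ignoring stopwords."""
    words: List[str] = []
    for comment in comments:
        tokens = re.findall(r"\w+", comment.lower())
        words.extend(token for token in tokens if token not in stopwords)
    return dict(Counter(words).most_common(top_n))
```

A frequency dictionary of this shape is what `WordCloud.generate_from_frequencies` accepts. Note that splitting on word characters yields Vietnamese syllables rather than multi-syllable words, so a proper word segmenter may give a more meaningful cloud.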
### 5. API Documentation ✅
- **Swagger UI** at `/docs` (automatic generation)
- **ReDoc** at `/redoc` (alternative documentation)
- Interactive API testing interface
- Complete request/response schemas
### 6. Database Integration ✅
- **SQLite** database
- **User table** (username, email, hashed password)
- **PredictionHistory table** (tracks all predictions)
- Automatic table creation on startup
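
For readers who want to see the two tables spelled out, here is a rough sketch using the standard-library `sqlite3` module. The project defines these as SQLAlchemy models in `app/models.py`; the table names and every column beyond username, email, and hashed password are assumptions made for illustration.

```python
import sqlite3
from typing import List, Tuple

# Assumed schema: only username, email and hashed password come from the summary.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    username TEXT NOT NULL UNIQUE,
    email TEXT NOT NULL UNIQUE,
    hashed_password TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS prediction_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL REFERENCES users(id),
    comment TEXT NOT NULL,
    predicted_rating INTEGER NOT NULL CHECK (predicted_rating BETWEEN 1 AND 5),
    confidence REAL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create both tables if they do not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    return conn


def save_prediction(conn: sqlite3.Connection, user_id: int, comment: str,
                    rating: int, confidence: float) -> int:
    """Insert one history row and return its id."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO prediction_history (user_id, comment, predicted_rating, confidence) "
            "VALUES (?, ?, ?, ?)",
            (user_id, comment, rating, confidence),
        )
    return cursor.lastrowid


def history_for_user(conn: sqlite3.Connection, user_id: int) -> List[Tuple]:
    """Return (comment, rating, confidence) rows for one user, oldest first."""
    return conn.execute(
        "SELECT comment, predicted_rating, confidence FROM prediction_history "
        "WHERE user_id = ? ORDER BY id",
        (user_id,),
    ).fetchall()
```

Calling `init_db` at startup mirrors the "automatic table creation" behaviour, since `CREATE TABLE IF NOT EXISTS` is safe to run repeatedly.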
### 7. Frontend UI ✅
- **TailwindCSS** for modern, responsive design
- **Jinja2** server-side rendering
- Tab-based interface (Single/Batch)
- Real-time form validation
- Loading states and error handling
---
## How to Run
### Step 1: Install Dependencies
```bash
pip install -r requirements.txt
```
### Step 2: Start Server
```bash
python main.py
```
### Step 3: Access Application
- **Dashboard:** http://localhost:8000/dashboard
- **Swagger API Docs:** http://localhost:8000/docs
---
## API Endpoints
### Authentication
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/auth/register` | Register new user |
| POST | `/api/auth/login` | Login (returns JWT token) |
| GET | `/api/auth/me` | Get current user info |
### Predictions
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/predict/single` | Predict single comment |
| POST | `/api/predict/batch` | Predict batch from CSV |
| GET | `/api/predict/history` | Get prediction history |
### Frontend
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/login` | Login page |
| GET | `/register` | Registration page |
| GET | `/dashboard` | Main dashboard |
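
As a rough illustration of calling a protected endpoint from a script, the sketch below builds an authenticated request for `/api/predict/single`. The JSON field names `comment` and `product` are guesses, as is the helper name; check the request schema shown at `/docs` before relying on them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_single_prediction_request(token: str, comment: str,
                                    product: str) -> urllib.request.Request:
    """Prepare a POST with a JSON body and the JWT as a Bearer token (hypothetical helper)."""
    body = json.dumps({"comment": comment, "product": product}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/predict/single",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
```

With the server running, `urllib.request.urlopen(request)` would send it; the token is the one returned by `/api/auth/login`.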
---
## 🔧 Replace Dummy ML Model
The file `app/services/ml_service.py` contains a **DUMMY prediction function** that returns random ratings.
### To integrate your real model:
1. **Load your model in `__init__`:**
```python
def __init__(self):
    # load_model / load_tokenizer are placeholders: use your framework's own loaders
    self.model = load_model('path/to/your/model.h5')
    self.tokenizer = load_tokenizer('path/to/tokenizer.pkl')
```
2. **Update `predict_single` method:**
```python
from typing import Any, Dict  # place at the top of ml_service.py

def predict_single(self, text: str) -> Dict[str, Any]:
    # Preprocess Vietnamese text
    preprocessed = self.preprocess(text)
    # Tokenize
    tokens = self.tokenizer.encode(preprocessed)
    # Predict class probabilities for a batch of one
    prediction = self.model.predict([tokens])
    rating = int(prediction.argmax()) + 1  # class index 0-4 -> 1-5 scale
    confidence = float(prediction.max())
    return {
        'rating': rating,
        'confidence': confidence
    }
```
3. **Implement preprocessing:**
```python
def preprocess(self, text: str) -> str:
    # Your Vietnamese text preprocessing.
    # remove_special_characters and normalize_vietnamese are helpers you provide.
    text = text.lower()
    text = remove_special_characters(text)
    text = normalize_vietnamese(text)
    return text
```
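
The preprocessing helpers and the probability-to-rating step can be prototyped without any ML framework. What follows is one possible sketch, not a recommended pipeline: the helper bodies are assumptions, and `rating_from_probabilities` is a hypothetical name for the argmax step shown in `predict_single`. Unicode normalization runs first here because decomposed Vietnamese diacritics would otherwise be stripped as "special characters".

```python
import re
import unicodedata
from typing import Dict, Sequence, Union


def normalize_vietnamese(text: str) -> str:
    """Compose base letters and diacritics into single code points (NFC)."""
    return unicodedata.normalize("NFC", text)


def remove_special_characters(text: str) -> str:
    """Replace punctuation and symbols with spaces, then collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def preprocess(text: str) -> str:
    """Lowercase, normalize, then strip punctuation."""
    return remove_special_characters(normalize_vietnamese(text.lower()))


def rating_from_probabilities(probs: Sequence[float]) -> Dict[str, Union[int, float]]:
    """Map five class probabilities to a 1-5 rating plus its confidence."""
    best = max(range(len(probs)), key=lambda index: probs[index])
    return {"rating": best + 1, "confidence": float(probs[best])}
```

Inside the service class these would become methods or module-level helpers; the `+ 1` assumes the model's class index 0 corresponds to one star.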
---
## Demo for Teacher
### Show Swagger UI (Bonus Points!)
1. Open http://localhost:8000/docs
2. Demonstrate:
   - All API endpoints organized by tags
   - Request/response schemas
   - "Try it out" functionality
   - Authentication with JWT Bearer token
### User Flow Demo
1. **Register** a new account
2. **Login** and show JWT token storage
3. **Single Prediction:**
   - Select product
   - Enter Vietnamese comment
   - Show predicted rating + confidence
4. **Batch Prediction:**
   - Upload `sample_comments.csv`
   - Show bar chart of rating distribution
   - Show word cloud visualization
   - Download CSV with predictions
### Technical Highlights
- ✅ FastAPI automatic Swagger generation
- ✅ JWT authentication security
- ✅ RESTful API design
- ✅ Separation of concerns (routers, services, models)
- ✅ Database relationships (User → PredictionHistory)
- ✅ Responsive frontend with TailwindCSS
- ✅ Data visualization with Chart.js + WordCloud
---
## 📦 Dependencies Installed
```
fastapi # Web framework
uvicorn # ASGI server
sqlalchemy # ORM for database
python-jose # JWT tokens
passlib # Password hashing
pydantic # Data validation
jinja2 # Template engine
wordcloud # Word cloud generation
matplotlib # Image rendering
python-multipart # File uploads
```
---
## 🎯 What You Need to Do Next
1. **Test the application:**
   - Register an account
   - Try single prediction
   - Upload the `sample_comments.csv` file
   - Test batch prediction
2. **Replace the dummy ML model:**
   - Edit `app/services/ml_service.py`
   - Load your fine-tuned model
   - Implement proper preprocessing
   - Update prediction logic
3. **Customize (optional):**
   - Add more products in `app/config.py`
   - Adjust styling in templates
   - Add more Vietnamese stopwords in visualization service
4. **Prepare for demo:**
   - Practice showing Swagger UI
   - Prepare sample comments in Vietnamese
   - Explain the architecture and tech stack
---
## Quick Reference
| What | Where |
|------|-------|
| Start server | `python main.py` |
| Swagger UI | http://localhost:8000/docs |
| Dashboard | http://localhost:8000/dashboard |
| Replace model | `app/services/ml_service.py` |
| Add products | `app/config.py` → PRODUCTS list |
| Database file | `app/database/rating_prediction.db` |
| Uploads folder | `app/static/uploads/` |
| Test CSV | `sample_comments.csv` |
---
## ✨ Success Criteria Met
- ✅ FastAPI backend with Swagger UI
- ✅ Jinja2 templates + TailwindCSS
- ✅ SQLite database (Users + History)
- ✅ JWT authentication
- ✅ Single comment prediction
- ✅ Batch CSV prediction
- ✅ Data visualization (charts + word cloud)
- ✅ CSV export with predictions
- ✅ Professional project structure
- ✅ Complete documentation

**Your ML prediction web app is ready!**

Good luck with your presentation!