---
title: ZH Apartment Rent Predictor
emoji: 🏠
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
---

# ZH Apartment Rent Predictor

Predicts monthly rent for apartments in the Canton of Zurich. Built for AI Applications HS24.

## Data
- 804 real listings from the Canton of Zurich
- Enriched with municipal data from the BFS (Swiss Federal Statistical Office): population density, tax income, foreign resident share
- 59 municipalities covered

## Preprocessing
1. Merged listing data with BFS municipal statistics on `bfs_number`
2. Extracted binary keyword flags from descriptions (Attika, Loft, Seesicht, Luxuriös, Pool, Exklusiv)
3. Binned area into three buckets (< 50 m², 50–99 m², 100+ m²)
4. Added a `zurich_city` flag for listings in the city of Zurich
5. Log-transformed `pop_dens` and `tax_income` to reduce skew
6. Derived the ratios `room_per_m2` and `area_per_room`
7. Added `has_premium_keyword` (new feature): 1 if any of the six keywords from step 2 is present, else 0
8. Scaled all numeric features with `RobustScaler`

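Steps 1 to 7 can be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual code: the column names `description`, `area`, `rooms` and `area_cat`, the `kw_*` flag names, the bucket labels, the use of `np.log1p`, and identifying the city of Zurich by BFS number 261 are all assumptions. Only `bfs_number`, `pop_dens`, `tax_income`, `zurich_city`, `room_per_m2`, `area_per_room` and `has_premium_keyword` come from the list above.

```python
import numpy as np
import pandas as pd

KEYWORDS = ["Attika", "Loft", "Seesicht", "Luxuriös", "Pool", "Exklusiv"]
ZURICH_CITY_BFS = 261  # assumption: the city flag is derived from the BFS number


def build_features(listings: pd.DataFrame, bfs: pd.DataFrame) -> pd.DataFrame:
    # Step 1: attach municipal statistics to every listing.
    df = listings.merge(bfs, on="bfs_number", how="left")

    # Step 2: one binary flag per keyword (case-insensitive substring match).
    text = df["description"].fillna("").str.lower()
    kw_cols = []
    for kw in KEYWORDS:
        col = f"kw_{kw.lower()}"  # hypothetical column name
        df[col] = text.str.contains(kw.lower(), regex=False).astype(int)
        kw_cols.append(col)

    # Step 3: three area buckets: [0, 50), [50, 100), [100, inf).
    df["area_cat"] = pd.cut(
        df["area"],
        bins=[0, 50, 100, np.inf],
        right=False,
        labels=["small", "medium", "large"],
    )

    # Step 4: flag listings located in the city of Zurich.
    df["zurich_city"] = (df["bfs_number"] == ZURICH_CITY_BFS).astype(int)

    # Step 5: log-transform the skewed municipal variables in place
    # (log1p is an assumption; the README only says "log-transformed").
    df["pop_dens"] = np.log1p(df["pop_dens"])
    df["tax_income"] = np.log1p(df["tax_income"])

    # Step 6: ratio features.
    df["room_per_m2"] = df["rooms"] / df["area"]
    df["area_per_room"] = df["area"] / df["rooms"]

    # Step 7: 1 if at least one keyword flag is set, else 0.
    df["has_premium_keyword"] = df[kw_cols].max(axis=1)
    return df
```

Scaling (step 8) is left out here because a scaler is usually fitted on the training folds only, for example inside a scikit-learn pipeline.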
## Models & Results (5-fold CV)

| Iteration | Model | MAE | RMSE | R² |
|---|---|---|---|---|
| 1 | Ridge (α=10) | 446 | 671 | 0.607 |
| 1 | Lasso (α=10) | 448 | 678 | 0.599 |
| 2 | Random Forest | 439 | 659 | 0.618 |
| **2** | **Gradient Boosting** ✓ | **422** | **641** | **0.638** |

Final model: `GradientBoostingRegressor(n_estimators=400, learning_rate=0.04, max_depth=5, subsample=0.8)`

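A comparison like the one in the table could be produced with scikit-learn's `cross_validate`, as in the sketch below. It runs on synthetic data from `make_regression`, so its numbers will not match the table. The Random Forest settings, the shuffled `KFold`, the `random_state` values and fitting `RobustScaler` inside the pipeline are assumptions. Only the Ridge/Lasso α values and the Gradient Boosting hyperparameters are taken from this README.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import KFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler

# Synthetic stand-in for the listing features; NOT the project's data.
X, y = make_regression(n_samples=300, n_features=10, noise=20.0, random_state=0)

models = {
    "Ridge": Ridge(alpha=10),
    "Lasso": Lasso(alpha=10),
    # Random Forest hyperparameters are not given in the README; defaults used.
    "Random Forest": RandomForestRegressor(random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(
        n_estimators=400, learning_rate=0.04, max_depth=5,
        subsample=0.8, random_state=0,
    ),
}

# scikit-learn reports error metrics negated so that "higher is better".
scoring = {
    "mae": "neg_mean_absolute_error",
    "rmse": "neg_root_mean_squared_error",
    "r2": "r2",
}
cv = KFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    # The scaler is refitted on the training part of every fold.
    pipe = make_pipeline(RobustScaler(), model)
    scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring)
    results[name] = {
        "MAE": -scores["test_mae"].mean(),
        "RMSE": -scores["test_rmse"].mean(),
        "R2": scores["test_r2"].mean(),
    }

for name, m in results.items():
    print(f"{name:18s} MAE={m['MAE']:.1f}  RMSE={m['RMSE']:.1f}  R2={m['R2']:.3f}")
```

Putting the scaler inside the pipeline keeps statistics from the validation fold out of the fit, which is one common way to avoid leakage in cross-validation.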
## New Feature
`has_premium_keyword` is a binary flag that consolidates the six sparse keyword columns into one stable signal. It ranks among the ten most important features.

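One way to check such a ranking is to sort the fitted model's impurity-based `feature_importances_`. The sketch below does this on synthetic data in which a sparse binary column is deliberately given a strong effect. The helper `top_features`, the placeholder feature names `x1` … `x11` and the data itself are hypothetical, and the README does not say which importance measure was actually used.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor


def top_features(model, feature_names, k=10):
    """Return the k (name, importance) pairs with the highest importance."""
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1][:k]
    return [(feature_names[i], float(importances[i])) for i in order]


# Synthetic illustration only: one sparse binary flag plus 11 numeric columns.
rng = np.random.default_rng(0)
n = 400
feature_names = ["has_premium_keyword"] + [f"x{i}" for i in range(1, 12)]
X = rng.normal(size=(n, len(feature_names)))
X[:, 0] = rng.random(n) < 0.1  # flag is set for roughly 10% of rows
y = 500.0 * X[:, 0] + 100.0 * X[:, 1] + rng.normal(scale=50.0, size=n)

model = GradientBoostingRegressor(random_state=0).fit(X, y)
ranking = top_features(model, feature_names, k=10)

for name, importance in ranking:
    print(f"{name:22s} {importance:.3f}")
```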
## Files
- `app.py`: Flask app + HTML (all data baked in, no extra JSON files needed)
- `model.joblib`: trained model
- `Dockerfile` / `requirements.txt`: deployment config

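For orientation, a Flask app that serves a joblib model could look roughly like the sketch below. It is not the contents of `app.py`: the `/predict` route, the JSON payload shape, the `create_app` factory and the response key are hypothetical, and the real app also renders HTML. Only the file name `model.joblib` and port 7860 (from `app_port` in the front matter) come from this README.

```python
from flask import Flask, jsonify, request


def create_app(model):
    """Build a Flask app around any object that has a predict() method."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])  # hypothetical route
    def predict():
        payload = request.get_json(force=True)
        # Assumption: the client sends an already preprocessed feature vector.
        features = payload["features"]
        rent = float(model.predict([features])[0])
        return jsonify({"predicted_rent": rent})

    return app


# To serve the trained model (not executed here):
#   import joblib
#   app = create_app(joblib.load("model.joblib"))
#   app.run(host="0.0.0.0", port=7860)
```

Passing the model into a factory keeps the route testable with a stand-in object, without loading `model.joblib`.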