---
title: ZH Apartment Rent Predictor
emoji: 🏠
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
---

# ZH Apartment Rent Predictor

Predicts monthly rent for apartments in the Canton of Zurich. Built for AI Applications HS24.

## Data

- 804 real listings from the Canton of Zurich
- Enriched with BFS municipal data (population density, tax income, foreign resident share)
- 59 municipalities covered

## Preprocessing

1. Merged listing data with BFS municipal statistics on `bfs_number`
2. Extracted binary keyword flags from descriptions (Attika, Loft, Seesicht, Luxuriös, Pool, Exklusiv)
3. Area categorised into 3 buckets (< 50 m², 50–99 m², 100+ m²)
4. Added `zurich_city` flag for city of Zurich listings
5. Log-transformed `pop_dens` and `tax_income` to reduce skew
6. Derived `room_per_m2` and `area_per_room` ratios
7. Added `has_premium_keyword` (new feature): 1 if any luxury keyword is present
8. All numeric features scaled with `RobustScaler`

## Models & Results (5-fold CV)

| Iteration | Model | MAE | RMSE | R² |
|---|---|---|---|---|
| 1 | Ridge (α=10) | 446 | 671 | 0.607 |
| 1 | Lasso (α=10) | 448 | 678 | 0.599 |
| 2 | Random Forest | 439 | 659 | 0.618 |
| **2** | **Gradient Boosting** ✓ | **422** | **641** | **0.638** |

Final model: `GradientBoostingRegressor(n_estimators=400, learning_rate=0.04, max_depth=5, subsample=0.8)`

## New Feature

`has_premium_keyword`: binary flag consolidating six sparse keyword columns into one stable signal. Ranks in the top 10 most important features.

## Files

- `app.py`: Flask app + HTML (all data baked in, no extra JSON files needed)
- `model.joblib`: trained model
- `Dockerfile` / `requirements.txt`: deployment config
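
## Preprocessing sketch

The feature-engineering steps from the Preprocessing list (keyword flags, area buckets, log transforms, ratios, and `has_premium_keyword`) could look roughly like the sketch below for a single listing. This is an illustration, not the project's actual code. The input keys `description`, `rooms`, and `area`, the output names `kw_*`, `area_cat`, `pop_dens_log`, and `tax_income_log`, the case-insensitive substring matching, and the choice of natural log are all assumptions. The example listing is made up.

```python
import math

# Keywords named in the README; case-insensitive substring matching is an assumption.
PREMIUM_KEYWORDS = ("attika", "loft", "seesicht", "luxuriös", "pool", "exklusiv")


def area_bucket(area_m2: float) -> str:
    """Assign one of the three area buckets (< 50, 50-99, 100+ m²)."""
    if area_m2 < 50:
        return "<50"
    if area_m2 < 100:
        return "50-99"
    return "100+"


def engineer_features(listing: dict) -> dict:
    """Derive engineered features for one listing (input keys are hypothetical)."""
    text = listing["description"].lower()

    # One binary flag per keyword, then a single consolidated flag.
    features = {f"kw_{kw}": int(kw in text) for kw in PREMIUM_KEYWORDS}
    features["has_premium_keyword"] = int(any(features.values()))

    features["area_cat"] = area_bucket(listing["area"])

    # Natural log is an assumption; the README only says "log-transformed".
    features["pop_dens_log"] = math.log(listing["pop_dens"])
    features["tax_income_log"] = math.log(listing["tax_income"])

    # Ratio features relating room count and living area.
    features["room_per_m2"] = listing["rooms"] / listing["area"]
    features["area_per_room"] = listing["area"] / listing["rooms"]
    return features


# Made-up listing, for illustration only.
example = {
    "description": "Helle Attika-Wohnung mit Seesicht",
    "rooms": 3.5,
    "area": 105.0,
    "pop_dens": 4700.0,
    "tax_income": 85000.0,
}
print(engineer_features(example))
```

The BFS merge, the `zurich_city` flag, and `RobustScaler` scaling are left out of this sketch; they would run before and after this step respectively.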