---
title: ZH Apartment Rent Predictor
emoji: 🏠
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
---
# ZH Apartment Rent Predictor
Predicts monthly rent for apartments in the Canton of Zurich. Built for AI Applications HS24.
## Data
- 804 real listings from the Canton of Zurich
- Enriched with BFS municipal data (population density, tax income, foreign resident share)
- 59 municipalities covered
## Preprocessing
1. Merged listing data with BFS municipal statistics on `bfs_number`
2. Extracted binary keyword flags from descriptions (Attika, Loft, Seesicht, Luxuriös, Pool, Exklusiv)
3. Categorised area into 3 buckets (< 50 m², 50–99 m², 100+ m²)
4. Added `zurich_city` flag for city of Zurich listings
5. Log-transformed `pop_dens` and `tax_income` to reduce skew
6. Derived `room_per_m2` and `area_per_room` ratios
7. Added `has_premium_keyword` (new feature): 1 if any of the keywords from step 2 is present, 0 otherwise
8. All numeric features scaled with `RobustScaler`
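
The structural steps above (merge, area buckets, city flag, log transform, ratios, scaling) can be sketched roughly as follows. This is a minimal illustration on toy data, not the project's actual code: the column names `rooms` and `area`, the bucket column `area_cat`, the use of BFS number 261 to identify the city of Zurich, and the choice of a plain natural log are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import RobustScaler

# Toy stand-ins for the listing data and the BFS municipal table (hypothetical values).
listings = pd.DataFrame({
    "bfs_number": [261, 261, 230, 191],
    "rooms": [2.5, 4.5, 3.5, 1.0],
    "area": [60.0, 120.0, 85.0, 30.0],
})
bfs = pd.DataFrame({
    "bfs_number": [261, 230, 191],
    "pop_dens": [4700.0, 1660.0, 2100.0],
    "tax_income": [85000.0, 70000.0, 72000.0],
})

NUMERIC_COLS = ["rooms", "area", "pop_dens", "tax_income",
                "room_per_m2", "area_per_room"]


def build_features(listings, bfs, zurich_bfs_number=261):
    # Step 1: attach municipal statistics to each listing.
    df = listings.merge(bfs, on="bfs_number", how="left")
    # Step 3: three area buckets, < 50, 50-99 and 100+ m² (left-closed bins).
    df["area_cat"] = pd.cut(
        df["area"], bins=[0, 50, 100, np.inf], right=False, labels=[0, 1, 2]
    ).astype(int)
    # Step 4: flag listings located in the city of Zurich (assumed BFS number).
    df["zurich_city"] = (df["bfs_number"] == zurich_bfs_number).astype(int)
    # Step 5: log-transform the skewed municipal variables (assumed natural log).
    df["pop_dens"] = np.log(df["pop_dens"])
    df["tax_income"] = np.log(df["tax_income"])
    # Step 6: ratio features.
    df["room_per_m2"] = df["rooms"] / df["area"]
    df["area_per_room"] = df["area"] / df["rooms"]
    return df


features = build_features(listings, bfs)

# Step 8: RobustScaler centres each column on its median and divides by its IQR.
scaled = features.copy()
scaled[NUMERIC_COLS] = RobustScaler().fit_transform(features[NUMERIC_COLS])
print(scaled.round(2))
```

When evaluating with cross-validation, fitting the scaler inside each training fold (for example via a scikit-learn `Pipeline`) avoids leaking statistics from the validation fold.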
## Models & Results (5-fold CV)
| Iteration | Model | MAE | RMSE | R² |
|---|---|---|---|---|
| 1 | Ridge (α=10) | 446 | 671 | 0.607 |
| 1 | Lasso (α=10) | 448 | 678 | 0.599 |
| 2 | Random Forest | 439 | 659 | 0.618 |
| **2** | **Gradient Boosting** ✓ | **422** | **641** | **0.638** |

Final model: `GradientBoostingRegressor(n_estimators=400, learning_rate=0.04, max_depth=5, subsample=0.8)`
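
For orientation, a 5-fold evaluation of the final model's configuration could look like the sketch below. It runs on synthetic data from `make_regression`, so the numbers it prints are unrelated to the table above. The shuffled `KFold`, the `random_state` values, and averaging the metrics over folds are assumptions; the README only states that 5-fold CV was used.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_validate

# Synthetic stand-in for the preprocessed feature matrix and rent target.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Hyperparameters as listed for the final model; random_state is added
# here only to make the sketch reproducible.
model = GradientBoostingRegressor(
    n_estimators=400, learning_rate=0.04, max_depth=5, subsample=0.8,
    random_state=0,
)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_validate(
    model, X, y, cv=cv,
    scoring={
        "mae": "neg_mean_absolute_error",
        "rmse": "neg_root_mean_squared_error",
        "r2": "r2",
    },
)

# scikit-learn reports error metrics negated, so flip the sign back.
mae = -scores["test_mae"].mean()
rmse = -scores["test_rmse"].mean()
r2 = scores["test_r2"].mean()
print(f"MAE={mae:.1f}  RMSE={rmse:.1f}  R2={r2:.3f}")
```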
## New Feature
`has_premium_keyword` is a binary flag that consolidates the six sparse keyword columns into one stable signal. It ranks in the top 10 most important features.
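
One way such a flag could be derived is sketched below. The text column name `description`, the `kw_` column prefix, and the case-insensitive substring matching are assumptions made for illustration; the project's actual matching rules may differ.

```python
import pandas as pd

# The six keywords named in the preprocessing section.
KEYWORDS = ["Attika", "Loft", "Seesicht", "Luxuriös", "Pool", "Exklusiv"]


def add_keyword_flags(df, text_col="description"):
    out = df.copy()
    # Lower-case once; missing descriptions are treated as empty text.
    text = out[text_col].fillna("").str.lower()
    flag_cols = []
    for kw in KEYWORDS:
        col = f"kw_{kw.lower()}"
        # Plain substring match, so "Exklusives" also matches "Exklusiv".
        out[col] = text.str.contains(kw.lower(), regex=False).astype(int)
        flag_cols.append(col)
    # 1 if at least one of the six keyword flags is set, else 0.
    out["has_premium_keyword"] = out[flag_cols].max(axis=1)
    return out


toy = pd.DataFrame({
    "description": [
        "Helle Attika-Wohnung mit Seesicht",
        "Ruhige 3.5-Zimmer-Wohnung",
        None,
        "Exklusives Loft mit Pool",
    ]
})
flagged = add_keyword_flags(toy)
print(flagged[["has_premium_keyword"]])
```

Because each individual keyword appears in only a few listings, the combined flag has more positive examples than any single keyword column, which is the motivation for consolidating them.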
## Files
- `app.py` — Flask app + HTML (all data baked in, no extra JSON files needed)
- `model.joblib` — trained model
- `Dockerfile` / `requirements.txt` — deployment config