---
license: mit
---
# 🏡 House Price Predictor (Kaggle + Hugging Face)
This project is a complete machine learning pipeline for predicting house prices in Ames, Iowa, using structured data and transformer-based text embeddings. It was developed as part of the [Kaggle House Prices - Advanced Regression Techniques](https://www.kaggle.com/c/house-prices-advanced-regression-techniques) competition.
The model is published on the Hugging Face Hub:
https://huggingface.co/DanteChapterMaster/house-price-predictor

---
## 📦 Project Highlights
- ✅ Exploratory Data Analysis (EDA)
- ✅ Feature Engineering from domain knowledge
- ✅ Model training: Ridge, Lasso, Random Forest, XGBoost, and Stacking
- ✅ NLP augmentation: BERT embeddings from generated property descriptions
- ✅ Full model pipeline with preprocessing (ColumnTransformer)
- ✅ Deployment-ready model saved with `joblib`
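
The preprocessing, training, and saving steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual training script: the toy rows, the column selection, the file name `house_price_pipeline.joblib`, and the use of a natural-log target are all assumptions, and `Ridge` stands in for the regressor only to keep the example small.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy rows for illustration only; these are NOT values from the Kaggle data.
train = pd.DataFrame({
    "GrLivArea": [1500, 2100, 900, 1750, 1200, 2400],
    "GarageCars": [2, 3, 1, 2, 1, 3],
    "Neighborhood": ["NAmes", "CollgCr", "OldTown", "NAmes", "OldTown", "CollgCr"],
    "SalePrice": [180000, 260000, 110000, 205000, 135000, 310000],
})

numeric_cols = ["GrLivArea", "GarageCars"]
categorical_cols = ["Neighborhood"]

# Scale numeric columns and one-hot encode categorical ones in a single step.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

# Ridge is a stand-in regressor (assumption); any scikit-learn compatible
# estimator can take its place in the final step.
model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", Ridge(alpha=1.0)),
])

X = train[numeric_cols + categorical_cols]
y_log = np.log(train["SalePrice"])  # assumed natural-log target
model.fit(X, y_log)

# Persist the whole pipeline so preprocessing travels with the model.
# The file name is hypothetical.
model_path = os.path.join(tempfile.mkdtemp(), "house_price_pipeline.joblib")
joblib.dump(model, model_path)
restored = joblib.load(model_path)

# Undo the log transform to get predictions back in price units.
predicted_prices = np.exp(restored.predict(X))
```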
---
## Features
**Numerical Features:**
- `GrLivArea`, `TotalBsmtSF`, `GarageCars`, etc.
**Categorical Features:**
- `Neighborhood`, `HouseStyle`, etc. (one-hot encoded)
**Generated Features:**
- Log-transformed target
- Interaction terms
- Transformer-based embeddings from property descriptions
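
As a rough sketch of the first two generated features, the snippet below log-transforms the target and builds one interaction term with pandas. The toy rows, the derived column names `LogSalePrice` and `LivArea_x_GarageCars`, the choice of which columns to multiply, and the use of the natural log are assumptions made for illustration; the project's actual interaction terms are not listed here.

```python
import numpy as np
import pandas as pd

# Toy rows for illustration only; these are NOT values from the Kaggle data.
df = pd.DataFrame({
    "GrLivArea": [1500, 2100, 900],
    "TotalBsmtSF": [800, 1100, 0],
    "GarageCars": [2, 3, 1],
    "SalePrice": [180000, 260000, 110000],
})

# Log-transform the target to reduce the skew of sale prices
# (natural log assumed; prices must be strictly positive).
df["LogSalePrice"] = np.log(df["SalePrice"])

# Hypothetical interaction term: living area multiplied by garage capacity.
df["LivArea_x_GarageCars"] = df["GrLivArea"] * df["GarageCars"]
```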
---
## 🤖 Model Card
- **Type:** Regressor
- **Algorithm:** XGBoost inside a scikit-learn `Pipeline`
- **Target:** `SalePrice` (log-transformed)
- **Evaluation:** Root Mean Squared Error (RMSE)
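
Because the target is log-transformed, the RMSE is naturally computed on the log scale. The helper below is a minimal sketch of that calculation using only the standard library; the function name `rmse_log` and the use of the natural log are assumptions, not code taken from this project.

```python
import math


def rmse_log(actual_prices, predicted_prices):
    """RMSE between the natural logs of actual and predicted prices."""
    if len(actual_prices) != len(predicted_prices):
        raise ValueError("inputs must have the same length")
    if not actual_prices:
        raise ValueError("inputs must not be empty")
    # math.log raises ValueError for non-positive prices.
    squared_errors = [
        (math.log(actual) - math.log(predicted)) ** 2
        for actual, predicted in zip(actual_prices, predicted_prices)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Working on the log scale means a 10% miss on a cheap house counts about as much as a 10% miss on an expensive one, which is why this form suits sale prices.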