---
license: mit
---
# 🏑 House Price Predictor (Kaggle + Hugging Face)
This project is a complete machine learning pipeline for predicting house prices in Ames, Iowa, using structured data and transformer-based text embeddings. It was developed as part of the [Kaggle House Prices - Advanced Regression Techniques](https://www.kaggle.com/c/house-prices-advanced-regression-techniques) competition.
The model is published on the Hugging Face Hub:
👉 https://huggingface.co/DanteChapterMaster/house-price-predictor
---
## 📦 Project Highlights
- ✅ Exploratory Data Analysis (EDA)
- ✅ Feature Engineering from domain knowledge
- ✅ Model training: Ridge, Lasso, Random Forest, XGBoost, and Stacking
- ✅ NLP augmentation: BERT embeddings from generated property descriptions
- ✅ Full model pipeline with preprocessing (ColumnTransformer)
- ✅ Deployment-ready model saved with `joblib`
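
The last two highlights (a preprocessing pipeline and a `joblib` artifact) can be sketched as follows. This is a minimal illustration, not the project's actual training script: the rows are small illustrative values, the file name `house_price_pipeline.joblib` is hypothetical, `np.log1p` is an assumed choice of log transform, and `Ridge` stands in for the XGBoost regressor so that the sketch depends only on scikit-learn.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Small illustrative frame using column names from the Features section.
df = pd.DataFrame({
    "GrLivArea": [1710, 1262, 1786, 1717, 2198, 1362],
    "TotalBsmtSF": [856, 1262, 920, 756, 1145, 796],
    "GarageCars": [2, 2, 2, 3, 3, 2],
    "Neighborhood": ["CollgCr", "Veenker", "CollgCr", "Crawfor", "NoRidge", "Mitchel"],
    "HouseStyle": ["2Story", "1Story", "2Story", "2Story", "2Story", "1.5Fin"],
})
prices = np.array([208500, 181500, 223500, 140000, 250000, 143000], dtype=float)
y = np.log1p(prices)  # assumption: log1p as the log transform of SalePrice

numeric = ["GrLivArea", "TotalBsmtSF", "GarageCars"]
categorical = ["Neighborhood", "HouseStyle"]

# Scale numeric columns and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

# Preprocessing and regressor travel together as one estimator.
model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", Ridge(alpha=1.0)),  # stand-in for the XGBoost regressor
])
model.fit(df, y)

# Persist the fitted pipeline and load it back, as a deployment step would.
path = os.path.join(tempfile.mkdtemp(), "house_price_pipeline.joblib")  # hypothetical name
joblib.dump(model, path)
restored = joblib.load(path)

# Predictions are on the log scale; invert the transform to get prices.
pred_price = np.expm1(restored.predict(df.head(2)))
```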
---
## 📊 Features
**Numerical Features:**
- `GrLivArea`, `TotalBsmtSF`, `GarageCars`, etc.
**Categorical Features:**
- `Neighborhood`, `HouseStyle`, etc. (one-hot encoded)
**Generated Features:**
- Log-transformed target
- Interaction terms
- Transformer-based embeddings from property descriptions
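
As a rough illustration of the first two generated features, the sketch below log-transforms the target and builds one interaction term with pandas. The specific interaction (`GrLivArea` times `GarageCars`), the derived column names, and the use of `np.log1p` are assumptions made for the example; the README does not say which terms or which log variant the project uses.

```python
import numpy as np
import pandas as pd

# Two illustrative rows; the values are made up for the example.
df = pd.DataFrame({
    "GrLivArea": [1500, 2000],
    "GarageCars": [1, 2],
    "SalePrice": [150000, 250000],
})

# Log-transformed target (assumed variant: log(1 + x)).
df["LogSalePrice"] = np.log1p(df["SalePrice"])

# Hypothetical interaction term: living area scaled by garage capacity.
df["GrLivArea_x_GarageCars"] = df["GrLivArea"] * df["GarageCars"]

# expm1 inverts log1p, recovering prices from log-scale values.
recovered = np.expm1(df["LogSalePrice"])
```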
---
## 🤖 Model Card
- **Type:** Regressor
- **Algorithm:** XGBoost regressor inside a scikit-learn `Pipeline`
- **Target:** `SalePrice` (log-transformed)
- **Evaluation:** Root Mean Squared Error (RMSE)
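
Because the target is log-transformed, RMSE is naturally computed on the log scale. The helpers below are a minimal standard-library sketch of that calculation; the function names `rmse` and `rmse_log` are hypothetical and not taken from the project's code, and the natural logarithm is an assumed choice.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_log(prices_true, prices_pred):
    """RMSE between the natural logs of observed and predicted prices."""
    log_true = [math.log(p) for p in prices_true]
    log_pred = [math.log(p) for p in prices_pred]
    return rmse(log_true, log_pred)


# Example: a prediction that is 10% too high on every house.
score = rmse_log([200000.0, 150000.0], [220000.0, 165000.0])
```

Working on the log scale means a 10% miss costs the same whether the house is cheap or expensive, so `score` here equals `math.log(1.1)` up to floating-point error.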