---
license: mit
---

# 🏡 House Price Predictor (Kaggle + Hugging Face)

This project is a complete machine learning pipeline for predicting house prices in Ames, Iowa, using structured data and transformer-based text embeddings. It was developed as part of the [Kaggle House Prices - Advanced Regression Techniques](https://www.kaggle.com/c/house-prices-advanced-regression-techniques) competition.

The model is published on the Hugging Face Hub:
👉 https://huggingface.co/DanteChapterMaster/house-price-predictor

---

## 📦 Project Highlights

- ✅ Exploratory Data Analysis (EDA)
- ✅ Feature Engineering from domain knowledge
- ✅ Model training: Ridge, Lasso, Random Forest, XGBoost, and Stacking
- ✅ NLP augmentation: BERT embeddings from generated property descriptions
- ✅ Full model pipeline with preprocessing (ColumnTransformer)
- ✅ Deployment-ready model saved with `joblib`

---

## 📊 Features

**Numerical Features:**
- `GrLivArea`, `TotalBsmtSF`, `GarageCars`, etc.

**Categorical Features:**
- `Neighborhood`, `HouseStyle`, etc. (one-hot encoded)

**Generated Features:**
- Log-transformed target
- Interaction terms
- Transformer-based embeddings from property descriptions

---

## 🤖 Model Card

- **Type:** Regressor
- **Algorithm:** XGBoost in Scikit-learn `Pipeline`
- **Target:** `SalePrice` (log-transformed)
- **Evaluation:** Root Mean Squared Error (RMSE)
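
The Model Card pairs a log-transformed `SalePrice` with RMSE, which amounts to scoring predictions by the RMSE of their logarithms. The sketch below shows that calculation and the back-transform from the log scale to dollars, using only the standard library. It assumes the natural logarithm (the card does not say whether `log` or `log1p` was used). The helper names `log_rmse` and `to_price` are hypothetical, and the prices are made up for illustration; none of this is taken from the project's code.

```python
import math


def log_rmse(actual_prices, predicted_prices):
    """RMSE between the natural logs of actual and predicted prices.

    Assumption: natural log; all prices must be strictly positive.
    """
    if len(actual_prices) != len(predicted_prices):
        raise ValueError("inputs must have the same length")
    if not actual_prices:
        raise ValueError("inputs must not be empty")
    squared_errors = [
        (math.log(pred) - math.log(actual)) ** 2
        for actual, pred in zip(actual_prices, predicted_prices)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def to_price(log_prediction):
    """Map a prediction made on the log scale back to a price."""
    return math.exp(log_prediction)


# Made-up prices for illustration only, not competition data.
actual = [200_000.0, 150_000.0, 320_000.0]
predicted = [210_000.0, 140_000.0, 300_000.0]

score = log_rmse(actual, predicted)
print(f"log-scale RMSE: {score:.4f}")
```

Because the error is measured on the log scale, a 10% miss on an inexpensive house costs about as much as a 10% miss on an expensive one. A model trained on the log target outputs log prices, so its predictions need a step like `to_price` before they can be reported in dollars.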