+---
+license: mit
+language:
+- en
+metrics:
+- mae
+- r_squared
+pipeline_tag: tabular-regression
+tags:
+- regression
+- price-prediction
+---
+# Model Card for Infinitode/IHPPM-OPEN-ARC
+Repository: https://github.com/Infinitode/OPEN-ARC/
+## Model Description
+OPEN-ARC-IHPP is a CatBoostRegressor model developed as part of Infinitode's OPEN-ARC initiative. It predicts rent for houses and other rental properties in India from features such as house type, locality, city, area, furnishing, and room counts.
+**Architecture**:
+- **CatBoostRegressor**: `iterations=2500`, `depth=10`, `learning_rate=0.045`, `loss_function="MAE"`, `eval_metric="MAE"`, `random_seed=42`, `verbose=200`.
+- **Framework**: CatBoost
+- **Training Setup**: Trained for 2500 iterations on the 85% training split of the dataset.
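The hyperparameters listed above can be collected into one dictionary and passed to CatBoost. This is a minimal sketch, not the project's training script: `PARAMS` and `build_model` are illustrative names, and `catboost` is imported lazily so the dictionary can be inspected without the library installed.

```python
# Hyperparameters exactly as listed in the Architecture section.
PARAMS = {
    "iterations": 2500,
    "depth": 10,
    "learning_rate": 0.045,
    "loss_function": "MAE",
    "eval_metric": "MAE",
    "random_seed": 42,
    "verbose": 200,  # log progress every 200 iterations
}


def build_model():
    """Return an untrained CatBoostRegressor configured with PARAMS.

    Hypothetical helper; requires `pip install catboost`.
    """
    from catboost import CatBoostRegressor

    return CatBoostRegressor(**PARAMS)
```

Calling `build_model().fit(X_train, y_train, cat_features=[...])` would then train the model, where the feature matrix, target, and list of categorical columns are whatever the training pipeline provides.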
+## Uses
+- Estimating rent for properties in India.
+- Checking existing rent listings against the model's estimate.
+- Researching rental values and the factors that influence them.
+## Limitations
+- May produce implausible estimates for inputs with extreme values, since very small properties and extreme rent outliers were removed before training.
+- Predictions can be inaccurate; treat them as estimates and use caution when relying on them.
+## Training Data
+- Dataset: India House Rent Prediction dataset from Kaggle.
+- Source URL: https://www.kaggle.com/datasets/pranavshinde36/india-house-rent-prediction
+- Content: House type, locality, city, area, furnishing and room specifics along with the target rent value.
+- Size: 7,691 property entries from India.
+- Preprocessing: Removed properties with very small areas, extreme rent outliers, and the `area_rate` feature. Added "area buckets" that group `area` into ranges to improve performance.
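As a rough illustration of the bucketing step, the sketch below assigns an area to a bucket index using the bin edges that appear in the usage code in the How to Use section. It relies on the standard-library `bisect` module, which yields the same index as `np.digitize` with default arguments. The helper name `area_bucket` is an assumption, and the exact training-time implementation is not shown in this card.

```python
from bisect import bisect_right

# Bin edges taken from the usage example in this card (area units as in the dataset).
AREA_BINS = [0, 300, 600, 900, 1200, 2000, 5000, 100000]


def area_bucket(area):
    """Return the zero-based index of the bin that contains `area`.

    Bins are left-closed: bucket 0 is [0, 300), bucket 1 is [300, 600), and so on.
    """
    return bisect_right(AREA_BINS, area) - 1


print(area_bucket(850))  # 2, because 600 <= 850 < 900
```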
+## Training Procedure
+- Metrics: MAE, R-squared
+- Train/Test Split: 85% train, 15% test.
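An 85/15 split can be sketched with the standard library alone. The card does not say how rows were assigned to each side, so the random shuffle, the seed, and the function name `train_test_split_indices` below are all assumptions made for illustration.

```python
import random


def train_test_split_indices(n_rows, test_fraction=0.15, seed=42):
    """Shuffle row indices and split them into (train, test) lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split_indices(1000)
print(len(train_idx), len(test_idx))  # 850 150
```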
+## Evaluation Results
+| Metric | Value |
+| ------ | ----- |
+| Testing MAE | 3.86k |
+| Testing R-squared | 0.9351 |
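For readers unfamiliar with the two metrics in the table, the sketch below computes them from their standard definitions in plain Python. The sample numbers are invented to show the calls and are not drawn from the dataset or the reported results.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up rents, only to show the calls.
actual = [10000.0, 20000.0, 30000.0]
predicted = [12000.0, 19000.0, 27000.0]
print(mean_absolute_error(actual, predicted))  # 2000.0
print(r_squared(actual, predicted))            # about 0.93
```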
+## How to Use
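+```python
+import numpy as np
+import pandas as pd
+
+def predict_user_rent(model, raw_df):
+    """Interactively collect feature values and print the estimated rent.
+
+    `model` is the trained regressor; `raw_df` is a DataFrame holding the
+    original feature columns, used to list the valid category values.
+    """
+    print("\n\n========== RENT PREDICTION ASSISTANT ==========\n")
+    print("Choose values for each feature below. For categorical vars, pick a number.\n")
+    sample = {}
+    # Menu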
+    def choose_cat(col_name):
+        unique_vals = sorted(raw_df[col_name].unique())
+        print(f"\n--- {col_name} ---")
+        for idx, val in enumerate(unique_vals):
+            print(f"{idx + 1}. {val}")
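+        sel = int(input("Enter your choice number: ")) - 1
+        # Reject out-of-range choices (0 would otherwise wrap around to the last item)
+        if not 0 <= sel < len(unique_vals):
+            raise ValueError(f"Choice must be between 1 and {len(unique_vals)}")
+        return unique_vals[sel]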
+    # Categorical
+    sample["house_type"] = choose_cat("house_type")
+    sample["locality"] = choose_cat("locality")
+    sample["city"] = choose_cat("city")
+    sample["furnishing"] = choose_cat("furnishing")
+    # Numeric values
+    def choose_num(col_name):
+        return float(input(f"\nEnter value for {col_name}: "))
+    sample["area"] = choose_num("area")
+    sample["beds"] = choose_num("beds")
+    sample["bathrooms"] = choose_num("bathrooms")
+    sample["balconies"] = choose_num("balconies")
+    # area bucket
+    area_val = sample["area"]
+    area_bins = [0, 300, 600, 900, 1200, 2000, 5000, 100000]
+    area_bucket = np.digitize([area_val], area_bins)[0] - 1
+    sample["area_bucket"] = area_bucket
+    # placeholder for rent_psf bucket (we don't know rent yet)
+    # so we use area only as a proxy for typical price density
+    sample["rent_psf_bucket"] = min(int(area_bucket), 19)
+    df_input = pd.DataFrame([sample])
+    # Must match training encodings
+    for col in ["house_type", "locality", "city", "furnishing"]:
+        df_input[col] = df_input[col].astype(raw_df[col].dtype)
+    # Prediction
+    pred_log = model.predict(df_input)[0]
+    pred_rent = np.expm1(pred_log)
+    print("\n===================================")
+    print(f"Estimated Rent: ₹ {pred_rent:,.2f}")
+    print("===================================\n")
+    return pred_rent
+# Uncomment to use interactively:
+# predict_user_rent(model, df)
+```
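The function above passes the model output through `np.expm1`, which suggests that the model predicts `log1p(rent)` rather than rent directly. That is an inference from the usage code, not something the card states. Under that assumption, the round trip looks like this sketch, written with the standard-library equivalents of the NumPy functions:

```python
import math

rent = 25000.0                      # illustrative value, not from the dataset
log_target = math.log1p(rent)       # assumed training target: log(1 + rent)
recovered = math.expm1(log_target)  # inverse transform applied at inference

print(round(recovered, 2))  # 25000.0
```

If the target was in fact transformed this way, a prediction must always be passed through `expm1` before it is read as a rent value.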
+## Contact
+For questions or issues, open a GitHub issue or reach out at https://infinitode.netlify.app/forms/contact.