Sanoj111 committed · b5cd72d · verified · 1 Parent(s): ef7672f

Upload 2 files

## 📱 Mobile Price Prediction (NPR) | Quick Summary

**Objective:** Predict smartphone prices in Nepal using a regression pipeline built on messy scraped data.

### 🛠️ The Tech Stack

* **Data:** 109 cleaned records (from 127 raw) of mobile specs in Nepal.
* **Features:** RAM, Storage, 5G, "Ultra/Pro" status, Foldable status, and a custom **Premium Score**.
* **Top Model:** **Gradient Boosting Regressor** ($R^2 = 0.7487$, MAE $\approx$ NPR 31k).

### 🔑 Key Findings

1. **Brand is King:** Apple and Samsung (Ultra/Fold) drive the highest price premiums.
2. **Interaction Matters:** The `RAM x Storage` interaction term is a stronger predictor than either feature alone.
3. **Tiered Logic:** The model successfully categorizes phones from Budget (<20k) to Flagship (≥120k).
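The interaction and log-storage features above can be sketched as simple derived columns. The column names (`ram_gb`, `storage_gb`) and the use of `log1p` are assumptions for illustration, not the repo's exact code:

```python
import numpy as np
import pandas as pd

# Hypothetical schema; the project's real column names may differ
df = pd.DataFrame({
    "ram_gb": [4, 8, 12],
    "storage_gb": [64, 128, 256],
})

# RAM x Storage interaction: high RAM *and* high storage together
# signal a flagship, beyond each spec's individual effect
df["interaction"] = df["ram_gb"] * df["storage_gb"]

# Log-transform storage to tame its right-skewed distribution
df["log_storage"] = np.log1p(df["storage_gb"])
```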

### 🏃 Use It in 3 Lines

```python
import joblib
bundle = joblib.load('mobile_price_model.pkl')
# Input: [RAM, Storage, 5G, Ultra, Pro, Foldable, Interaction, Log_Store, Premium, Brand]
price = bundle['model'].predict(sample_df[bundle['feature_cols']])
```
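Building `sample_df` looks roughly like this. The column names below mirror the feature order in the comment, but they are assumptions; the authoritative list lives in `bundle['feature_cols']`:

```python
import pandas as pd

# Hypothetical single-row input; real column names come from
# bundle['feature_cols'], so check those before predicting
sample_df = pd.DataFrame([{
    "RAM": 8,              # GB
    "Storage": 128,        # GB
    "5G": 1,               # binary flag
    "Ultra": 0,
    "Pro": 1,
    "Foldable": 0,
    "Interaction": 8 * 128,
    "Log_Store": 4.86,     # ~log1p(128)
    "Premium": 0.4,        # custom Premium Score
    "Brand": 3,            # encoded brand id
}])
# price = bundle['model'].predict(sample_df[bundle['feature_cols']])
```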

---

### 💡 Pro-Tips for "V2"

* **Data Scarcity:** With only 109 rows, Gradient Boosting might overfit. Consider **Simple Linear Regression** or **ElasticNet** as benchmarks to see if the complexity is truly paying off.
* **Feature Scaling:** Ensure your `StandardScaler` (if used) is applied inside a `Pipeline` object to prevent data leakage during cross-validation.
* **Categorical Handling:** Since you have few brands, try **One-Hot Encoding** instead of Ordinal Encoding to see if it captures brand-specific "hype" better without implying a mathematical order.
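The three tips above fit naturally into one scikit-learn `Pipeline`: scaling happens inside cross-validation (no leakage), `Brand` is one-hot encoded, and `ElasticNet` serves as the simple benchmark. The column names are placeholders, not the project's actual schema:

```python
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import ElasticNet
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

num_cols = ["RAM", "Storage", "Interaction"]   # assumed numeric features
cat_cols = ["Brand"]                           # assumed categorical feature

pre = ColumnTransformer([
    # Scaler lives inside the pipeline, so each CV fold fits it
    # only on its own training split -> no data leakage
    ("num", StandardScaler(), num_cols),
    # One-hot avoids implying an ordering between brands
    ("cat", OneHotEncoder(handle_unknown="ignore"), cat_cols),
])

bench = Pipeline([("pre", pre), ("reg", ElasticNet(alpha=0.1))])
# from sklearn.model_selection import cross_val_score
# scores = cross_val_score(bench, X, y, cv=5, scoring="r2")
```

If the benchmark's cross-validated R² is close to the Gradient Boosting score, the extra model complexity is not paying for itself on 109 rows.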

Files changed (2):
  1. mobile_price_model_2.pkl +3 -0
  2. requirements.txt.txt +22 -0
mobile_price_model_2.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2fdb09b24d45e64c1bbdd0950db5a1c483882aa1e89e73b56fbb674fc472cef8
+ size 136595
requirements.txt.txt ADDED
@@ -0,0 +1,22 @@
+ python>=3.8
+ numpy>=1.24
+ pandas>=2.0
+ scikit-learn>=1.2
+ joblib>=1.2
+ matplotlib>=3.7
+ seaborn>=0.12
+ scipy>=1.10
+
+ # Optional / useful for extended workflows
+ # For faster gradient-boosted models (if you later switch from RandomForest)
+ xgboost>=1.7 # optional
+ lightgbm>=3.3 # optional
+
+ # If you use the scraper / JS rendering scripts from earlier messages
+ requests>=2.31
+ beautifulsoup4>=4.12
+ lxml>=4.9
+ playwright>=1.36 # optional, only if you use Playwright; run `playwright install` after installing
+
+ # Jupyter / notebook support (optional)
+ jupyterlab>=4.0