+---
+license: mit
+library_name: xgboost
+pipeline_tag: tabular-classification
+tags:
+  - entrepreneurial-readiness
+  - tabular
+  - xgboost
+  - idea-difficulty
+model-index:
+  - name: Idea Difficulty Classifier (XGBoost)
+    results:
+      - task:
+          type: tabular-classification
+          name: Idea Difficulty (Low/Medium/High)
+        dataset:
+          name: idea_difficulty_dataset_2000 (synthetic, balanced)
+          type: tabular
+        metrics:
+          - type: accuracy
+            value: 0.9733
+          - type: macro_f1
+            value: 0.9733
+          - type: log_loss
+            value: 0.0584
+---
+# mjpsm/Idea-Difficulty-XGB
+## 🧾 Overview
+This model predicts the **difficulty of a business idea** as `Low`, `Medium`, or `High`.
+It is part of the Entrepreneurial Readiness series of tabular classifiers (alongside Skill Level, Risk Tolerance, and Confidence).
+The model was trained with **XGBoost** on a synthetic, class-balanced dataset of 2,000 rows, each described by eight structured features that capture common difficulty drivers.
+
+---
+## 📥 Input Features
+| Feature | Type | Range | Definition |
+|---------|------|-------|------------|
+| `capital_required` | int | 1–10 | How much upfront capital is needed (1 = minimal, 10 = very high) |
+| `technical_complexity` | int | 1–10 | How technically difficult the product/service is to build or maintain |
+| `market_competition` | int | 1–10 | How crowded the target market is with competitors |
+| `customer_acquisition_difficulty` | int | 1–10 | How difficult it is to acquire and retain customers |
+| `regulatory_hurdles` | int | 1–10 | The degree of legal/regulatory challenges |
+| `time_to_mvp_months` | int | 1–60 | Estimated time to Minimum Viable Product launch (in months) |
+| `team_expertise_required` | int | 1–10 | Level of specialized expertise/team members required |
+| `scalability_requirement` | int | 1–10 | Degree to which scaling is required for success |
+
+**Target label:**
+- `Low` = Idea is relatively easy to execute
+- `Medium` = Moderately challenging
+- `High` = Difficult, requiring significant resources and expertise
+---
+## 📊 Performance
+- **Accuracy:** 0.9733
+- **Macro F1:** 0.9733
+- **Log Loss:** 0.0584
+
+Confusion matrix on the 300 evaluation samples, 100 per class (rows = true, columns = predicted):
+
+|       | High | Low | Medium |
+|-------|------|-----|--------|
+| High  | 100  |  0  |   0    |
+| Low   |  0   | 96  |   4    |
+| Medium|  2   |  2  |  96    |
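The reported accuracy and macro F1 can be recomputed from the confusion matrix above. This is a minimal sketch in plain Python; the names `CONFUSION`, `accuracy`, and `per_class_f1` are illustrative, and the counts are copied from the table. Log loss cannot be recovered this way because it depends on the predicted probabilities, not only on the predicted classes.

```python
# Counts copied from the confusion matrix (rows = true, columns = predicted).
LABELS = ["High", "Low", "Medium"]
CONFUSION = [
    [100, 0, 0],  # true High
    [0, 96, 4],   # true Low
    [2, 2, 96],   # true Medium
]


def accuracy(cm):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_f1(cm):
    """F1 per class, computed as 2*TP / (2*TP + FP + FN)."""
    n = len(cm)
    scores = []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                         # true class i, predicted otherwise
        fp = sum(cm[r][i] for r in range(n)) - tp    # predicted class i, true otherwise
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


f1_scores = per_class_f1(CONFUSION)
macro_f1 = sum(f1_scores) / len(f1_scores)  # unweighted mean over classes

print(f"accuracy = {accuracy(CONFUSION):.4f}")  # accuracy = 0.9733
print(f"macro F1 = {macro_f1:.4f}")             # macro F1 = 0.9733
```

Both values round to 0.9733, matching the figures listed above.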
+---
+## 🚀 Quickstart (load from the Hub)
+```python
+# Load directly from: mjpsm/Idea-Difficulty-XGB
+import json
+
+import pandas as pd
+from huggingface_hub import hf_hub_download
+from xgboost import XGBClassifier
+
+REPO_ID = "mjpsm/Idea-Difficulty-XGB"
+
+# Download the serialized model from the Hub and load it into a classifier
+model_path = hf_hub_download(REPO_ID, "xgb_model.json")
+clf = XGBClassifier()
+clf.load_model(model_path)
+
+# IMPORTANT: Use the same feature names/order as training
+FEATURES = [
+    "capital_required","technical_complexity","market_competition",
+    "customer_acquisition_difficulty","regulatory_hurdles",
+    "time_to_mvp_months","team_expertise_required","scalability_requirement"
+]
+row = pd.DataFrame([{
+    "capital_required": 7,
+    "technical_complexity": 9,
+    "market_competition": 6,
+    "customer_acquisition_difficulty": 8,
+    "regulatory_hurdles": 7,
+    "time_to_mvp_months": 18,
+    "team_expertise_required": 5,
+    "scalability_requirement": 9
+}], columns=FEATURES)
+
+# predict() returns the encoded class id (an integer index)
+pred_id = int(clf.predict(row)[0])
+
+# If label_map.json is NOT uploaded, default to alphabetical LabelEncoder order:
+CLASSES = ["High", "Low", "Medium"]  # update if you publish label_map.json
+print("Predicted Idea Difficulty:", CLASSES[pred_id])
+
+# OPTIONAL: If you later upload 'label_map.json', prefer this
+# (assumes the file maps label name -> class id):
+# lm_path = hf_hub_download(REPO_ID, "label_map.json")
+# with open(lm_path) as f:
+#     label_map = json.load(f)
+# inv_map = {v: k for k, v in label_map.items()}
+# print("Predicted Idea Difficulty:", inv_map[pred_id])