---
license: apache-2.0
tags:
- tabular-classification
- tabular-regression
- pytorch
- multi-task-learning
- finance
- business
- umkm
---

# UMKM Multi-Task Learning Network (SME Health & Profitability)

## Model Description
This is a PyTorch-based **Multi-Task Learning (MTL)** neural network designed to analyze the fundamental health and profitability of Micro, Small, and Medium Enterprises (SMEs/UMKMs).

Instead of treating business risk and financial performance as isolated metrics, the model uses a shared deep-learning body to learn core operational patterns before branching into two distinct predictive heads:
1. **Classification Head:** Categorizes the business into one of four risk/health tiers (*Elite, Growth, Struggling, or Critical*).
2. **Regression Head:** Forecasts the continuous Net Profit Margin (%) of the business.

## Intended Uses & Limitations
* **Intended Use:** Educational purposes, portfolio demonstration, and as a baseline for quantitative business analysis on tabular data.
* **Limitations:** The model was trained on the *Synthetic UMKM Dataset*. While it realistically mimics economic skews and operational relationships, it does not represent real-world entities and should not be used for actual financial underwriting without fine-tuning on empirical data.

## Model Architecture
* **Shared Representation Layers:** A 2-layer MLP with BatchNorm, GELU activations, and Dropout (0.2) that extracts underlying business dynamics from 11 core operational features.
* **Classifier Head:** Predicts 4 distinct classes.
* **Regression Head:** Outputs a single continuous value representing the profit margin percentage.

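Training the two heads jointly usually means combining a classification loss and a regression loss into one objective. The exact loss weighting used for this model is not documented here, so the following is only a minimal sketch; the `regression_weight=0.5` value is an illustrative assumption, not the trained configuration:

```python
import torch
import torch.nn as nn

# Joint multi-task objective: cross-entropy for the 4-class head,
# MSE for the profit-margin head. The 0.5 weight is illustrative only.
ce_loss = nn.CrossEntropyLoss()
mse_loss = nn.MSELoss()

def multitask_loss(class_logits, margin_pred, class_targets, margin_targets,
                   regression_weight=0.5):
    cls = ce_loss(class_logits, class_targets)                 # classification term
    reg = mse_loss(margin_pred.squeeze(-1), margin_targets)    # regression term
    return cls + regression_weight * reg

# Example with a dummy batch of 8 samples
logits = torch.randn(8, 4)      # classifier head output
margins = torch.randn(8, 1)     # regression head output
loss = multitask_loss(logits, margins,
                      torch.randint(0, 4, (8,)), torch.randn(8))
```

Because both terms flow back through the shared layers, the representation is encouraged to capture patterns useful for both health-tier classification and margin prediction.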
## Training Data
The model was trained on the [Synthetic UMKM Dataset](https://www.kaggle.com/datasets/zkyfauzi/umkm-dataset) by ZkyFauzi on Kaggle, which models the interplay between burn rate, transaction volume, latency, and digital adoption in Indonesian MSMEs.

## How to Get Started with the Model
You can use the `huggingface_hub` library to download the model weights, the feature scaler, and the label encoder, and then run inference on your own machine.

```python
import torch
import torch.nn as nn
import joblib
from huggingface_hub import hf_hub_download
import numpy as np

# 1. Define the architecture (must match the trained checkpoint)
class UMKM_MultiTaskModel(nn.Module):
    def __init__(self, input_dim=11, num_classes=4):
        super().__init__()
        self.shared_layers = nn.Sequential(
            nn.Linear(input_dim, 128), nn.BatchNorm1d(128), nn.GELU(), nn.Dropout(0.2),
            nn.Linear(128, 64), nn.BatchNorm1d(64), nn.GELU()
        )
        self.classifier_head = nn.Sequential(nn.Linear(64, 32), nn.GELU(), nn.Linear(32, num_classes))
        self.regression_head = nn.Sequential(nn.Linear(64, 32), nn.GELU(), nn.Linear(32, 1))

    def forward(self, x):
        shared = self.shared_layers(x)  # compute the shared representation once for both heads
        return self.classifier_head(shared), self.regression_head(shared)

# 2. Download artifacts from Hugging Face
repo_id = "your-username/your-repo-name"  # <--- UPDATE THIS

model_path = hf_hub_download(repo_id=repo_id, filename="umkm_mtl_model.pth")
scaler_path = hf_hub_download(repo_id=repo_id, filename="feature_scaler.pkl")
encoder_path = hf_hub_download(repo_id=repo_id, filename="label_encoder.pkl")

# 3. Load model and preprocessors
scaler = joblib.load(scaler_path)
label_encoder = joblib.load(encoder_path)

model = UMKM_MultiTaskModel()
model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))
model.eval()

# 4. Run inference (example using dummy data)
# Features must match the 11 training columns:
# [Monthly_Revenue, Burn_Rate_Ratio, Transaction_Count, Avg_Historical_Rating,
#  Review_Volatility, Business_Tenure_Months, Repeat_Order_Rate, Digital_Adoption_Score,
#  Peak_Hour_Latency (0,1,2), Location_Competitiveness, Sentiment_Score]
dummy_data = np.array([[15000000, 0.6, 500, 4.5, 0.2, 24, 60.0, 8.0, 0, 3, 0.8]])
scaled_data = scaler.transform(dummy_data)
tensor_data = torch.FloatTensor(scaled_data)

with torch.no_grad():
    class_logits, margin_pred = model(tensor_data)
    predicted_class = label_encoder.inverse_transform([torch.argmax(class_logits, dim=1).item()])[0]
    predicted_margin = margin_pred.item()

print(f"Predicted Class: {predicted_class}")
print(f"Predicted Net Profit Margin: {predicted_margin:.2f}%")
```
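Beyond the top-1 tier, the classification logits can be converted into per-tier probabilities with a softmax, which is useful for reporting model confidence alongside the label. A brief sketch (the logit values below are dummy data standing in for the `class_logits` produced above):

```python
import torch

# Convert classification logits into per-tier probabilities.
# The logit vector here is illustrative; in practice use the
# `class_logits` returned by the model.
class_logits = torch.tensor([[2.1, 0.3, -1.0, 0.5]])
probs = torch.softmax(class_logits, dim=1).squeeze(0)
confidence, idx = probs.max(dim=0)
print(f"Predicted tier index: {idx.item()}, confidence: {confidence.item():.2%}")
```

A low maximum probability can flag borderline businesses whose tier assignment should be treated with caution.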