+# Hybrid ML + Rule-Based Risk Classifier
+## Overview
+Production-ready risk detection model for financial conversations. It classifies text into five financial risk categories with 82.5% overall accuracy on a 160-case test set.
+## Performance Metrics
+- **Overall Accuracy:** 82.5% (tested on 160 diverse cases)
+- **Baseline Improvement:** +22.5 percentage points (60% → 82.5%)
+- **vs Pure ML:** +5% improvement
+### Per-Category Accuracy
+| Risk Type | Accuracy | Performance |
+|-----------|----------|-------------|
+| Credit Risk | 84.4% | Strong |
+| Market Risk | 90.6% | Excellent (+21% boost) |
+| Liquidity Risk | 71.9% | Good |
+| Opportunity Risk | 71.9% | Good (+15% improvement) |
+| Regulatory Risk | 93.8% | Excellent (+19% improvement) |
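As a consistency check, the per-category figures in the table above can be reconciled with the overall accuracy. The sketch below assumes the 160 test cases are split evenly, 32 per category (an assumption; the table does not state the split), and recovers the implied number of correct predictions in each category.

```python
# Per-category accuracy (percent) as reported in the table above.
reported = {
    "Credit Risk": 84.4,
    "Market Risk": 90.6,
    "Liquidity Risk": 71.9,
    "Opportunity Risk": 71.9,
    "Regulatory Risk": 93.8,
}

TOTAL_CASES = 160
# Assumption: the test set is balanced across the five categories.
CASES_PER_CATEGORY = TOTAL_CASES // len(reported)

# Recover the implied count of correct predictions per category.
correct = {
    name: round(pct / 100 * CASES_PER_CATEGORY)
    for name, pct in reported.items()
}
overall_accuracy = sum(correct.values()) / TOTAL_CASES

for name, n in correct.items():
    print(f"{name}: {n}/{CASES_PER_CATEGORY}")
print(f"Overall: {overall_accuracy:.1%}")
```

Under that assumption the implied counts sum to 132 correct out of 160, which matches the reported 82.5%.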
+## Architecture
+- **ML Engine:** Random Forest (200 trees) + Gradient Boosting ensemble
+- **Feature Extraction:** TF-IDF vectorizer (1,059 features; unigrams through trigrams)
+- **Detection Method:** Hybrid approach (94% of predictions use a rules + ML blend; 6% use ML alone)
+- **Rules:** Category-specific financial keyword patterns
+## Model Files
+- `classifier.pkl` - Random Forest classifier (1.36 MB)
+- `classifier_gb.pkl` - Gradient Boosting classifier (0.66 MB)
+- `vectorizer.pkl` - TF-IDF vectorizer (0.07 MB)
+- `metadata.json` - Metrics and configuration
+## Risk Categories
+### 1. Credit Risk (84.4%)
+- Inability to afford monthly payments
+- Loan defaults and payment delinquencies
+- Poor credit history and low creditworthiness
+- High debt-to-income ratios
+- Keywords: "afford", "default", "delinquent", "debt"
+### 2. Market Risk (90.6%)
+- Stock market crashes and volatility
+- Economic downturns affecting portfolio
+- Currency fluctuations and losses
+- Keywords: "crash", "volatility", "bear market", "downturn"
+### 3. Liquidity Risk (71.9%)
+- Funds locked in long-term investments
+- Cash flow constraints
+- Difficulty accessing emergency funds
+- Keywords: "locked", "illiquid", "cash", "shortage"
+### 4. Opportunity Risk (71.9%)
+- Missed investment opportunities
+- Poor timing decisions
+- Regret about investment choices
+- Keywords: "missed", "opportunity", "regret", "timing"
+### 5. Regulatory Risk (93.8%)
+- Tax compliance requirements
+- AML/KYC regulations
+- Regulatory approval delays
+- Keywords: "tax", "compliance", "regulatory", "legal"
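The keyword lists above suggest how a rule layer might score a message. The following is a minimal sketch, not the repository's actual rule engine: `RISK_KEYWORDS` copies the keywords listed in this section, while the short category keys, the function `rule_scores`, and its count-based scoring are hypothetical.

```python
import re

# Keywords copied from the category descriptions above.
# The short category keys are illustrative, not the model's real label names.
RISK_KEYWORDS = {
    "credit": ["afford", "default", "delinquent", "debt"],
    "market": ["crash", "volatility", "bear market", "downturn"],
    "liquidity": ["locked", "illiquid", "cash", "shortage"],
    "opportunity": ["missed", "opportunity", "regret", "timing"],
    "regulatory": ["tax", "compliance", "regulatory", "legal"],
}

def rule_scores(text: str) -> dict[str, int]:
    """Count how many keywords of each category appear in the text."""
    lowered = text.lower()
    scores = {}
    for category, keywords in RISK_KEYWORDS.items():
        # Word boundary at the start only, so "default" also matches "defaulted".
        scores[category] = sum(
            1 for kw in keywords if re.search(r"\b" + re.escape(kw), lowered)
        )
    return scores

scores = rule_scores("Customer can't afford monthly EMI payments and may default")
print(scores)
```

Prefix matching keeps the sketch short but is deliberately loose; a real rule set would likely need stemming or curated patterns to avoid false hits.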
+## Usage
+```python
+from huggingface_hub import hf_hub_download
+import pickle
+# Download risk classifier
+classifier_path = hf_hub_download(
+    repo_id="rohin30n/Armour",
+    filename="risk_classifier/classifier.pkl",
+    token="YOUR_TOKEN"
+)
+vectorizer_path = hf_hub_download(
+    repo_id="rohin30n/Armour",
+    filename="risk_classifier/vectorizer.pkl",
+    token="YOUR_TOKEN"
+)
+# Load models (pickle can execute arbitrary code, so only load files from a source you trust)
+with open(classifier_path, 'rb') as f:
+    classifier = pickle.load(f)
+with open(vectorizer_path, 'rb') as f:
+    vectorizer = pickle.load(f)
+# Predict risk category (ML only: this snippet loads the Random Forest and does not apply the keyword rules)
+text = "Customer can't afford monthly EMI payments"
+X = vectorizer.transform([text])
+risk_pred = classifier.predict(X)[0]
+risk_proba = classifier.predict_proba(X)[0]
+print(f"Risk Category: {risk_pred}")
+print(f"Confidence: {max(risk_proba):.2%}")
+```
+## Technical Details
+### Training Data
+- 152 financial conversation samples
+- 32 samples per risk category
+- Diverse scenarios and language variations
+- Stratified train-test split (80/20)
+### Hyperparameters
+- Random Forest: 200 trees, max_depth=20, class_weight='balanced'
+- Gradient Boosting: 100 estimators, max_depth=5, learning_rate=0.1
+- TF-IDF: 1,059 features, n-gram range 1-3 (unigrams through trigrams), sublinear TF scaling
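For reference, the hyperparameters listed above map onto scikit-learn roughly as follows. This is a sketch, assuming the models were built with scikit-learn's standard estimators; the training script is not part of this README, and any parameter not listed above is left at its library default.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

# TF-IDF over unigrams, bigrams, and trigrams with sublinear TF scaling.
# Assumption: 1,059 is the size of the fitted vocabulary, so it is not
# passed here as max_features.
vectorizer = TfidfVectorizer(ngram_range=(1, 3), sublinear_tf=True)

random_forest = RandomForestClassifier(
    n_estimators=200,
    max_depth=20,
    class_weight="balanced",
)

gradient_boosting = GradientBoostingClassifier(
    n_estimators=100,
    max_depth=5,
    learning_rate=0.1,
)
```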
+### Evaluation
+- 5-fold cross-validation
+- Stratified splits for class balance
+- Per-category metrics (precision, recall, F1)
+- Tested on 160 diverse financial scenarios
+## Integration with Armour AI
+This risk classifier is the second stage of the Armour AI financial NLP pipeline:
+1. Text → Finance Classification
+2. If financial → Risk Analysis (this model)
+3. Risk output → Entity Extraction & Action Items
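The three stages above can be sketched as a simple function chain. All names here (`run_pipeline` and its three callable parameters) are hypothetical stand-ins, since the actual Armour AI interfaces are not described in this README.

```python
from typing import Callable, Optional

def run_pipeline(
    text: str,
    is_financial: Callable[[str], bool],
    classify_risk: Callable[[str], str],
    extract_entities: Callable[[str, str], dict],
) -> Optional[dict]:
    """Run the three stages in order; return None for non-financial text."""
    # Stage 1: finance classification gate.
    if not is_financial(text):
        return None
    # Stage 2: risk analysis (the role this model plays).
    risk = classify_risk(text)
    # Stage 3: entity extraction and action items, informed by the risk label.
    return {"risk": risk, "details": extract_entities(text, risk)}

# Toy stand-ins so the sketch runs end to end.
result = run_pipeline(
    "Customer can't afford monthly EMI payments",
    is_financial=lambda t: "payment" in t.lower(),
    classify_risk=lambda t: "credit",
    extract_entities=lambda t, risk: {"action": f"review {risk} exposure"},
)
print(result)
```

Passing the stages in as callables keeps the sketch independent of how each stage is actually implemented.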
+## Performance Notes
+- **Weak categories boosted by rules:** Market risk (now 90.6%) and Regulatory risk (now 93.8%) were weak under pure ML and gained the most from the rule layer
+- **Hybrid approach:** Combines ML predictions with keyword pattern matching
+- **Fast inference:** ~50-100 ms per prediction
+- **Explainable:** Returns which detection method was used (ML vs rules)
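One way a hybrid classifier can combine the two signals and report which one decided the outcome is sketched below. The blending rule, the `rule_weight` value, and the function name `hybrid_predict` are assumptions for illustration; this README does not specify how Armour blends rules with ML.

```python
def hybrid_predict(
    ml_proba: dict[str, float],
    rule_hits: dict[str, int],
    rule_weight: float = 0.3,  # hypothetical blend weight
) -> tuple[str, str]:
    """Return (category, method), where method is 'ml' or 'rules+ml'."""
    total_hits = sum(rule_hits.values())
    if total_hits == 0:
        # No keyword fired: fall back to the ML prediction alone.
        return max(ml_proba, key=ml_proba.get), "ml"

    # Normalise keyword hits to a distribution and mix with ML probabilities.
    blended = {
        cat: (1 - rule_weight) * p + rule_weight * rule_hits.get(cat, 0) / total_hits
        for cat, p in ml_proba.items()
    }
    return max(blended, key=blended.get), "rules+ml"

ml_proba = {"credit": 0.40, "market": 0.45, "liquidity": 0.15}
print(hybrid_predict(ml_proba, {"credit": 2}))  # keyword hits shift the result
print(hybrid_predict(ml_proba, {}))             # no hits, ML decides alone
```

In this toy example the ML model slightly prefers "market", but two credit keyword hits tip the blended score toward "credit", and the returned method label records that the rules were involved.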
+---
+**Model Location:** `/risk_classifier/` in rohin30n/Armour
+**License:** Apache 2.0
+**Tags:** risk-scoring, financial-nlp, hybrid-model, classification