| # Fix for KeyError: 'success' in Risk Discovery Comparison | |
| ## Problems Fixed | |
| ### Issue 1: NMF Parameter Error β | |
| **Error:** `TypeError: NMF.__init__() got an unexpected keyword argument 'alpha'` | |
| **Fixed in:** `risk_discovery_alternatives.py` | |
| ### Issue 2: KeyError 'method' β | |
| **Error:** `KeyError: 'method'` when comparing methods | |
| **Fixed in:** `risk_discovery.py` - Added consistent return format | |
| ### Issue 3: KeyError 'success' β | |
| **Error:** `KeyError: 'success'` in generate_comparison_report | |
| **Fixed in:** `compare_risk_discovery.py` - Updated to handle direct results format | |
| ## Root Cause Analysis | |
| The comparison pipeline had evolved to use a unified `compare_risk_discovery_methods()` function that returns: | |
| ```python | |
| { | |
| 'summary': {...}, | |
| 'detailed_results': { | |
| 'kmeans': {'method': '...', 'n_clusters': 7, ...}, | |
| 'lda': {'method': '...', 'n_topics': 7, ...}, | |
| ... | |
| } | |
| } | |
| ``` | |
| But `generate_comparison_report()` was still expecting the OLD format from `compare_single_method()`: | |
| ```python | |
| { | |
| 'kmeans': { | |
| 'success': True, | |
| 'results': {'method': '...', ...}, | |
| 'execution_time': 42.5 | |
| } | |
| } | |
| ``` | |
| ## Solution | |
| Updated `generate_comparison_report()` to work directly with method results without the wrapper: | |
| **Before:** | |
| ```python | |
| for method_name, result in all_results.items(): | |
| if result['success']: # β KeyError! | |
| res = result['results'] # β KeyError! | |
| n_patterns = res.get('n_clusters') | |
| ``` | |
| **After:** | |
| ```python | |
| for method_name, result in all_results.items(): | |
| n_patterns = result.get('n_clusters') or result.get('n_topics') # β Direct access | |
| quality_metrics = result.get('quality_metrics', {}) # β Direct access | |
| ``` | |
| ## Changes Made | |
| ### File: `compare_risk_discovery.py` | |
| 1. **Summary Table Generation** (lines ~245-260) | |
| - Removed `result['success']` check | |
| - Access results directly without `result['results']` wrapper | |
| - Removed execution time column (not tracked in unified comparison) | |
| 2. **Detailed Analysis** (lines ~270-330) | |
| - Removed `if not result['success']` error handling | |
| - Changed all `res.get(...)` to `result.get(...)` | |
| - Fixed pattern display for all three formats | |
| - Removed duplicate code sections | |
| ## Testing | |