# Fix for KeyError: 'success' in Risk Discovery Comparison

## Problems Fixed

### Issue 1: NMF Parameter Error βœ…
**Error:** `TypeError: NMF.__init__() got an unexpected keyword argument 'alpha'`
**Fixed in:** `risk_discovery_alternatives.py`

### Issue 2: KeyError 'method' βœ…
**Error:** `KeyError: 'method'` when comparing methods
**Fixed in:** `risk_discovery.py` - Added consistent return format

### Issue 3: KeyError 'success' βœ…
**Error:** `KeyError: 'success'` in generate_comparison_report
**Fixed in:** `compare_risk_discovery.py` - Updated to handle direct results format

## Root Cause Analysis

The comparison pipeline had evolved to use a unified `compare_risk_discovery_methods()` function that returns:
```python
{
    'summary': {...},
    'detailed_results': {
        'kmeans': {'method': '...', 'n_clusters': 7, ...},
        'lda': {'method': '...', 'n_topics': 7, ...},
        ...
    }
}
```

But `generate_comparison_report()` was still expecting the OLD format from `compare_single_method()`:
```python
{
    'kmeans': {
        'success': True,
        'results': {'method': '...', ...},
        'execution_time': 42.5
    }
}
```
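
The mismatch is easy to reproduce in isolation. The dict below mimics a single entry of the new `detailed_results` format; the old report code's very first access fails because the `'success'`/`'results'` wrapper no longer exists:

```python
# One entry of the new unified format: method results with no wrapper.
new_style_result = {'method': 'kmeans', 'n_clusters': 7}

try:
    new_style_result['success']   # old report code's first access
except KeyError as exc:
    print('KeyError:', exc)       # KeyError: 'success'

# The fields themselves are right there at the top level.
print(new_style_result.get('n_clusters'))  # 7
```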

## Solution

Updated `generate_comparison_report()` to work directly with method results without the wrapper:

**Before:**
```python
for method_name, result in all_results.items():
    if result['success']:                    # ❌ KeyError!
        res = result['results']              # ❌ KeyError!
        n_patterns = res.get('n_clusters')
```

**After:**
```python
for method_name, result in all_results.items():
    n_patterns = result.get('n_clusters') or result.get('n_topics')  # βœ… Direct access
    quality_metrics = result.get('quality_metrics', {})              # βœ… Direct access
```
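
Put together, the fixed reporting loop can be sketched end to end. The `detailed_results` dict below mirrors the unified format shown earlier; metric names such as `silhouette` and `coherence` are illustrative, not taken from the actual output:

```python
# Stand-in for compare_risk_discovery_methods()['detailed_results'].
detailed_results = {
    'kmeans': {'method': 'K-Means', 'n_clusters': 7,
               'quality_metrics': {'silhouette': 0.42}},
    'lda':    {'method': 'LDA', 'n_topics': 7,
               'quality_metrics': {'coherence': 0.51}},
}

rows = []
for method_name, result in detailed_results.items():
    # Direct access: no 'success'/'results' wrapper to unpack, and .get()
    # tolerates the per-method naming (n_clusters vs. n_topics).
    n_patterns = result.get('n_clusters') or result.get('n_topics')
    quality_metrics = result.get('quality_metrics', {})
    rows.append((method_name, n_patterns, quality_metrics))

for name, n, metrics in rows:
    print(f"{name}: {n} patterns, metrics={metrics}")
```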

## Changes Made

### File: `compare_risk_discovery.py`

1. **Summary Table Generation** (lines ~245-260)
   - Removed `result['success']` check
   - Access results directly without `result['results']` wrapper
   - Removed execution time column (not tracked in unified comparison)

2. **Detailed Analysis** (lines ~270-330)
   - Removed `if not result['success']` error handling
   - Changed all `res.get(...)` to `result.get(...)`
   - Fixed pattern display for all three formats
   - Removed duplicate code sections

## Testing