# Priority Post-processing Rules

A post-processing system that adjusts priorities by applying keyword-based rules to the model's raw scores.

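## Overview

The model is a regression model that predicts a score, but in practice we recommend applying the following post-processing steps:

1. **Apply keyword-based rules**: force-adjust the priority when specific keywords are present
2. **Relative normalization within a batch**: normalize scores within the batch when several issues are compared together
3. **Relative classification**: classify as HIGH/MED/LOW based on the upper/lower percentiles within the batch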

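## Rule Types

### 1. LOW forced keywords
If the issue text contains any keyword listed in `low_forced_keywords`, it is always classified as LOW priority.

Examples:
- "README 오타 수정" (fix README typo) → LOW
- "문서 업데이트" (documentation update) → LOW
- "typo fix" → LOW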

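### 2. Minimum MED guarantee keywords
If the issue text contains any keyword listed in `min_med_keywords`, its priority is guaranteed to be at least MED.

Examples:
- "로그인 에러 발생" (login error occurred) → at least MED
- "서버 다운 문제" (server down problem) → at least MED
- "결제 오류" (payment error) → at least MED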

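### 3. HIGH boost keywords
If the issue text contains any keyword listed in `high_boost_keywords`, it is boosted to HIGH priority.

Examples:
- "데이터 손실 발생" (data loss occurred) → HIGH
- "무한 루프 재발" (infinite loop recurred) → HIGH
- "critical security issue" → HIGH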

## Usage

### Python example

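```python
import yaml

# Load the rules
with open("postprocess/priority_rules.yaml", "r", encoding="utf-8") as f:
    rules = yaml.safe_load(f)

# Explicit ordering for comparing labels; plain string comparison
# would rank "MED" above "HIGH"
PRIORITY_RANK = {"LOW": 0, "MED": 1, "HIGH": 2}

# Issue text, and the label derived from the model's raw score (placeholder value)
issue_text = "로그인 에러 발생, 사용자 접근 불가"
model_priority = "LOW"

# Lowercase both the text and the keywords so matching is case-insensitive
text_lower = issue_text.lower()

def has_keyword(key):
    return any(kw.lower() in text_lower for kw in rules[key])

# LOW forced keywords are checked first
if has_keyword("low_forced_keywords"):
    priority = "LOW"
elif has_keyword("high_boost_keywords"):
    priority = "HIGH"
elif has_keyword("min_med_keywords"):
    # Guarantee at least MED even when the model's label is lower
    priority = max(model_priority, "MED", key=PRIORITY_RANK.get)
else:
    priority = model_priority  # use the model's prediction as-is
```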

### Batch processing example

```python
import numpy as np
from scipy.stats import rankdata

def apply_postprocessing(issues, scores, rules):
    """
    Apply the post-processing rules within a batch.

    Note: apply_keyword_rules is not defined in this README; it is assumed
    to exist elsewhere and to return one adjusted score per issue.
    """
    # 1. Apply keyword-based rules
    adjusted_scores = apply_keyword_rules(issues, scores, rules)

    # 2. Normalization (quantile): average rank divided by batch size
    if rules["normalize_method"] == "quantile":
        normalized_scores = rankdata(adjusted_scores, method='average') / len(adjusted_scores)
    else:
        normalized_scores = np.asarray(adjusted_scores, dtype=float)

    # 3. Relative classification (percentiles are stored as fractions, e.g. 0.8)
    q_high = np.percentile(normalized_scores, rules["high_percentile"] * 100)
    q_low = np.percentile(normalized_scores, rules["low_percentile"] * 100)

    priorities = []
    for score in normalized_scores:
        if score >= q_high:
            priorities.append("HIGH")
        elif score <= q_low:
            priorities.append("LOW")
        else:
            priorities.append("MED")

    return priorities, normalized_scores
```
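
The batch example calls `apply_keyword_rules`, which this README does not show. As a rough sketch only, one possible shape is given below. The adjustment strategy (pushing forced-LOW issues below the batch minimum, boosted issues above the batch maximum, and lifting minimum-MED issues to the batch median) is an assumption made for illustration, not the project's documented behavior.

```python
from statistics import median

def apply_keyword_rules(issues, scores, rules):
    """Hypothetical helper: adjust raw scores according to the keyword rules.

    The strategy used here is an assumption for illustration only.
    """
    if len(scores) == 0:
        return []
    lo, hi, mid = min(scores), max(scores), median(scores)
    margin = (hi - lo) or 1.0  # avoid a zero margin when all scores are equal

    def has_any(text, key):
        text = text.lower()
        return any(kw.lower() in text for kw in rules.get(key, []))

    adjusted = []
    for text, score in zip(issues, scores):
        if has_any(text, "low_forced_keywords"):
            adjusted.append(lo - margin)      # below every raw score
        elif has_any(text, "high_boost_keywords"):
            adjusted.append(hi + margin)      # above every raw score
        elif has_any(text, "min_med_keywords"):
            adjusted.append(max(score, mid))  # at least the batch median
        else:
            adjusted.append(score)            # unchanged
    return adjusted

# Illustrative keywords and scores (not taken from priority_rules.yaml)
rules = {
    "low_forced_keywords": ["typo"],
    "high_boost_keywords": ["critical"],
    "min_med_keywords": ["error"],
}
issues = ["Typo fix", "critical security issue", "login error", "misc"]
scores = [0.9, 0.1, 0.2, 0.5]
adjusted = apply_keyword_rules(issues, scores, rules)
```

Because the later steps are rank-based, only the relative order of the adjusted scores matters. A score adjustment alone does not strictly guarantee a final label, so a stricter variant could override the labels after classification instead.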

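## Customizing the Rules

Edit `priority_rules.yaml` to add or remove keywords to suit your project.

Example:
```yaml
# Add project-specific keywords
min_med_keywords:
  - 우리회사특화키워드  # placeholder meaning "our company-specific keyword"
  - critical-path
  - production-issue
```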
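
After editing the rules file, a quick sanity check can catch mistakes before they reach the classifier. The sketch below is a hypothetical `validate_rules` helper, not part of the project. It checks only the keys used in the examples in this README and assumes the percentiles are stored as fractions between 0 and 1, as the `* 100` in the batch example suggests.

```python
REQUIRED_LIST_KEYS = ("low_forced_keywords", "min_med_keywords", "high_boost_keywords")

def validate_rules(rules):
    """Return a list of problems found; an empty list means the rules look usable."""
    problems = []
    for key in REQUIRED_LIST_KEYS:
        value = rules.get(key)
        if not isinstance(value, list) or not all(isinstance(kw, str) for kw in value):
            problems.append(f"{key} must be a list of strings")
    low = rules.get("low_percentile")
    high = rules.get("high_percentile")
    if not (isinstance(low, (int, float)) and isinstance(high, (int, float))):
        problems.append("low_percentile and high_percentile must be numbers")
    elif not (0.0 <= low < high <= 1.0):
        problems.append("expected 0 <= low_percentile < high_percentile <= 1")
    return problems

# Illustrative values only; the real defaults live in priority_rules.yaml
example_rules = {
    "low_forced_keywords": ["typo"],
    "min_med_keywords": ["critical-path", "production-issue"],
    "high_boost_keywords": ["critical"],
    "low_percentile": 0.2,
    "high_percentile": 0.8,
}
print(validate_rules(example_rules))
```

The dictionary passed in would normally be the result of `yaml.safe_load` on the rules file.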

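## Notes

- Keyword matching is case-insensitive (text is lowercased before comparison)
- LOW forced keywords are applied first and take precedence over the other keyword rules
- If a HIGH boost keyword is present, the minimum-MED guarantee is satisfied automatically
- Batch normalization is only meaningful when several issues are compared together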
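
To see why batch normalization needs more than one issue, consider a dependency-free sketch of the quantile step. The function name `quantile_normalize` is made up for this illustration. It computes the average rank divided by the batch size, which is what the `rankdata(..., method='average') / len(...)` expression in the batch example does.

```python
def quantile_normalize(scores):
    """Rank-based normalization: average rank (ties share a rank) divided by batch size."""
    n = len(scores)
    normalized = []
    for s in scores:
        below = sum(1 for x in scores if x < s)
        equal = sum(1 for x in scores if x == s)
        avg_rank = below + (equal + 1) / 2  # 1-based average rank among ties
        normalized.append(avg_rank / n)
    return normalized

# A single-issue batch normalizes to 1.0 whatever the raw score was
single_low = quantile_normalize([0.1])
single_high = quantile_normalize([0.9])

# With several issues the normalized values reflect relative order
batch = quantile_normalize([0.2, 0.8, 0.5])
```

Here `single_low` and `single_high` are both `[1.0]`, so a lone issue always lands at the top of its own batch and the percentile thresholds carry no information.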