---
library_name: transformers
license: apache-2.0
base_model: monologg/koelectra-base-v3-discriminator
tags:
- intent-classification
- korean
- koelectra
- generated_from_trainer
metrics:
- accuracy
- f1
model-index:
- name: koelectra_intent_model
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# koelectra_intent_model

This model is a fine-tuned version of [monologg/koelectra-base-v3-discriminator](https://huggingface.co/monologg/koelectra-base-v3-discriminator) on the custom-intent-dataset dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9360
- Accuracy: 0.9885
- F1: 0.9884

## Model description

# 📋 모델 카드 (Model Card)

## 모델 정보

### 기본 정보
- **모델명**: Intent Classifier KoELECTRA Fine-tuned
- **모델 ID**: `kakao1513/koelectra_intent_model`
- **기본 모델**: `monologg/koelectra-base-v3-discriminator`
- **작업**: 텍스트 분류 (Text Classification)
- **언어**: 한국어 (Korean)

### 모델 개요
이 모델은 사용자의 의도를 분류하기 위해 KoELECTRA 모델을 한국어 의도 분류 데이터셋으로 미세 조정(fine-tuning)한 것입니다. 쇼핑몰, 회원가입, 로그인 등 웹 애플리케이션의 사용자 행동 의도를 35개의 클래스로 분류합니다.

---

## 훈련 데이터

### 데이터셋 통계
| 항목 | 값 |
|------|-----|
| **총 데이터 수** | 7,084 |
| **훈련 데이터** | 5,698 (80%) |
| **테스트 데이터** | 1,386 (20%) |
| **의도 클래스 수** | 35개 |

### 주요 의도 클래스 (예시)
| 의도 | 설명 | 샘플 수 |
|------|------|--------|
| `unknown` | 무관/일상잡담 | 748 |
| `go_mall` | 쇼핑몰로 이동 | 220 |
| `go_coupang` | 쿠팡으로 이동 | 220 |
| `click_login` | 로그인 | 220 |
| `click_signup` | 회원가입 클릭 | 220 |
| ... | 그 외 30개 의도 | - |

---

## 훈련 설정

### 하이퍼파라미터
```python
학습률: 2e-5
배치 크기: 32
에포크: 5
최대 시퀀스 길이: 64
가중치 감소: 0.01
라벨 스무딩: 0.1
옵티마이저: AdamW
```

### 훈련 결과
| Epoch | Validation Loss | Accuracy | F1 Score |
|-------|-----------------|----------|----------|
| 1 | 2.651761 | 71.63% | 0.6689 |
| 2 | 1.768677 | 92.35% | 0.9065 |
| 3 | 1.241083 | 97.99% | 0.9797 |
| 4 | 0.999594 | 98.91% | 0.9890 |
| 5 | 0.936003 | 98.85% | 0.9884 |

**최종 성능 (테스트 셋)**
- **정확도 (Accuracy)**: 98.85%
- **F1 점수 (Weighted)**: 0.9884

---

## 사용 방법

### 설치
```bash
pip install transformers torch
```

### 기본 사용법
```python
from transformers import pipeline

# 모델 로드
classifier = pipeline("text-classification", 
                     model="smj1513/intent-classifier-koElectra-finetuned")

# 예측 실행
text = "쇼핑몰 사이트로 이동 할까 말까 할게"
result = classifier(text)[0]

print(f"의도: {result['label']}")
print(f"확신도: {result['score']:.4f}")
```

### 상세 사용법
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# 모델 및 토크나이저 로드
model_name = "smj1513/intent-classifier-koElectra-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# 텍스트 전처리
text = "로그인 페이지로 가줘"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=64)

# 예측
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    predicted_class_id = logits.argmax().item()
    confidence = torch.softmax(logits, dim=-1)[0][predicted_class_id].item()

print(f"예측 클래스: {model.config.id2label[predicted_class_id]}")
print(f"신뢰도: {confidence:.4f}")
```

---

## 성능 분석

### 강점
✅ **높은 정확도**: 98.85%의 테스트 정확도로 매우 우수한 성능  
✅ **균형잡힌 F1 점수**: 0.9884의 F1 점수로 정밀도와 재현율의 균형 유지  
✅ **빠른 추론**: GPU에서 약 94개/초의 처리 속도  
✅ **한국어 특화**: KoELECTRA를 사용한 효율적인 한국어 처리  

### 주의사항
⚠️ **도메인 특화**: 쇼핑몰/회원관리 도메인에 최적화되어 있음  
⚠️ **토큰 길이 제한**: 최대 64 토큰으로 제한 (긴 문장은 활용 제한적)  
⚠️ **미지 의도**: `unknown` 클래스로 분류되는 일상 잡담이 포함됨  

---

## 기술 사항

### 모델 아키텍처
- **모델 크기**: ELECTRA Base
- **파라미터 수**: ~110M
- **출력 레이어**: 선형 분류 헤드 (35개 클래스)

### 입출력 명세
- **입력**: 최대 64 토큰 길이의 한국어 텍스트
- **출력**: 35개 의도 클래스 중 확률이 가장 높은 클래스 및 신뢰도

---

## 제한사항 및 권장사항

### 적용 가능 도메인
- ✅ 쇼핑몰/전자상거래 시스템
- ✅ 회원가입/로그인 의도 분류
- ✅ 웹/모바일 애플리케이션 사용자 명령

### 부적절한 사용 사례
- ❌ 의료, 법률 등 고위험 도메인
- ❌ 실시간 음성 인식 (이 모델은 텍스트 기반)
- ❌ 다른 언어 또는 도메인의 의도 분류

### 성능 개선 팁
1. **맥락 추가**: 긴 문장은 요약하여 64토큰 이내로 유지
2. **후처리**: 신뢰도가 낮은 경우(< 0.7) 사람의 검토 권장
3. **재훈련**: 새로운 의도 클래스 추가 시 모델 재훈련

---

## 라이선스 및 출처

- **기본 모델 라이선스**: MIT (KoELECTRA)
- **모델 공개**: Hugging Face Model Hub
- **사용 라이선스**: MIT

---

## 인용 정보

이 모델을 사용하는 경우 다음과 같이 인용해주세요:

```bibtex
@misc{intent-classifier-koelectra,
  author = {Your Name},
  title = {Intent Classifier KoELECTRA Fine-tuned},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/smj1513/intent-classifier-koElectra-finetuned}
}
```

---

## 연락처 및 지원

문제가 발생하거나 피드백이 있으시면 Hugging Face 모델 페이지에서 Issues를 제출해주세요.

**마지막 업데이트**: 2026년 2월 11일

### Framework versions

- Transformers 5.1.0
- Pytorch 2.9.1+cu128
- Datasets 4.5.0
- Tokenizers 0.22.2