# KoELECTRA Intent Classifier

업무 μžλ™ν™” μ›Œν¬ν”Œλ‘œμš° μ—μ΄μ „νŠΈ(λ“€λ“€)λ₯Ό μœ„ν•œ ν•œκ΅­μ–΄ μ˜λ„ λΆ„λ₯˜ λͺ¨λΈμž…λ‹ˆλ‹€.

μ‚¬μš©μžμ˜ μžμ—°μ–΄ μž…λ ₯을 8개 업무 μ˜λ„(intent)둜 λΆ„λ₯˜ν•©λ‹ˆλ‹€.

## Model Details

| Item | Detail |
|---|---|
| Base Model | monologg/koelectra-base-v3-discriminator |
| Architecture | ElectraForSequenceClassification |
| Parameters | 112.9M |
| Language | Korean |
| Experiment | v2_stage6 |

## Intent Labels (8 classes)

| ID | Intent | Description |
|---|---|---|
| 0 | judgment | Work judgment request (approve/reject/review) |
| 1 | doc_search | Document search |
| 2 | doc_generate | Document generation (meeting minutes, reports, etc.) |
| 3 | doc_summary | Document summarization |
| 4 | schedule_add | Add/register a schedule |
| 5 | schedule_view | View/check a schedule |
| 6 | general | General conversation/questions |
| 7 | doc_qa | Document-grounded Q&A |
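The 8-class mapping above is what the model's `id2label`/`label2id` config would expose. A minimal sketch of that mapping as plain Python (the dict contents come from the table; the constant names are illustrative):

```python
# Illustrative reconstruction of the model's id2label mapping.
ID2LABEL = {
    0: "judgment",
    1: "doc_search",
    2: "doc_generate",
    3: "doc_summary",
    4: "schedule_add",
    5: "schedule_view",
    6: "general",
    7: "doc_qa",
}

# The inverse mapping, as stored in label2id.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}
```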

## Performance

| Metric | Score |
|---|---|
| Test F1 | 97.88% |
| Adversarial F1 | 87.84% |
| Inference Speed | 7.9 ms / sample |

- Training Data: 2,425 sentences (2,327 base + 98 augmented)
- Test Data: 286 samples + 450 adversarial samples
- Label Smoothing: 0.1
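Label smoothing with ε = 0.1 mixes the one-hot training target with a uniform distribution over the 8 classes, so the true class gets 1 βˆ’ ε + ε/K and every other class gets ε/K. A minimal sketch of the resulting target distribution (pure Python, function name illustrative):

```python
# Sketch: label smoothing with eps = 0.1 over K = 8 intent classes.
EPS = 0.1
K = 8

def smooth_target(true_id, eps=EPS, k=K):
    """Return the label-smoothed target distribution for one sample."""
    uniform = eps / k                 # every class receives eps/K
    target = [uniform] * k
    target[true_id] += 1.0 - eps      # true class keeps most of the mass
    return target

t = smooth_target(4)  # true intent: schedule_add
print(round(t[4], 4), round(t[0], 4))  # 0.9125 0.0125
```

This is the same target distribution PyTorch's `nn.CrossEntropyLoss(label_smoothing=0.1)` uses internally during training.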

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "jiyong1110/koelectra-intent-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "내일 μ˜€ν›„ 3μ‹œμ— 회의 μž‘μ•„μ€˜"  # "Set up a meeting tomorrow at 3 PM"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)
    pred = torch.argmax(outputs.logits, dim=-1).item()

id2label = model.config.id2label
print(f"Intent: {id2label[pred]}")  # schedule_add
```
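To report a confidence alongside the predicted intent, the logits can be passed through a softmax. A self-contained sketch with fabricated logit values (the real values would come from `outputs.logits`):

```python
import math

LABELS = ["judgment", "doc_search", "doc_generate", "doc_summary",
          "schedule_add", "schedule_view", "general", "doc_qa"]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Fabricated logits for illustration only.
logits = [-1.2, 0.3, -0.5, -0.9, 4.1, 1.0, -0.2, -1.5]
probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)
print(LABELS[best], round(probs[best], 3))
```

A confidence threshold on `probs[best]` is a common way to route low-confidence inputs to the `general` fallback intent instead of acting on a shaky classification.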

## Training Details

7단계 μ‹€ν—˜μ„ 거쳐 μ΅œμ ν™”λœ λͺ¨λΈμž…λ‹ˆλ‹€:

  1. Stage 1: Claude + GPT-4o 기반 ν•™μŠ΅ 데이터 생성
  2. Stage 2: 3개 λͺ¨λΈ 베이슀라인 비ꡐ (BERT, KoBERT, KoELECTRA)
  3. Stage 3: 32-point ν•˜μ΄νΌνŒŒλΌλ―Έν„° κ·Έλ¦¬λ“œ μ„œμΉ˜
  4. Stage 4: μ΅œμ’… 평가 (μ λŒ€μ  ν…ŒμŠ€νŠΈ, 속도 벀치마크)
  5. Stage 5: μ—λŸ¬ 뢄석 및 νƒ€κ²Ÿ 증강
  6. Stage 6: Label smoothing 적용
  7. Stage 7: μ‹œλ‚˜λ¦¬μ˜€ ν…ŒμŠ€νŠΈ (100 samples)
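A 32-point grid like the one Stage 3 describes is typically the Cartesian product of a few small hyperparameter ranges. The specific values below are assumptions for illustration, not the actual search space:

```python
from itertools import product

# Assumed search space: 4 learning rates x 2 batch sizes
# x 2 epoch counts x 2 warmup ratios = 32 grid points.
grid = list(product(
    [1e-5, 2e-5, 3e-5, 5e-5],   # learning rate
    [16, 32],                   # batch size
    [3, 5],                     # epochs
    [0.0, 0.1],                 # warmup ratio
))
print(len(grid))  # 32
```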

## Project

SKN21-FINAL-3TEAM β€” WorkFlow Agent (λ“€λ“€)

The Intent Classification module of a LangGraph-based multi-agent work automation system.
