Korean PII NER v3 (klue/roberta-large fine-tuned)

NER model for Korean PII (Personally Identifiable Information) guardrails. It performs 7-label BIO classification over 3 entity types: NAME, ADDRESS, and ORG.

Quick Start

from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

tokenizer = AutoTokenizer.from_pretrained("vmaca123/korean-pii-ner-v3")
model = AutoModelForTokenClassification.from_pretrained("vmaca123/korean-pii-ner-v3")

ner = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
print(ner("저는 홍길동이고 서울시 강남구 테헤란로 152에 거주합니다. 삼성전자 소속입니다."))
# [{'entity_group': 'NAME', 'start': 3, 'end': 6, 'word': '홍길동', ...},
#  {'entity_group': 'ADDRESS', 'start': 9, 'end': 25, 'word': '서울시 강남구 테헤란로 152', ...},
#  {'entity_group': 'ORG', 'start': 34, 'end': 38, 'word': '삼성전자', ...}]
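
Because each aggregated span carries character offsets, masking the detected PII is a matter of string slicing. The sketch below is a minimal illustration, assuming entities shaped like the pipeline output above; `mask_entities` and the `[LABEL]` placeholder format are hypothetical and not part of this model or of transformers.

```python
def mask_entities(text, entities):
    """Replace each detected span with a [LABEL] placeholder.

    `entities` is a list of dicts with 'entity_group', 'start' and 'end'
    keys, shaped like aggregated token-classification pipeline output.
    """
    masked = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        masked = masked[: ent["start"]] + f"[{ent['entity_group']}]" + masked[ent["end"]:]
    return masked


text = "저는 홍길동이고 서울시 강남구 테헤란로 152에 거주합니다. 삼성전자 소속입니다."
# Hand-written spans standing in for real pipeline output.
entities = [
    {"entity_group": "NAME", "start": 3, "end": 6},
    {"entity_group": "ADDRESS", "start": 9, "end": 25},
    {"entity_group": "ORG", "start": 34, "end": 38},
]
print(mask_entities(text, entities))
# 저는 [NAME]이고 [ADDRESS]에 거주합니다. [ORG] 소속입니다.
```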

Labels (BIO, 7 classes)

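| ID | Label | Maps to v0.2 EntityType |
|----|-------|-------------------------|
| 0 | O | (non-entity) |
| 1 | B-NAME | PERSON_NAME |
| 2 | I-NAME | PERSON_NAME |
| 3 | B-ADDRESS | ADDRESS_FULL |
| 4 | I-ADDRESS | ADDRESS_FULL |
| 5 | B-ORG | ORGANIZATION |
| 6 | I-ORG | ORGANIZATION |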
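
To make the BIO scheme concrete, the sketch below groups a sequence of per-token label IDs into typed spans using the mapping in the table above. `decode_bio` is a hypothetical helper written for illustration; its lenient handling of a stray I- tag is an assumption, and the pipeline's `aggregation_strategy` already does comparable grouping for you.

```python
ID2LABEL = {0: "O", 1: "B-NAME", 2: "I-NAME", 3: "B-ADDRESS",
            4: "I-ADDRESS", 5: "B-ORG", 6: "I-ORG"}
ENTITY_TYPE = {"NAME": "PERSON_NAME", "ADDRESS": "ADDRESS_FULL", "ORG": "ORGANIZATION"}


def decode_bio(label_ids):
    """Group per-token label IDs into (entity_type, start, end) token spans.

    `end` is exclusive. An I- tag that does not continue a span of the same
    type is treated as the start of a new span (a lenient choice).
    """
    spans, current = [], None  # current = [entity_type, start, end]
    for i, label_id in enumerate(label_ids):
        label = ID2LABEL[label_id]
        if label == "O":
            if current:
                spans.append(tuple(current))
            current = None
            continue
        prefix, name = label.split("-", 1)
        if prefix == "I" and current and current[0] == ENTITY_TYPE[name]:
            current[2] = i + 1  # extend the open span
        else:
            if current:
                spans.append(tuple(current))
            current = [ENTITY_TYPE[name], i, i + 1]
    if current:
        spans.append(tuple(current))
    return spans


print(decode_bio([0, 1, 2, 0, 3, 4, 4, 0, 5]))
# [('PERSON_NAME', 1, 3), ('ADDRESS_FULL', 4, 7), ('ORGANIZATION', 8, 9)]
```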

Structured PII such as PHONE / EMAIL / RRN / CREDIT_CARD is outside this model's scope and is left to regex and dictionary rules.

Training data

| Source | Count | License |
|--------|-------|---------|
| KLUE-NER train | 21,008 | CC-BY-SA |
| Faker-ko baseline (real admin divisions) | 10,000 | self-generated |
| Faker conjunctive composite | 2,000 | self-generated |
| Hard negatives (하늘 "sky" / 사랑 "love" / 대표번호 "main phone number" / 예시번호 "example number") | 1,000 | self-generated |
| Total train pool | 34,008 | |

Data split: 8:1:1 (train 27,206 / val 3,401 / test 3,401). The 5,000 KLUE-NER validation sentences are excluded from training and used only for external evaluation.
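
The split sizes follow directly from the pool size. The sketch below reproduces them, assuming a plain shuffled split with the training share rounded down and the remainder halved; the seed, the rounding rule, and the `split_811` helper are assumptions, since the card does not describe the actual split procedure.

```python
import random


def split_811(items, seed=42):
    """Shuffle and split into train/val/test at roughly 8:1:1."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * 0.8)
    n_val = (len(items) - n_train) // 2
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


train, val, test = split_811(range(34_008))
print(len(train), len(val), len(test))  # 27206 3401 3401
```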

Training details

  • Base: klue/roberta-large (335M params)
  • Phase 1: encoder frozen, classifier head only, 1 epoch, LR 5e-4
  • Phase 2: all layers unfrozen, 5 epochs, LR 2e-5, warmup ratio 0.1, weight decay 0.01
  • Batch: 16, max_length 128, fp16
  • Hardware: RTX 3090 24GB (Vast.ai)
  • Wall clock: ~30 minutes
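
For a sense of scale, the Phase 2 settings translate into the following step counts. This is a back-of-the-envelope sketch, assuming a single GPU, no gradient accumulation, a kept final partial batch, and a linear warmup followed by linear decay; the card states the warmup ratio but not the schedule shape, so `lr_at` is an assumption.

```python
import math

TRAIN_SIZE = 27_206   # train split size reported in the data split
BATCH_SIZE = 16
EPOCHS = 5            # Phase 2
PEAK_LR = 2e-5
WARMUP_RATIO = 0.1

steps_per_epoch = math.ceil(TRAIN_SIZE / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = int(total_steps * WARMUP_RATIO)


def lr_at(step):
    """Linear warmup to PEAK_LR, then linear decay to zero (assumed schedule)."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    return PEAK_LR * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


print(steps_per_epoch, total_steps, warmup_steps)  # 1701 8505 850
```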

Evaluation results

| Eval set | macro-F1 | micro-F1 | Size |
|----------|----------|----------|------|
| Internal val | 0.872 | 0.880 | 3,401 |
| Internal test | 0.878 | 0.887 | 3,401 |
| KLUE-NER val (external) | 0.766 | 0.792 | 5,000 |
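
Macro-F1 averages the per-entity F1 scores, while micro-F1 pools the counts first, so frequent entity types dominate it. The sketch below shows the difference on made-up counts; the numbers are hypothetical, and whether the reported scores are span-level or token-level is not stated in this card.

```python
def f1(tp, fp, fn):
    """F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


# Hypothetical counts per entity type: (TP, FP, FN).
counts = {"NAME": (90, 10, 10), "ADDRESS": (40, 10, 10), "ORG": (20, 20, 20)}

# Macro: average the per-type F1 scores.
macro_f1 = sum(f1(*c) for c in counts.values()) / len(counts)
# Micro: sum the counts across types, then compute a single F1.
tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
micro_f1 = f1(tp, fp, fn)

print(round(macro_f1, 3), round(micro_f1, 3))  # 0.733 0.789
```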

Iteration comparison (6 training runs)

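| Run | Base | Data | Internal test (macro-F1) | KLUE val (macro-F1) |
|-----|------|------|--------------------------|---------------------|
| 1 | bert-base, 2ep | 31k | 0.776 | 0.630 |
| 2 | bert-base, 5ep | 31k | 0.798 | 0.669 |
| 3 | roberta-base, 5ep | 31k | 0.830 | 0.697 |
| 4 (v1) | roberta-large, 5ep | 31k | 0.865 | 0.764 |
| 5 (v2) | roberta-large + Naver/WikiAnn | 139k | 0.708 | 0.664 ❌ |
| 6 (v3) | roberta-large + augment | 34k | 0.878 | 0.766 ★ |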

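Lesson from the v2 failure: merging Naver NER (90k) and WikiAnn (20k) lowered KLUE val by 10 percentage points (0.764 to 0.664). The causes were noise from converting eojeol-level (space-delimited word) labels to character level, plus multi-source distribution shift. v3 takes a conservative approach: it keeps the v1 setup unchanged and adds only composite and hard-negative augmentation.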
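
One way such conversion noise can arise: when a tag is attached to a whole eojeol, a naive expansion to character level also labels the attached particle as part of the entity. The sketch below illustrates this with a hypothetical `eojeol_to_char_bio` helper; it is not the conversion script used for v2, which this card does not show.

```python
def eojeol_to_char_bio(eojeols, tags):
    """Expand one tag per eojeol (space-delimited word) to one BIO tag per character.

    Spaces between eojeols are tagged "O". The whole eojeol inherits its tag,
    so any attached particle is labelled as part of the entity.
    """
    chars, labels = [], []
    for k, (word, tag) in enumerate(zip(eojeols, tags)):
        if k > 0:
            chars.append(" ")
            labels.append("O")
        for j, ch in enumerate(word):
            chars.append(ch)
            if tag == "O":
                labels.append("O")
            else:
                labels.append(("B-" if j == 0 else "I-") + tag)
    return chars, labels


# "홍길동이고" = the name 홍길동 plus the connective particle 이고.
chars, labels = eojeol_to_char_bio(["저는", "홍길동이고"], ["O", "NAME"])
print(labels)
# ['O', 'O', 'O', 'B-NAME', 'I-NAME', 'I-NAME', 'I-NAME', 'I-NAME']
```

The last two I-NAME tags fall on the particle rather than the name, which is the kind of boundary noise a character-level model then learns from.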

Sample outputs

Input NER output
"์•ˆ๋…•ํ•˜์„ธ์š”, ์ €๋Š” ํ™๊ธธ๋™์ด๊ณ  ์„œ์šธ์‹œ ๊ฐ•๋‚จ๊ตฌ ํ…Œํ—ค๋ž€๋กœ 152์— ๊ฑฐ์ฃผํ•ฉ๋‹ˆ๋‹ค. ์‚ผ์„ฑ์ „์ž ์†Œ์†์ž…๋‹ˆ๋‹ค." NAME=ํ™๊ธธ๋™, ADDR=์„œ์šธ์‹œ ๊ฐ•๋‚จ๊ตฌ ํ…Œํ—ค๋ž€๋กœ 152, ORG=์‚ผ์„ฑ์ „์ž
"์ €๋Š” ๋ฐ•์ •ํฌ์ด๊ณ  ๋ถ€์‚ฐ๊ด‘์—ญ์‹œ ํ•ด์šด๋Œ€๊ตฌ์— ๊ฑฐ์ฃผํ•˜๋ฉฐ LG์ „์ž ์†Œ์†์ž…๋‹ˆ๋‹ค." NAME + ADDR + ORG
"์˜ค๋Š˜ ํ•˜๋Š˜์ด ๋ง‘๋„ค์š”." (no spans) โœ“ hard negative
"์‚ฌ๋ž‘์€ ์ค‘์š”ํ•œ ๊ฐ€์น˜์ž…๋‹ˆ๋‹ค." (no spans) โœ“ hard negative
"์˜ˆ์‹œ ์ „ํ™”๋ฒˆํ˜ธ๋Š” 010-0000-0000์ž…๋‹ˆ๋‹ค." (no spans) โœ“ NER scope ๋ฐ–

Limitations & known issues

  • Conjunctive patterns: cases like "{name}이고 {address}" (a name joined to an address by the connective 이고) were reinforced with training data in v3, but boundary errors are still possible in some variants, such as a place name adjacent to a title or honorific.
  • ADDRESS_UNIT not supported: units such as "101동 1203호" (building 101, unit 1203) are split off in dictionary post-processing; the NER model emits only ADDRESS_FULL.
  • SCHOOL/HOSPITAL not distinguished: both are emitted under the single ORG label and reclassified as SCHOOL or HOSPITAL by dictionary lookup.
  • Weak transfer to external domains: macro-F1 is 0.766 on KLUE val vs 0.878 on the internal test set. Domain-specific fine-tuning is recommended (medical, financial, legal, etc.).
  • PHONE/EMAIL/RRN not supported: structured PII is handled by regex.
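
The dictionary post-processing mentioned above could look roughly like the sketch below. The suffix lists, the unit pattern, and both helper functions are hypothetical stand-ins; the actual dictionaries live in the companion guardrail project and are not reproduced here.

```python
import re

# Hypothetical suffix lists; a real dictionary would be much larger.
SCHOOL_SUFFIXES = ("대학교", "고등학교", "중학교", "초등학교")
HOSPITAL_SUFFIXES = ("병원", "의원")
# Building/unit pattern such as "101동 1203호" (assumed form).
UNIT_RE = re.compile(r"\d+동\s*\d+호")


def refine_org(span_text):
    """Map an ORG span to SCHOOL / HOSPITAL when its suffix is in the dictionary."""
    if span_text.endswith(SCHOOL_SUFFIXES):
        return "SCHOOL"
    if span_text.endswith(HOSPITAL_SUFFIXES):
        return "HOSPITAL"
    return "ORGANIZATION"


def split_address_unit(span_text):
    """Return (ADDRESS_FULL remainder, ADDRESS_UNIT or None) for an address span."""
    m = UNIT_RE.search(span_text)
    if not m:
        return span_text, None
    remainder = (span_text[: m.start()] + span_text[m.end():]).strip()
    return remainder, m.group()


print(refine_org("서울대학교"), refine_org("삼성전자"))
print(split_address_unit("서울시 강남구 테헤란로 152 101동 1203호"))
```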

Intended use

  • Detecting PII candidates (PERSON_NAME / ADDRESS_FULL / ORGANIZATION) in Korean text
  • Serving as the NER detector module of the v0.2 guardrail pipeline (Korean PII Guardrail v0.2)
  • Production: use together with the regex, dictionary, context scorer, and boundary corrector components (NER alone falls short of the 99% target)
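
When NER runs alongside rule-based detectors, their spans have to be reconciled. The sketch below shows one simple policy, in which rule-based spans win whenever they overlap an NER span. The `merge_spans` helper and this priority order are assumptions for illustration; the guardrail pipeline's actual merging logic is not described in this card.

```python
def merge_spans(ner_spans, rule_spans):
    """Combine spans from NER and rule-based detectors.

    Spans are (label, start, end) tuples with an exclusive end. Rule spans
    win on overlap; this priority is an assumption for illustration.
    """
    merged = list(rule_spans)
    for span in ner_spans:
        _, start, end = span
        overlaps = any(start < r_end and r_start < end
                       for _, r_start, r_end in rule_spans)
        if not overlaps:
            merged.append(span)
    return sorted(merged, key=lambda s: s[1])


ner_spans = [("PERSON_NAME", 3, 6), ("ORGANIZATION", 20, 33)]
rule_spans = [("PHONE", 22, 35)]
print(merge_spans(ner_spans, rule_spans))
# [('PERSON_NAME', 3, 6), ('PHONE', 22, 35)]
```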

Out of scope

  • Structured PII (phone numbers, email addresses, resident registration numbers, etc.): use regex and validators instead
  • Context tracking in multi-turn or RAG settings: outside the single-turn scope of v0.2
  • Domain-specific text such as medical charts and legal documents (performance degradation expected)

License

CC-BY-SA-4.0 (inherited from the KLUE base model license).

Citation

@misc{kimminwoo2026koreanpiinerv3,
  title={Korean PII NER v3: klue/roberta-large fine-tuned for PII guardrails},
  author={Kim, Minwoo},
  year={2026},
  url={https://huggingface.co/vmaca123/korean-pii-ner-v3}
}

Companion project

This model is the NER detector module of the Korean PII Guardrail v0.2 project.

  • Wrapper code: PII/ner/ner_wrapper.py (conforms to the v0.2 BaseNERDetector Protocol)
  • Design doc: korean_pii_guardrail_v0_2/docs/14_NER_DESIGN_v1.md
  • Training results: PII/ner/TRAINING_RESULTS_v3.md