KLUE-RoBERTa News Article Company Sentiment Analysis Model

๋ชจ๋ธ ์„ค๋ช…

์ด ๋ชจ๋ธ์€ ๋‰ด์Šค ๊ธฐ์‚ฌ ์† ํŠน์ • ๊ธฐ์—…์— ๋Œ€ํ•œ ๊ฐ์ •(ํ˜ธ์žฌ/์•…์žฌ/์ค‘๋ฆฝ)์„ ๋ถ„์„ํ•˜๊ธฐ ์œ„ํ•ด fine-tuning๋œ KLUE-RoBERTa ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.

Intended Uses

  • ๋‰ด์Šค ๊ธฐ์‚ฌ์—์„œ ํŠน์ • ๊ธฐ์—…์— ๋Œ€ํ•œ ๊ธ์ •/๋ถ€์ •/์ค‘๋ฆฝ ๊ฐ์ • ์ž๋™ ๋ถ„๋ฅ˜
  • ๊ธˆ์œต ๋‰ด์Šค ๊ฐ์ • ๋ถ„์„
  • ๊ธฐ์—… ํ‰ํŒ ๋ชจ๋‹ˆํ„ฐ๋ง

๋ ˆ์ด๋ธ”

  • 0: negative - unfavorable content about the company
  • 1: neutral - neutral content about the company
  • 2: positive - favorable content about the company
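The label scheme above can be captured as a small lookup table; a minimal sketch (the ID2LABEL/LABEL2ID names and the decode_label helper are illustrative, not part of the model's API):

```python
# Class-index-to-label mapping as documented above.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

def decode_label(class_id: int) -> str:
    """Return the sentiment label for a predicted class index."""
    return ID2LABEL[class_id]

print(decode_label(2))  # positive
```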

Performance

Metric       Score
Accuracy     0.8426
F1-Macro     0.8468
F1-Weighted  0.8422
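The metric names suggest scikit-learn-style macro/weighted F1 averaging (an assumption); a minimal sketch on toy labels, not the actual evaluation set:

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy gold/predicted labels for illustration only (0=negative, 1=neutral, 2=positive).
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)
f1_macro = f1_score(y_true, y_pred, average="macro")
f1_weighted = f1_score(y_true, y_pred, average="weighted")
print(f"acc={accuracy:.4f} f1_macro={f1_macro:.4f} f1_weighted={f1_weighted:.4f}")
```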

ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ (Optuna๋กœ ์ตœ์ ํ™”)

{
  "learning_rate": 9.78310992630157e-06,
  "num_train_epochs": 8,
  "weight_decay": 0.06436845335086991,
  "warmup_ratio": 0.10859899755289561,
  "per_device_train_batch_size": 32
}
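These values map directly onto Hugging Face TrainingArguments; a sketch assuming a Trainer-based setup (output_dir and any unlisted settings are assumptions):

```python
from transformers import TrainingArguments

# Sketch only: output_dir is illustrative; all tuned values come from the JSON above.
training_args = TrainingArguments(
    output_dir="./klue-roberta-news-sentiment",
    learning_rate=9.78310992630157e-06,
    num_train_epochs=8,
    weight_decay=0.06436845335086991,
    warmup_ratio=0.10859899755289561,
    per_device_train_batch_size=32,
)
```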

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# ๋ชจ๋ธ ๋กœ๋“œ
model_name = "FISA-conclave/klue-roberta-news-sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Predict
text = "์‚ผ์„ฑ์ „์ž์˜ 3๋ถ„๊ธฐ ์‹ค์ ์ด ์‹œ์žฅ ์˜ˆ์ƒ์„ ํฌ๊ฒŒ ์ƒํšŒํ–ˆ๋‹ค."  # "Samsung Electronics' Q3 results far exceeded market expectations."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)[0]
    pred = torch.argmax(probs).item()

labels = {0: "negative", 1: "neutral", 2: "positive"}
print(f"Prediction: {labels[pred]} ({probs[pred]:.2%})")
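The logits-to-label step in the snippet above can be illustrated without downloading the model, using made-up logits (the values here are invented for illustration):

```python
import torch

ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}

# Made-up logits for a single input; the fine-tuned model produces these from text.
logits = torch.tensor([[-1.2, 0.3, 2.1]])
probs = torch.softmax(logits, dim=1)[0]  # probabilities over the 3 classes, summing to 1
pred = torch.argmax(probs).item()        # index of the largest logit

print(f"Prediction: {ID2LABEL[pred]} ({probs[pred]:.2%})")
```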

ํ•™์Šต ๋ฐ์ดํ„ฐ

  • ์ด ์ƒ˜ํ”Œ ์ˆ˜: 9,992๊ฐœ
  • ์ถœ์ฒ˜:
    • finance_sentiment_corpus
    • korfin-asc
    • twice_kr_fin

๋ฒ ์ด์Šค ๋ชจ๋ธ

Citation

@misc{klue-roberta-news-sentiment,
  author = {Tobykim},
  title = {KLUE-RoBERTa News Sentiment Analysis},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/FISA-conclave/klue-roberta-news-sentiment}}
}

๋ผ์ด์„ผ์Šค

Apache 2.0
