---

language: ko
license: mit
tags:
  - pytorch
  - bert
  - kobert
  - text-classification
  - stance-detection
  - korean
  - news
  - political
datasets:
  - custom
metrics:
  - accuracy
  - f1
model-index:
  - name: stance-classifier-v2
    results:
      - task:
          type: text-classification
          name: Stance Classification
        metrics:
          - type: accuracy
            value: 73.93
            name: Test Accuracy
          - type: f1
            value: 0.7395
            name: Test F1
---


# Korean Political News Stance Classifier v2

A KoBERT-based stance classification model for Korean political news.

## Model Description

- **Base Model**: monologg/kobert
- **Task**: 3-class stance classification: 옹호 (support), 중립 (neutral), 비판 (oppose)
- **Language**: Korean
- **Training Data**: ~12,000 labeled political news articles

## Performance

| Metric | Score |
|--------|-------|
| Test Accuracy | 73.93% |
| Test F1 (macro) | 0.7395 |
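
For readers unfamiliar with the metric, the sketch below shows how a macro-averaged F1 is conventionally computed: the per-class F1 scores are averaged with equal weight, regardless of class frequency. It assumes the reported score follows this standard definition; the evaluation script is not part of this card, and the toy labels and the `macro_f1` helper are hypothetical.

```python
def macro_f1(y_true, y_pred, num_classes=3):
    """Unweighted mean of per-class F1 scores (standard macro average)."""
    f1_scores = []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / num_classes


# Hypothetical toy data using the label IDs 0, 1, 2 from this card.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.4f}")
```

Because each class counts equally, a macro score can differ noticeably from accuracy when the classes are imbalanced.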

## Labels

| Label ID | Korean | English | Description |
|----------|--------|---------|-------------|
| 0 | 옹호 | support | Favorable to the government/ruling party |
| 1 | 중립 | neutral | Objective reporting of facts |
| 2 | 비판 | oppose | Critical of the government/ruling party |
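
The table above can be turned into a small lookup for post-processing model outputs. The following is a minimal sketch in plain Python, assuming the model returns three raw logits in label-ID order; `ID2LABEL` and `decode_logits` are hypothetical names, not part of the released checkpoint.

```python
import math

# Label IDs as listed in the table above.
ID2LABEL = {0: "support", 1: "neutral", 2: "oppose"}


def decode_logits(logits):
    """Map raw logits to (English label, softmax probability)."""
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    pred = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[pred], probs[pred]


# Hypothetical logits for a single article.
label, prob = decode_logits([2.0, 0.5, -1.0])
print(f"{label} ({prob * 100:.1f}%)")
```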

## Usage

```python
import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer, BertModel


# Model definition: KoBERT encoder followed by dropout and a linear classification head
class StanceClassifier(nn.Module):
    def __init__(self, bert_model, num_classes=3, dropout_rate=0.3):
        super().__init__()
        self.bert = bert_model
        self.dropout = nn.Dropout(dropout_rate)
        self.classifier = nn.Linear(768, num_classes)

    def forward(self, input_ids, attention_mask, token_type_ids=None):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids)
        pooled_output = outputs.pooler_output
        pooled_output = self.dropout(pooled_output)
        return self.classifier(pooled_output)


# Load the model weights
model_path = hf_hub_download(repo_id="gaaahee/stance-classifier-v2", filename="pytorch_model.pt")
checkpoint = torch.load(model_path, map_location='cpu')

bert_model = BertModel.from_pretrained('monologg/kobert')
model = StanceClassifier(bert_model)
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained('monologg/kobert', trust_remote_code=True)

# Predict
# Input: "The government's new policy is expected to contribute greatly to economic growth"
text = "정부의 새 정책이 경제 성장에 크게 기여할 것으로 기대된다"
encoding = tokenizer(text, truncation=True, max_length=512, padding='max_length', return_tensors='pt')

with torch.no_grad():
    logits = model(encoding['input_ids'], encoding['attention_mask'])
    probs = torch.softmax(logits, dim=1)
    pred = torch.argmax(probs, dim=1).item()

labels = ['옹호', '중립', '비판']  # support, neutral, oppose
print(f"Prediction: {labels[pred]} ({probs[0][pred].item()*100:.1f}%)")
```

## Training Details

| Parameter | Value |
|-----------|-------|
| Base Model | monologg/kobert |
| Max Length | 512 |
| Batch Size | 64 |
| Learning Rate | 2e-05 |
| Dropout | 0.3 |
| Loss Function | Focal Loss (gamma=2.0) |
| Early Stopping | patience=3 |
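
The focal loss listed above down-weights examples the model already classifies confidently. For the true-class probability `p_t`, the standard form is `FL = -(1 - p_t)^gamma * log(p_t)`. The sketch below illustrates this for a single example in plain Python. It assumes no per-class `alpha` weighting, since the card does not mention one, and the function names are hypothetical rather than taken from the training code.

```python
import math


def cross_entropy(probs, target):
    """Standard cross-entropy for one example given class probabilities."""
    return -math.log(probs[target])


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one example; gamma=2.0 matches the table above.

    Assumption: no alpha class weighting is applied.
    """
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Hypothetical probabilities: a confident ("easy") and an uncertain ("hard") example.
easy = [0.9, 0.05, 0.05]
hard = [0.3, 0.4, 0.3]

for name, probs in [("easy", easy), ("hard", hard)]:
    ce = cross_entropy(probs, 0)
    fl = focal_loss(probs, 0)
    print(f"{name}: cross-entropy={ce:.4f}, focal={fl:.4f}")
```

With `gamma=0` the expression reduces to ordinary cross-entropy; larger values shift more of the training signal toward misclassified or uncertain articles.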

## Citation

```bibtex
@misc{korean-stance-classifier-v2,
  title={Korean Political News Stance Classifier v2},
  year={2024},
  publisher={HuggingFace}
}
```