---
license: mit
language:
- ko
- en
base_model:
- klue/bert-base
---
# LQ-KBERT-Base: Crypto Market Korean Sentiment & Action Signal Classifier

This is a **Korean investor-sentiment classification model for financial community posts and news**, released by [LangQuant](https://langquant.com), a crypto-asset AI Agent data analysis platform.  
It uses `klue/bert-base` as its backbone and was fine-tuned on **more than 100,000** preprocessed Korean samples related to crypto assets.  
For sentence-level input (`≤200 characters`), the model simultaneously predicts **investor sentiment, action, emotion, certainty, relevance, and toxicity**.

- [GitHub](https://github.com/LangQuant/LQ-KBERT-Base)
---
### The model output is as follows.

```json
{
  "sentiment_strength": "strong_pos | weak_pos | neutral | weak_neg | strong_neg",
  "action_signal": "buy | hold | sell | avoid | info_only | ask_info",
  "emotions": ["greed","fear","confidence","doubt","anger","hope","sarcasm"],
  "certainty": 0.0 ~ 1.0,
  "relevance": 0.0 ~ 1.0,
  "toxicity": 0.0 ~ 1.0
}
```
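
The schema above can be checked mechanically. The following is a minimal sketch, not part of the released package: `validate_prediction` is a hypothetical helper name, and the allowed values are copied from the schema. It returns a list of problems, so an empty list means the dictionary matches.

```python
SENTIMENTS = {"strong_pos", "weak_pos", "neutral", "weak_neg", "strong_neg"}
ACTIONS = {"buy", "hold", "sell", "avoid", "info_only", "ask_info"}
EMOTIONS = {"greed", "fear", "confidence", "doubt", "anger", "hope", "sarcasm"}
SCORES = ("certainty", "relevance", "toxicity")


def validate_prediction(pred: dict) -> list:
    """Hypothetical helper: list every way `pred` deviates from the schema."""
    problems = []
    if pred.get("sentiment_strength") not in SENTIMENTS:
        problems.append("sentiment_strength is not one of the five classes")
    if pred.get("action_signal") not in ACTIONS:
        problems.append("action_signal is not one of the six classes")
    # Emotions are multi-label, so any subset of EMOTIONS (including none) is valid.
    unknown = set(pred.get("emotions", [])) - EMOTIONS
    if unknown:
        problems.append(f"unknown emotions: {sorted(unknown)}")
    for key in SCORES:
        value = pred.get(key)
        if not isinstance(value, (int, float)) or not 0.0 <= value <= 1.0:
            problems.append(f"{key} must be a number in [0, 1]")
    return problems
```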
---
## Labeling Guidelines

### Sentiment Strength
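- **strong_pos**: conviction of a sharp rally, e.g. `"가즈아"` ("let's go!"), `"무조건 간다"` ("it's going up no matter what").
- **weak_pos**: cautious optimism, e.g. `"반등 가능"` ("a rebound is possible"), `"괜찮을 듯"` ("looks okay").
- **neutral**: plain information, announcements, or small talk.
- **weak_neg**: mild negativity, e.g. `"조정 올 듯"` ("a correction seems likely"), `"관망"` ("wait and see").
- **strong_neg**: crash or panic, e.g. `"나락"` ("rock bottom"), `"망함"` ("it's ruined"), `"해킹/제재"` ("hack/sanctions").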

### Action Signal
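- **buy**: buy or entry directive, e.g. `"지금 산다"` ("buying now"), `"롱"` ("long").
- **hold**: keep holding or wait and see, e.g. `"존버"` (slang for holding on no matter what), `"유지"` ("maintain").
- **sell**: sell or liquidate, e.g. `"익절"` ("take profit"), `"손절"` ("cut losses"), `"정리"` ("close out").
- **avoid**: avoidance or risk warning, e.g. `"가지마"` ("don't go in"), `"스캠"` ("scam"), `"위험"` ("dangerous").
- **info_only**: plain information delivery (news/announcements).
- **ask_info**: question or exploration, e.g. `"들어가도 돼?"` ("is it okay to get in?"), `"왜 떨어져?"` ("why is it dropping?").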

### Emotions (multi-label)
- **greed** (탐욕)  
- **fear** (두려움)  
- **confidence** (확신)  
- **doubt** (의심)  
- **anger** (분노)  
- **hope** (희망)  
- **sarcasm** (풍자)  

### Certainty
- **0.2~0.4**: questions, exploration, memes (low)  
- **0.4~0.6**: hedged opinions (medium)  
- **0.6~0.8**: figures, evidence, official tone (high)  
- **0.8~1.0**: strong assertions or directives (very high)  
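
To make the bands above concrete, here is a minimal sketch that maps a certainty score to its band. `certainty_band` is a hypothetical helper, not part of the model. The guideline ranges share endpoints, so assigning each boundary value to the higher band is an assumption, and scores below 0.2 are reported as `"unspecified"` because the guidelines do not describe that range.

```python
def certainty_band(score: float) -> str:
    """Hypothetical helper: name the labeling-guideline band for a certainty score."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("certainty must be in [0, 1]")
    if score >= 0.8:
        return "very_high"   # strong assertions or directives
    if score >= 0.6:
        return "high"        # figures, evidence, official tone
    if score >= 0.4:
        return "medium"      # hedged opinions
    if score >= 0.2:
        return "low"         # questions, exploration, memes
    return "unspecified"     # the guidelines give no band below 0.2
```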

### Relevance
- **0.7~1.0**: directly related to investing or the market  
- **0.4~0.7**: indirectly related (industry, people, technology)  
- **0.0~0.3**: unrelated, small talk, memes  

### Toxicity
- Scored **0~1** according to the intensity of profanity, insults, and disparagement.  
- Evaluated independently of any investment meaning.  
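
Because relevance and toxicity are scored independently of sentiment, one plausible use is to filter posts before aggregating signals. The sketch below assumes the `pred_relevance` and `pred_toxicity` keys produced by the usage example in this card. `keep_for_analysis` is a hypothetical helper, the 0.7 relevance cutoff follows the "directly related" band, and the 0.5 toxicity cutoff is an arbitrary assumption.

```python
def keep_for_analysis(pred, min_relevance=0.7, max_toxicity=0.5):
    """Hypothetical filter: True if a post is market-relevant and not too toxic."""
    return (
        pred["pred_relevance"] >= min_relevance
        and pred["pred_toxicity"] <= max_toxicity
    )


# Scores below are invented for illustration; they are not model output.
samples = [
    {"pred_relevance": 0.92, "pred_toxicity": 0.05},  # relevant and clean
    {"pred_relevance": 0.15, "pred_toxicity": 0.02},  # off-topic chatter
    {"pred_relevance": 0.88, "pred_toxicity": 0.81},  # relevant but abusive
]
kept = [s for s in samples if keep_for_analysis(s)]
```

Only the first sample passes both thresholds; tune both cutoffs to your own data.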

---

## Sentiment Strength vs Action Signal

- **Sentiment Strength**  
  - Strength of investor sentiment (positive ↔ negative).  
  - Focuses on the tone of the price outlook.  

- **Action Signal**  
  - The actual intent or directive to take an investment action.  
  - Buy / sell / hold / avoid / question / information.  
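
Since the two heads are separate axes, it can help to look at them jointly instead of collapsing them into one score. A minimal sketch, assuming the `pred_sentiment_strength` and `pred_action_signal` keys from the usage example in this card; `cross_tab` is a hypothetical helper.

```python
from collections import Counter


def cross_tab(preds):
    """Hypothetical helper: count how often each (sentiment, action) pair occurs."""
    return Counter(
        (p["pred_sentiment_strength"], p["pred_action_signal"]) for p in preds
    )


# Label pairs taken from the Examples table in this card, with one pair repeated.
preds = [
    {"pred_sentiment_strength": "strong_pos", "pred_action_signal": "buy"},
    {"pred_sentiment_strength": "weak_neg", "pred_action_signal": "hold"},
    {"pred_sentiment_strength": "weak_pos", "pred_action_signal": "ask_info"},
    {"pred_sentiment_strength": "weak_neg", "pred_action_signal": "hold"},
]
counts = cross_tab(preds)
print(counts[("weak_neg", "hold")])  # 2
```

A pair such as `("weak_neg", "hold")` shows why the axes are kept apart: the tone is negative, yet the stated action is to keep the position.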


---
## How to use the model
```python
import json

import torch
from transformers import AutoModel, AutoTokenizer

repo_or_dir = "LangQuant/LQ-Kbert-base"
texts = [
    "비트코인 조정 후 반등, 투자심리 개선",  # Bitcoin rebounds after a correction; sentiment improves
    "환율 급등에 증시 변동성 확대",  # Stock-market volatility widens as the exchange rate surges
    "비트 그만 좀 내려라 진짜..",  # Seriously, BTC, stop dropping already..
    "폭락ㅠㅠㅜㅠㅜ 다 팔아야할까요?",  # It's crashing (crying); should I sell everything?
]

# trust_remote_code=True runs the custom model class shipped in the repository.
tokenizer = AutoTokenizer.from_pretrained(repo_or_dir)
model = AutoModel.from_pretrained(repo_or_dir, trust_remote_code=True)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device).eval()

# Tokenize as one padded batch; max_length truncates at 200 tokens.
enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=200).to(device)
with torch.inference_mode():
    out = model(**enc)

# Index-to-label maps for the two classification heads and the emotion head.
IDX2SENTI = {0: "strong_pos", 1: "weak_pos", 2: "neutral", 3: "weak_neg", 4: "strong_neg"}
IDX2ACT = {0: "buy", 1: "hold", 2: "sell", 3: "avoid", 4: "info_only", 5: "ask_info"}
EMO_LIST = ["greed", "fear", "confidence", "doubt", "anger", "hope", "sarcasm"]

for i, t in enumerate(texts):
    # Single-label heads: take the highest-scoring class.
    senti = int(out["logits_senti"][i].argmax().item())
    act = int(out["logits_act"][i].argmax().item())
    # Multi-label head: independent sigmoid per emotion, kept at p >= 0.5.
    emo_p = torch.sigmoid(out["logits_emo"][i]).tolist()
    emos = [EMO_LIST[j] for j, p in enumerate(emo_p) if p >= 0.5]
    # Regression head: certainty, relevance, toxicity, clamped to [0, 1].
    reg = torch.clamp(out["pred_reg"][i], 0, 1).tolist()

    result = {
        "text": t,
        "pred_sentiment_strength": IDX2SENTI[senti],
        "pred_action_signal": IDX2ACT[act],
        "pred_emotions": emos,
        "pred_certainty": float(reg[0]),
        "pred_relevance": float(reg[1]),
        "pred_toxicity": float(reg[2]),
    }
    # ensure_ascii=False keeps the Korean text readable in the printed JSON.
    print(json.dumps(result, ensure_ascii=False))
```
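
The emotion head in the example above is multi-label: each emotion gets its own sigmoid probability and is kept when it reaches 0.5. The sketch below reproduces that decoding step in plain Python so the thresholding can be inspected without loading the model. `decode_emotions` is a hypothetical helper, and the logits in the usage lines are invented for illustration.

```python
import math

EMO_LIST = ["greed", "fear", "confidence", "doubt", "anger", "hope", "sarcasm"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_emotions(logits, threshold=0.5):
    """Hypothetical helper: keep every emotion whose probability reaches the threshold."""
    return [name for name, z in zip(EMO_LIST, logits) if sigmoid(z) >= threshold]


# Invented logits, one per emotion in EMO_LIST order (not real model output).
logits = [2.0, -1.0, 0.0, -3.0, -0.5, 1.2, -2.0]
default_emotions = decode_emotions(logits)                # threshold 0.5
strict_emotions = decode_emotions(logits, threshold=0.8)  # stricter cutoff
```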
---

### Examples

| Sentence | sentiment_strength | action_signal | Interpretation |
|------|--------------------|---------------|------|
| "개떡상이여" ("it's mooning hard") | strong_pos | buy | Strong conviction of a rise + intent to buy immediately |
| "여기선 관망이 맞다" ("waiting on the sidelines is the right call here") | weak_neg | hold | Negative, but chooses to keep holding |
| "들어가도 될까?" ("would it be okay to get in?") | weak_pos | ask_info | Cautious optimism, exploratory question about buying |
| "해킹 터짐, 비상. 접근 금지" ("hack just hit, emergency. stay away") | strong_neg | avoid | Strong negativity + advice to avoid |
| "업데이트 공지 나왔습니다" ("the update announcement is out") | neutral | info_only | Plain information, no action |

---
### Citation
```
@misc{langquant2025lkbert,
  title  = {LQ-KBERT-Base: Crypto Market Korean Sentiment \& Action Signal Classifier},
  author = {LangQuant},
  year   = {2025},
  url    = {https://huggingface.co/langquant/LQ-Kbert-base}
}
```
---
### Disclaimer
```
This model is provided for academic research and experimental purposes only.
The model's output cannot be regarded as financial or investment advice,
and LangQuant accepts no responsibility for any resulting outcomes.
```