---
license: mit
base_model: klue/bert-base
tags:
- bert
- lora
- korean
- text-classification
- sentiment-analysis
language:
- ko
datasets:
- nsmc
---
# NSMC Sentiment Analysis (LoRA Fine-tuned)
This is a sentiment analysis model fine-tuned with LoRA on the NSMC (Naver Sentiment Movie Corpus) dataset.
## Model Information
- **Base model:** klue/bert-base
- **Fine-tuning method:** LoRA (Low-Rank Adaptation)
- **Trainable parameters:** about 0.3% of the model (~300K)
- **Dataset:** NSMC (Naver movie reviews)
- **Task:** binary classification (positive/negative)
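
The "~300K trainable parameters" figure can be sanity-checked with a little arithmetic. The sketch below uses the LoRA rank (8) and target modules (query, value) listed later in this README. The hidden size of 768, the 12 encoder layers, the rough 110M total parameter count, and the idea that the classification head is trained alongside the adapter are assumptions about a BERT-base style model, not values stated by the author.

```python
# Assumed architecture values for a BERT-base style encoder (not stated in this README).
HIDDEN_SIZE = 768
NUM_LAYERS = 12
NUM_LABELS = 2
BASE_PARAMS_APPROX = 110_000_000  # rough assumption for the full model size

# LoRA settings taken from this README.
RANK = 8
TARGET_MODULES = ("query", "value")


def lora_params_per_linear(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in_features) and B (out_features x rank) per adapted layer."""
    return rank * in_features + out_features * rank


def count_lora_params() -> int:
    """Adapter weights summed over every targeted projection in every layer."""
    per_module = lora_params_per_linear(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
    return per_module * len(TARGET_MODULES) * NUM_LAYERS


def classifier_head_params() -> int:
    """Weights plus biases of the final linear classification layer."""
    return HIDDEN_SIZE * NUM_LABELS + NUM_LABELS


lora = count_lora_params()
head = classifier_head_params()
total = lora + head
print(f"LoRA adapter parameters: {lora:,}")
print(f"Classifier head parameters: {head:,}")
print(f"Share of the base model: {total / BASE_PARAMS_APPROX:.2%}")
```

Under these assumptions the adapter alone accounts for 294,912 weights, which is consistent with the rounded figure above.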
## Usage
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model
base_model = AutoModelForSequenceClassification.from_pretrained(
    "klue/bert-base",
    num_labels=2,
)

# Load the LoRA adapter
model = PeftModel.from_pretrained(base_model, "JINIIII/nsmc-sentiment-lora")
tokenizer = AutoTokenizer.from_pretrained("JINIIII/nsmc-sentiment-lora")
model.eval()  # disable dropout for inference

# Inference
text = "이 영화 정말 재미있어요!"  # "This movie is really fun!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    outputs = model(**inputs)
probs = torch.softmax(outputs.logits, dim=-1)
pred = torch.argmax(probs, dim=-1).item()
label = "positive" if pred == 1 else "negative"
confidence = probs[0][pred].item()
print(f"Result: {label} (confidence: {confidence:.2%})")
```
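
The post-processing at the end of the snippet above (softmax, argmax, label lookup) can be illustrated without loading any model. This is a minimal sketch in plain Python: the `decode` helper and the example logits are hypothetical, and the only thing taken from this README is the mapping of index 1 to the positive class.

```python
import math

# Label mapping used by the usage snippet in this README: index 1 is positive.
ID2LABEL = {0: "negative", 1: "positive"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, confidence) for one example's two-class logits."""
    probs = softmax(logits)
    pred = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return ID2LABEL[pred], probs[pred]


# Hypothetical logits, chosen only for illustration.
label, confidence = decode([-1.2, 2.3])
print(f"Result: {label} (confidence: {confidence:.2%})")
```

The reported confidence is simply the softmax probability of the predicted class, so it should be read as a relative score and not as a calibrated probability.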
## Training Details
- **LoRA Rank (r):** 8
- **LoRA Alpha:** 16
- **Target Modules:** query, value
- **Dropout:** 0.1
- **Epochs:** 5
- **Learning rate:** 5e-4
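
As a rough sketch, the hyperparameters above could be collected as keyword arguments like this. The key names follow the `peft.LoraConfig` and `transformers.TrainingArguments` parameter names, but the author's actual training script is not shown in this README, so the `task_type` value and the use of those two classes are assumptions.

```python
# Hyperparameters as listed in this README.
lora_kwargs = {
    "r": 8,
    "lora_alpha": 16,
    "target_modules": ["query", "value"],
    "lora_dropout": 0.1,
    "task_type": "SEQ_CLS",  # assumption: sequence classification, matching the usage snippet
}

training_kwargs = {
    "num_train_epochs": 5,
    "learning_rate": 5e-4,
}

# Standard LoRA scales the low-rank update by lora_alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(f"LoRA scaling factor: {scaling}")

# With peft installed, the dict could be passed on like this (not executed here):
#   from peft import LoraConfig, get_peft_model
#   peft_model = get_peft_model(base_model, LoraConfig(**lora_kwargs))
```

With rank 8 and alpha 16, the adapter output is multiplied by 2 before being added to the frozen projection.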
## Performance
- **Training data:** 10,000 examples
- **Evaluation data:** 2,000 examples
- **Evaluation accuracy:** ~85-90%
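
For reference, accuracy here means the fraction of evaluation examples whose predicted label matches the gold label. The helper below is a generic sketch, not the author's evaluation code, and the toy predictions are invented for illustration. The final loop only converts the reported range into counts on a 2,000-example set.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy data, invented for illustration only.
preds = [1, 0, 1, 1, 0]
gold = [1, 0, 0, 1, 0]
print(f"Toy accuracy: {accuracy(preds, gold):.0%}")

# What the reported range means in counts on a 2,000-example evaluation set.
EVAL_SIZE = 2000
for acc in (0.85, 0.90):
    print(f"{acc:.0%} of {EVAL_SIZE:,} = {round(acc * EVAL_SIZE):,} correct")
```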
## Use Cases
- Movie review sentiment analysis
- Product review analysis
- Social media sentiment monitoring
- Automatic classification of customer feedback
## Limitations
- Specialized for the movie review domain
- Performs best on short texts
- Accuracy is highest on strongly expressed sentiment
## License
MIT License
## Author
JINIIII
## Citation
```bibtex
@misc{nsmc-sentiment-lora,
  author = {JINIIII},
  title = {NSMC Sentiment Analysis with LoRA},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/JINIIII/nsmc-sentiment-lora}
}
```
**Note**: This model was created for educational purposes.