---
language:
- ko
library_name: transformers
license: apache-2.0
metrics:
- f1
pipeline_tag: text-classification
---
# roberta-base-infringement-detect
## Model Details
### Model Description
This model uses [klue/roberta-base](https://huggingface.co/klue/roberta-base) to determine whether two pieces of content are similar.
## Train
Training used 1,310 in-house pairs of content confirmed to be similar. After shuffling, a dataset with a true/false ratio of 1:2 was generated from these pairs and used for training.
The remaining training parameters are as follows.
| Parameter | Value |
| ------------------ | ----- |
| `train_batch_size` | 16 |
| `num_train_epochs` | 5 |
| `weight_decay` | 0.01 |
| `learning_rate` | 2e-5 |
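
The training setup in this section can be sketched roughly as follows. This is a minimal illustration, not the author's actual script: `build_pairs`, `build_training_args`, the label convention (1 = similar, 0 = not similar), the negative-sampling strategy (pairing each original with candidates drawn from other pairs), and the output directory are all assumptions. Mapping `train_batch_size` to `per_device_train_batch_size` also assumes single-device training.

```python
import random

# Values taken from the table above. The key name for the batch size is an
# assumption: the table says `train_batch_size`, TrainingArguments expects
# `per_device_train_batch_size`.
HYPERPARAMS = {
    "per_device_train_batch_size": 16,
    "num_train_epochs": 5,
    "weight_decay": 0.01,
    "learning_rate": 2e-5,
}


def build_pairs(positive_pairs, negatives_per_positive=2, seed=0):
    """Return shuffled (original, candidate, label) triples at a 1:N true/false ratio.

    Hypothetical helper: negatives are made by pairing each original with
    candidates that belong to *other* positive pairs.
    """
    rng = random.Random(seed)
    candidates = [cand for _, cand in positive_pairs]
    examples = []
    for idx, (orig, cand) in enumerate(positive_pairs):
        examples.append((orig, cand, 1))  # assumed label: 1 = similar
        other_indices = [j for j in range(len(positive_pairs)) if j != idx]
        for j in rng.sample(other_indices, negatives_per_positive):
            examples.append((orig, candidates[j], 0))  # assumed label: 0 = not similar
    rng.shuffle(examples)
    return examples


def build_training_args(output_dir="./infringement-detect-out"):
    """Wrap the table's values in TrainingArguments (output_dir is a placeholder)."""
    # Imported lazily so the rest of the sketch runs without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)


# Toy data standing in for the 1,310 in-house pairs.
toy_pairs = [(f"original {i}", f"copy {i}") for i in range(5)]
dataset = build_pairs(toy_pairs)
print(len(dataset), sum(label for _, _, label in dataset))  # 15 examples, 5 positive
```

With the real 1,310 positive pairs, the same procedure would yield 3,930 examples.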
## How to use
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
model_name = "kms7530/roberta-base-infringement-detect"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
For inference, the input to the model must be formatted as follows.
```plain
[CLS]\
[unused0]<ORIGINAL_CONTENT_TITLE>\
[unused1]<ORIGINAL_CONTENT>[SEP] \
[unused0]<TEST_CONTENT_TITLE>\
[unused1]<TEST_CONTENT>[SEP]
```
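
A minimal sketch of assembling that input and running it through the model is shown below. `build_input` and `predict` are hypothetical helpers, not part of the published model. The sketch assumes the trailing backslashes in the template are line continuations, so the fields are joined into one string. It also assumes `add_special_tokens=False` is appropriate because the string already contains `[CLS]` and `[SEP]`. The model card does not state which class index means "similar", so `predict` returns only the raw index.

```python
def build_input(orig_title, orig_content, test_title, test_content):
    """Join the four fields into the single-line format shown above."""
    original = f"[unused0]{orig_title}[unused1]{orig_content}[SEP]"
    candidate = f"[unused0]{test_title}[unused1]{test_content}[SEP]"
    # The space after the first [SEP] mirrors the template.
    return f"[CLS]{original} {candidate}"


def predict(model, tokenizer, text):
    """Return the arg-max class index for one formatted input string."""
    # Imported lazily so build_input can be used without torch installed.
    import torch

    # Assumption: the text already carries [CLS]/[SEP], so the tokenizer
    # should not add them a second time.
    inputs = tokenizer(text, add_special_tokens=False, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1).item())


text = build_input("Title A", "Body A", "Title B", "Body B")
print(text)
```

With the `model` and `tokenizer` loaded as in the snippet above, `predict(model, tokenizer, text)` would return the predicted class index.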