Model Card for koelectra-base-news-classification

A fine-tuned KoELECTRA model that classifies Korean news text, intended to support news summarization.

Model Details

  • Base model: monologg/koelectra-base-discriminator
  • Task: Text Classification (News polarity)
  • Language: Korean
  • Number of labels: 2 (LABEL_0 = negative (부정), LABEL_1 = positive (긍정))
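
The label mapping above can be applied to pipeline output with a small helper. This is a minimal sketch: `LABEL_NAMES` and `readable` are hypothetical names introduced here, and the sample predictions are made-up values in the shape that a `text-classification` pipeline returns, not real model output.

```python
# Mapping copied from the "Number of labels" entry in Model Details.
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "positive"}


def readable(predictions):
    """Return copies of pipeline-style predictions with a readable label added.

    Unknown labels fall back to the raw label string.
    """
    return [
        {**pred, "label_name": LABEL_NAMES.get(pred["label"], pred["label"])}
        for pred in predictions
    ]


# Hypothetical predictions (scores are invented for illustration only).
sample = [
    {"label": "LABEL_1", "score": 0.91},
    {"label": "LABEL_0", "score": 0.77},
]

for row in readable(sample):
    print(row["label_name"], row["score"])
```

Keeping the mapping outside the model leaves the raw `LABEL_0` / `LABEL_1` strings untouched, so downstream code that already depends on them keeps working.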

Training

Usage

from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
import torch

# Use the first GPU if one is available, otherwise run on CPU (-1).
device = 0 if torch.cuda.is_available() else -1
model = AutoModelForSequenceClassification.from_pretrained("jxchlee/koelectra-base-news-summerization2")
tokenizer = AutoTokenizer.from_pretrained("jxchlee/koelectra-base-news-summerization2")

nlp = pipeline("text-classification", model=model, tokenizer=tokenizer, device=device)

# Classify a single sentence ("Will this model perform well?").
result = nlp("이 모델은 성능이 좋을까?")
print(result)

long_text = '''
์ „๋ผ๋‚จ๋„๊ฐ€ ์Œ€ ๊ณผ์ž‰๋ฌธ์ œ๋ฅผ ๊ทผ๋ณธ์ ์œผ๋กœ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด ์˜ฌํ•ด๋ถ€ํ„ฐ ์‹œํ–‰ํ•˜๋Š” ์Œ€ ์ƒ์‚ฐ์กฐ์ •์ œ๋ฅผ ์ ๊ทน ์ถ”์ง„ํ‚ค๋กœ ํ–ˆ๋‹ค.

์Œ€ ์ƒ์‚ฐ์กฐ์ •์ œ๋Š” ๋ฒผ๋ฅผ ์‹ฌ์—ˆ๋˜ ๋…ผ์— ๋ฒผ ๋Œ€์‹  ์‚ฌ๋ฃŒ์ž‘๋ฌผ์ด๋‚˜ ์ฝฉ ๋“ฑ ๋‹ค๋ฅธ ์ž‘๋ฌผ์„ ์‹ฌ์œผ๋ฉด ๋ฒผ์™€์˜ ์ผ์ • ์†Œ๋“์ฐจ๋ฅผ ๋ณด์ „ํ•ด์ฃผ๋Š” ์ œ๋„๋‹ค.

์˜ฌํ•ด ์ „๋‚จ์˜ ๋…ผ ๋‹ค๋ฅธ ์ž‘๋ฌผ ์žฌ๋ฐฐ ๊ณ„ํš๋ฉด์ ์€ ์ „๊ตญ 5๋งŒha์˜ ์•ฝ 21%์ธ 1๋งŒ 698ha๋กœ, ์„ธ๋ถ€์‹œํ–‰์ง€์นจ์„ ํ™•์ •, ์‹œ๊ตฐ์— ํ†ต๋ณดํ–ˆ๋‹ค.

์ง€์› ๋Œ€์ƒ ์ž‘๋ฌผ์€ 1๋…„์ƒ์„ ํฌํ•จํ•œ ๋‹ค๋…„์ƒ์˜ ๋ชจ๋“  ์ž‘๋ฌผ์ด ํ•ด๋‹น๋˜๋‚˜ ์žฌ๋ฐฐ ๋ฉด์  ํ™•๋Œ€ ์‹œ ์ˆ˜๊ธ‰๊ณผ์ž‰์ด ์šฐ๋ ค๋˜๋Š” ๊ณ ์ถ”, ๋ฌด, ๋ฐฐ์ถ”, ์ธ์‚ผ, ๋Œ€ํŒŒ ๋“ฑ ์ˆ˜๊ธ‰ ๋ถˆ์•ˆ ํ’ˆ๋ชฉ์€ ์ œ์™ธ๋œ๋‹ค.

๋†์ง€์˜ ๊ฒฝ์šฐ๋„ ์ด๋ฏธ ๋‹ค๋ฅธ ์ž‘๋ฌผ ์žฌ๋ฐฐ ์˜๋ฌด๊ฐ€ ๋ถ€์—ฌ๋œ ๊ฐ„์ฒ™์ง€, ์ •๋ถ€๋งค์ž…๋น„์ถ•๋†์ง€, ๋†์ง„์ฒญ ์‹œ๋ฒ”์‚ฌ์—…, ๊ฒฝ๊ด€๋ณด์ „ ์ง๋ถˆ๊ธˆ ์ˆ˜๋ น ๋†์ง€ ๋“ฑ์€ ์ œ์™ธ๋  ์˜ˆ์ •์ด๋‹ค.
'''

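# Split the article into sentences with kss, then classify all sentences in one call.
import kss
sentences = kss.split_sentences(long_text)
# The pipeline accepts a list of strings and returns one prediction per sentence.
result2 = nlp(sentences)
print(result2)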
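
The per-sentence output is easier to inspect once each prediction is paired with its sentence. The sketch below groups sentences by predicted label. `group_by_label` is a hypothetical helper, and the demo sentences and scores are placeholders standing in for `sentences` and `result2`, not real model output.

```python
from collections import defaultdict


def group_by_label(sentences, predictions):
    """Group sentences by predicted label, keeping each sentence's score."""
    if len(sentences) != len(predictions):
        raise ValueError("expected exactly one prediction per sentence")
    groups = defaultdict(list)
    for sentence, pred in zip(sentences, predictions):
        groups[pred["label"]].append((sentence, pred["score"]))
    return dict(groups)


# Placeholder inputs; in practice pass `sentences` and `result2` from the usage code.
demo_sentences = ["sentence A", "sentence B", "sentence C"]
demo_predictions = [
    {"label": "LABEL_1", "score": 0.88},
    {"label": "LABEL_0", "score": 0.64},
    {"label": "LABEL_1", "score": 0.59},
]

grouped = group_by_label(demo_sentences, demo_predictions)
for label, items in grouped.items():
    print(label, [sentence for sentence, _ in items])
```

How the grouped sentences are used afterwards (for example, which label to keep for a summary) is not stated in this card, so that choice is left to the caller.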

Model size: 0.1B params (Safetensors), tensor type F32