---
pipeline_tag: summarization
language:
- ko
tags:
- T5
---

# t5-base-korean-summarization

This is a T5 model for Korean text summarization.

## Usage (HuggingFace Transformers)

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("summarization", model="t5-base-trained-model")
pipe("""๋ฏธ ํ•ญ๊ณต์šฐ์ฃผ๊ตญ(NASA)์ด 2014๋…„ ํ•œ๋ฐ˜๋„์˜ ๋ฐค์„ ์œ„์„ฑ์œผ๋กœ ์ดฌ์˜ํ•ด ํ™”์ œ๊ฐ€ ๋œ ์‚ฌ์ง„์ด ์žˆ๋‹ค.
๋น›์œผ๋กœ ๊ฝ‰ ์ฐฌ ํ•œ๊ตญ๊ณผ ๋‹ฌ๋ฆฌ, ๋ถํ•œ์—” ํ‰์–‘์—๋งŒ ๋ถˆ๋น›์ด ๋ณด์ผ ๋ฟ ์ปด์ปดํ•œ ์–ด๋‘ ์ด ๊ฐ€๋“ํ•˜๋‹ค. ์ด ์‚ฌ์ง„์€ ์ •ํ™•ํ•œ ํ†ต๊ณ„ ์ž๋ฃŒ๊ฐ€ ๋ถ€์กฑํ•œ ๋ถํ•œ ๊ฒฝ์ œ์˜ ์‹ค์ƒ์„ ์ง์ž‘๊ฒŒ ํ•˜๋Š” ๊ณ„๊ธฐ๊ฐ€ ๋๋‹ค.

์ด๋Ÿฐ ์œ„์„ฑ ์‚ฌ์ง„๊ณผ ๋”๋ถˆ์–ด ์ตœ๊ทผ์—” ์ธ๊ณต์ง€๋Šฅ(AI) ๊ธฐ์ˆ ๋กœ ๋ถํ•œ์„ ์ข€ ๋” ๊ฐ๊ด€์ ์œผ๋กœ ๋“ค์—ฌ๋‹ค๋ณด๋Š” ์—ฐ๊ตฌ๋“ค์ด ๋‚˜์˜ค๊ณ  ์žˆ๋‹ค.

์ง€๋‚œํ•ด ๋ง, ํ•œ๊ตญ ์นด์ด์ŠคํŠธ(KAIST)๋Š” ๊ธฐ์ดˆ๊ณผํ•™์—ฐ๊ตฌ์›, ์„œ๊ฐ•๋Œ€, ํ™์ฝฉ๊ณผ๊ธฐ๋Œ€, ์‹ฑ๊ฐ€ํฌ๋ฅด๊ตญ๋ฆฝ๋Œ€์™€ ์œ„์„ฑ์˜์ƒ์„ ํ™œ์šฉํ•ด ๋ถํ•œ์ฒ˜๋Ÿผ ๊ธฐ์ดˆ ๋ฐ์ดํ„ฐ๊ฐ€ ๋ถ€์กฑํ•œ ์ง€์—ญ์˜ ๊ฒฝ์ œ ์ƒํ™ฉ์„ ๋ถ„์„ํ•˜๋Š” AI ๊ธฐ๋ฒ•์„ ๊ฐœ๋ฐœํ–ˆ๋‹ค. ์ปดํ“จํ„ฐ ์‚ฌ์ด์–ธ์Šค, ๊ฒฝ์ œ, ์ง€๋ฆฌํ•™ ๋“ฑ ์ „๋ฌธ๊ฐ€ 10์—ฌ ๋ช…์ด ํž˜์„ ํ•ฉ์นœ ๊ฒƒ.

์—ฐ๊ตฌํŒ€์€ ํ•œ๊ตญ์˜ ์•„๋ฆฌ๋ž‘, ์œ ๋Ÿฝ์˜ ์„ผํ‹ฐ๋„ฌ ๋“ฑ ์ธ๊ณต์œ„์„ฑ ์˜์ƒ์„ ํ‰๊ท  0.23ใŽข ๋กœ ์„ธ๋ฐ€ํ•˜๊ฒŒ ๋‚˜๋ˆด๋‹ค. ๊ทธ๋ฆฌ๊ณ  ๊ตฌ์—ญ ์•ˆ์˜ ๊ฑด๋ฌผ๊ณผ ๋„๋กœ, ๋…น์ง€ ๋“ฑ์˜ ์‹œ๊ฐ ์ •๋ณด๋ฅผ ์ˆ˜์น˜ํ™”ํ•ด AI๊ฐ€ ๊ฒฝ์ œ ๋ฐœ์ „ ์ •๋„๋ฅผ ์ ์ˆ˜๋กœ ๋งค๊ธฐ๋„๋ก ํ–ˆ๋‹ค.

์ด๋ฅผ ํ†ตํ•ด ํŠน์ • ๊ธฐ๊ฐ„ ํ•ด๋‹น ์ง€์—ญ์—์„œ ์–ด๋А ์ •๋„์˜ ๋ณ€ํ™”๊ฐ€ ์žˆ์—ˆ๋Š”์ง€๋ฅผ ๋น„๊ตํ•˜๊ณ  ์•Œ ์ˆ˜ ์žˆ๋‹ค.

์—ฐ๊ตฌํŒ€์€ ์ด ๊ธฐ์ˆ ์„ ๋ถํ•œ์— ์ ์šฉํ•ด ๋ถ„์„ํ–ˆ๋‹ค.

์ฃผ์š” ์—ฐ๊ตฌ์ง„์œผ๋กœ ์ฐธ์—ฌํ•œ ๊น€์ง€ํฌ ์นด์ด์ŠคํŠธ ๊ต์ˆ˜๋Š” BBC ์ฝ”๋ฆฌ์•„์— "๋ถํ•œ์˜ ๊ฒฝ์šฐ์—” ๋Œ€๋ถ€๋ถ„์˜ ๋‚˜๋ผ์—” ์žˆ๋Š” ์†Œ๋“, ์ž์‚ฐ, ์ธ๊ตฌ ๋“ฑ์˜ ์ž๋ฃŒ๊ฐ€ ์ถฉ๋ถ„์น˜ ์•Š๊ธฐ์— ์ ˆ๋Œ€์  ๊ฒฝ์ œ์ง€ํ‘œ๊ฐ€ ๊ฑฐ์˜ ์—†๋‹ค"๋ฉฐ "์ƒ๋Œ€์ ์ธ ๋ฐœ์ „ ์ •๋„๋ผ๋„ ํ•œ๋ฒˆ ํŒŒ์•…ํ•ด ๋ณด๊ณ  ์‹ถ์—ˆ๋‹คโ€๊ณ  ์—ฐ๊ตฌ ๋ชฉ์ ์„ ์„ค๋ช…ํ–ˆ๋‹ค.

๊ทธ๋Ÿฌ๋ฉด์„œ "๊ทธ๋™์•ˆ ์œ„์„ฑ์‚ฌ์ง„์œผ๋กœ๋Š” (๋ณ€ํ™”๊ฐ€ ์žˆ์œผ๋ฆฌ๋ผ ์˜ˆ์ธก๋˜๋Š”) ์œ„์น˜๋ฅผ ์ž„์˜๋กœ ์„ ์ •ํ•˜๊ณ  ์ถ”์  ๊ฐ์‹œ๋ฅผ ํ–ˆ๋Š”๋ฐ, ๊ฐœ๋ฐœํ•œ AI ๋ชจ๋ธ์€ ์ „์ง€์—ญ์„ ๊ฐ์ง€ํ•  ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์— ๊ทธ๋Ÿฐ ๊ณผ์ • ์—†์ด ๋ถํ•œ ์ „์—ญ์„ ์„ธ๋ฐ€ํ•˜๊ฒŒ ๊ด€์ธกํ•  ์ˆ˜ ์žˆ๋‹ค" ๊ณ  ํ–ˆ๋‹ค.""")

```
RESULT >> [{'summary_text': 'ํ•œ๊ตญ ์นด์ด์ŠคํŠธ๋Š” ๊ธฐ์ดˆ๊ณผํ•™์—ฐ๊ตฌ์›๊ณผ ์„œ๊ฐ•๋Œ€ ํ™์ฝฉ๊ณผ๊ธฐ๋Œ€ ์‹ฑ๊ฐ€ํฌ๋ฅด๊ตญ๋ฆฝ๋Œ€์™€ ํ•จ๊ป˜ ์œ„์„ฑ ์˜์ƒ์„ ํ™œ์šฉํ•ด ๋ถํ•œ์ฒ˜๋Ÿผ ๊ธฐ์ดˆ ๋ฐ์ดํ„ฐ๊ฐ€ ๋ถ€์กฑํ•œ ์ง€์—ญ์˜ ๊ฒฝ์ œ ์ƒํ™ฉ์„ ๋ถ„์„ํ•˜๋Š” AI ๊ธฐ๋ฒ•์„ ๊ฐœ๋ฐœํ–ˆ๋‹ค.'}]
```
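For more control over tokenization and decoding, the same checkpoint can be driven directly. Below is a minimal sketch, assuming the checkpoint exposes the standard T5 seq2seq interface; the generation parameters are illustrative assumptions, not values from this card.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base-trained-model")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base-trained-model")

text = "์š”์•ฝํ•  ํ•œ๊ตญ์–ด ๊ธฐ์‚ฌ ๋ณธ๋ฌธ์„ ์—ฌ๊ธฐ์— ๋„ฃ๋Š”๋‹ค."  # the Korean article text to summarize

inputs = tokenizer(text, return_tensors="pt", max_length=512, truncation=True)
summary_ids = model.generate(
    **inputs,
    max_length=128,          # assumed upper bound on summary length
    num_beams=4,             # assumed beam-search width
    no_repeat_ngram_size=2,  # discourage repeated phrases
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```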

## Evaluation Results

| Dataset | Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
|---|---|---|---|---|---|---|---|
| csebuetnlp/xlsum | 8 | 1.051100 | 1.718005 | 18.211300 | 3.563200 | 18.000500 | 18.001100 |
| daekeun-ml/naver-news-summarization-ko | 8 | No log | 0.441769 | 50.047600 | 23.509700 | 49.730000 | 49.806500 |
| Korean multi-session dialogue | 8 | 1.072700 | 1.624539 | 7.749500 | 1.273900 | 7.744200 | 7.768000 |
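ROUGE scores like those above can be computed with the `evaluate` library. A minimal sketch with placeholder predictions and references follows; note that the default ROUGE tokenizer is whitespace-based, which is a rough fit for Korean.

```python
import evaluate

rouge = evaluate.load("rouge")

predictions = ["๋ชจ๋ธ์ด ์ƒ์„ฑํ•œ ์š”์•ฝ๋ฌธ์ž…๋‹ˆ๋‹ค."]       # placeholder model outputs
references = ["์‚ฌ๋žŒ์ด ์ž‘์„ฑํ•œ ์ •๋‹ต ์š”์•ฝ๋ฌธ์ž…๋‹ˆ๋‹ค."]  # placeholder gold summaries

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```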

## Training

The model was trained with the following parameters:

  • training arguments
batch_size = 8
num_train_epochs = 8 

args = Seq2SeqTrainingArguments(
    
    evaluation_strategy="epoch",
    learning_rate=5.6e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    weight_decay=0.01, #weight_decay:
    save_total_limit=3,#:
    num_train_epochs=num_train_epochs,
    predict_with_generate=True,
    logging_steps=logging_steps,
    push_to_hub=True,
    save_steps=1000,
)
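A minimal sketch of how these arguments plug into `Seq2SeqTrainer`, using the daekeun-ml/naver-news-summarization-ko dataset from the evaluation table. The base checkpoint, column names, sequence lengths, and split names are assumptions for illustration; the card does not state which base model was fine-tuned.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
)

checkpoint = "paust/pko-t5-base"  # hypothetical Korean T5 base checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

dataset = load_dataset("daekeun-ml/naver-news-summarization-ko")

def preprocess(batch):
    # "document"/"summary" are assumed to be the dataset's text and target columns
    model_inputs = tokenizer(batch["document"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True,
                        remove_columns=dataset["train"].column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,  # the Seq2SeqTrainingArguments defined above
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],  # assumed split name
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()
```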