Description

This model is facebook/bart-large-xsum fine-tuned on the SamSum dataset for dialogue summarization.

Development

Kaggle Notebook: Text Summarization with Large Language Models

Usage

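from transformers import pipeline

# Load the fine-tuned checkpoint as a summarization pipeline.
summarizer = pipeline("summarization", model="luisotorres/bart-finetuned-samsum")

# One turn per line, formatted as "Speaker: utterance".
conversation = '''Sarah: Do you think it's a good idea to invest in Bitcoin?
Emily: I'm skeptical. The market is very volatile, and you could lose money.
Sarah: True. But there's also a high upside, right?
'''

# The pipeline returns a list with one dict per input; the text is under "summary_text".
print(summarizer(conversation)[0]["summary_text"])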
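The pipeline takes one string per conversation, with each turn on its own line as `Speaker: utterance`, and returns a list of dictionaries whose `summary_text` key holds the summary. The sketch below is one way to build that input from structured turns. `format_dialogue` and `summarize` are hypothetical helper names, and the `max_length` and `min_length` values are illustrative assumptions, not settings taken from this model card.

```python
from typing import Iterable, Tuple


def format_dialogue(turns: Iterable[Tuple[str, str]]) -> str:
    """Join (speaker, utterance) pairs into one 'Speaker: text' line per turn."""
    return "\n".join(f"{speaker.strip()}: {text.strip()}" for speaker, text in turns)


def summarize(dialogue: str, max_length: int = 60, min_length: int = 5) -> str:
    """Summarize one dialogue; the model is downloaded on first use."""
    # Imported lazily so format_dialogue works without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model="luisotorres/bart-finetuned-samsum")
    # Length limits are assumptions; do_sample=False keeps generation deterministic.
    result = summarizer(
        dialogue, max_length=max_length, min_length=min_length, do_sample=False
    )
    return result[0]["summary_text"]


turns = [
    ("Sarah", "Do you think it's a good idea to invest in Bitcoin?"),
    ("Emily", "I'm skeptical. The market is very volatile, and you could lose money."),
    ("Sarah", "True. But there's also a high upside, right?"),
]
dialogue = format_dialogue(turns)

# Uncomment to run the model (requires transformers and a network connection):
# print(summarize(dialogue))
```

Building the pipeline once and reusing it across many dialogues would avoid reloading the weights on every call; the helper above reloads them for brevity.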

Training Parameters

evaluation_strategy="epoch",
save_strategy="epoch",
load_best_model_at_end=True,
metric_for_best_model="eval_loss",
seed=42,
learning_rate=2e-5,
per_device_train_batch_size=4,
per_device_eval_batch_size=4,
gradient_accumulation_steps=2,
weight_decay=0.01,
save_total_limit=2,
num_train_epochs=4,
predict_with_generate=True,
fp16=True,
report_to="none"
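
These names match keyword arguments of `Seq2SeqTrainingArguments` in transformers, although the card does not show the surrounding call, so treat that as an assumption. One consequence worth spelling out is the effective batch size: with gradient accumulation, the optimizer updates once per `per_device_train_batch_size * gradient_accumulation_steps` examples on each device. The sketch below computes it. `effective_batch_size` is a hypothetical helper, and `num_devices=1` is an assumption, since the card does not state the hardware used.

```python
# Subset of the training parameters listed above that is relevant to batch size.
training_params = {
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 4,
    "learning_rate": 2e-5,
}


def effective_batch_size(params: dict, num_devices: int = 1) -> int:
    """Examples consumed per optimizer update across all devices."""
    return (
        params["per_device_train_batch_size"]
        * params["gradient_accumulation_steps"]
        * num_devices
    )


# num_devices=1 is an assumption; the card does not say how many GPUs were used.
print(effective_batch_size(training_params))  # 8
```

Recent transformers releases rename `evaluation_strategy` to `eval_strategy`, so the first parameter may need adjusting depending on the installed version.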

Reference

This model is based on the original BART architecture, as detailed in:

Lewis et al. (2019). BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. arXiv:1910.13461

Model size: 0.4B params (Safetensors, tensor type F32)

Evaluation results

All scores are self-reported ROUGE on SamSum.

Metric         Validation    Test
ROUGE-1        53.880        52.816
ROUGE-2        29.233        28.126
ROUGE-L        44.774        43.715
ROUGE-L Sum    49.825        48.571
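
To help interpret the ROUGE scores: ROUGE-N measures n-gram overlap between a generated summary and a reference, and the figures reported here appear to be F-measures scaled to 0-100. The following is a simplified sketch of ROUGE-N F1 using lowercase whitespace tokens. It is not the scorer used to produce the reported numbers (the card does not identify one), and it omits the stemming and normalization that standard implementations apply. `rouge_n_f1` is a hypothetical helper name, and the two example sentences are made up for illustration.

```python
from collections import Counter


def ngrams(tokens: list, n: int) -> Counter:
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate: str, reference: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1: lowercase, whitespace tokens, no stemming."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it occurs in both texts.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Made-up example pair, not taken from SamSum.
reference = "Sarah is considering investing in Bitcoin"
candidate = "Sarah is thinking about investing in Bitcoin"
print(round(100 * rouge_n_f1(candidate, reference, n=1), 1))
print(round(100 * rouge_n_f1(candidate, reference, n=2), 1))
```

ROUGE-L, also reported here, is based on the longest common subsequence rather than fixed-length n-grams, so it is not covered by this sketch.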