| --- |
| language: en |
| license: apache-2.0 |
| base_model: facebook/bart-base |
| tags: |
| - summarization |
| - research-paper |
| - seq2seq |
| - bart |
| datasets: |
| - custom |
| metrics: |
| - rouge |
| - bertscore |
| --- |
| |
| # Bart-Base-Summarization |
|
|
| A fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) that condenses sections of research papers into concise summaries. It is the first stage of a two-step **Research Paper Simplifier** pipeline. |
|
|
| ## Model Description |
|
|
| This model takes a section of a research paper as input and generates a plain-language summary approximately 1/10th the length of the original text. It was fine-tuned end-to-end (no LoRA) on a custom dataset of research papers. |
|
|
| ## Pipeline |
|
|
| ``` |
| Research Paper ──► [Bart-Base-Summarization] ──► Summary ──► [Bart-Base-Story-Generation] ──► Story |
| ``` |
|
|
| ## Training Details |
|
|
| | Parameter | Value | |
| |-----------|-------| |
| | Base model | facebook/bart-base | |
| | Task | Summarization | |
| | Max input length | 1024 tokens | |
| | Max target length | 128 tokens | |
| | Learning rate | 5e-5 | |
| | Batch size | 8 | |
| | Warmup steps | 1000 | |
| | Weight decay | 0.01 | |
| | Fine-tuning method | Full fine-tuning | |
|
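The original training script is not part of this card, so the following is only a minimal sketch of how the values in the table could map onto `transformers.Seq2SeqTrainingArguments`. The output path, the reading of "batch size" as per-device, and the `predict_with_generate` setting are assumptions; the number of epochs, the optimizer, and the scheduler are not stated and are left at library defaults.

```python
# Hyperparameters copied from the table above. The key names follow the
# keyword arguments of transformers.Seq2SeqTrainingArguments.
TRAINING_CONFIG = {
    "output_dir": "bart-base-summarization",  # hypothetical path
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,  # assumption: batch size is per device
    "warmup_steps": 1000,
    "weight_decay": 0.01,
    "predict_with_generate": True,  # assumption: generate summaries during eval
    "generation_max_length": 128,  # matches the max target length
}

# Tokenizer limits from the table, applied when preparing inputs and targets.
MAX_INPUT_LENGTH = 1024
MAX_TARGET_LENGTH = 128


def build_training_args(config=TRAINING_CONFIG):
    """Create Seq2SeqTrainingArguments from the config dictionary."""
    # Imported lazily so the config can be inspected without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**config)
```

Calling `build_training_args()` returns an object that can be passed to a `Seq2SeqTrainer` along with the model, tokenizer, and tokenized dataset.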
|
| ## Usage |
|
|
| ```python |
| from transformers import AutoTokenizer, AutoModelForSeq2SeqLM |
| |
| tokenizer = AutoTokenizer.from_pretrained("harsharajkumar273/Bart-Base-Summarization") |
| model = AutoModelForSeq2SeqLM.from_pretrained("harsharajkumar273/Bart-Base-Summarization") |
| |
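| text = "Your research paper section here..." |
| # Ask for roughly one tenth of the input length, but never fewer than one word. |
| target_words = max(1, len(text.split()) // 10) |
| prompt = f"Summarize this part of the research paper to less than {target_words} words:\n{text}" |
| |
| # Inputs longer than 1024 tokens are truncated to match the training setup. |
| inputs = tokenizer(prompt, return_tensors="pt", max_length=1024, truncation=True) |
| # Beam search with a 128-token cap, the same limit used for training targets. |
| outputs = model.generate(**inputs, max_length=128, num_beams=4, length_penalty=1.0) |
| summary = tokenizer.decode(outputs[0], skip_special_tokens=True) |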
| print(summary) |
| ``` |
|
|
| ## Evaluation Metrics |
|
|
| The model was evaluated with ROUGE and BERTScore on a held-out test split comprising 10% of the dataset. |
|
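To make the evaluation setup concrete, here is a dependency-free sketch of a 90/10 hold-out split and a simplified ROUGE-1 F1 score. Both helpers (`split_holdout`, `rouge1_f1`) are hypothetical illustrations, not the code used for this model: the real ROUGE metric applies its own tokenization and optional stemming, and BERTScore requires a pretrained encoder, so neither is reproduced exactly here.

```python
import random
from collections import Counter


def split_holdout(examples, test_fraction=0.1, seed=0):
    """Shuffle the examples and hold out test_fraction of them for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def rouge1_f1(prediction, reference):
    """Unigram-overlap F1 between two strings (lowercased, whitespace tokens)."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


train, test = split_holdout(range(100))
print(len(train), len(test))  # 90 10
print(rouge1_f1("the model summarizes papers",
                "the model summarizes research papers"))
```

For reported numbers, an established implementation of ROUGE and BERTScore is preferable to this simplified version.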
|
| ## Related Models |
|
|
| - [harsharajkumar273/T5-Base-Summarization](https://huggingface.co/harsharajkumar273/T5-Base-Summarization) |
| - [harsharajkumar273/ProphetNet-Large-Summarization](https://huggingface.co/harsharajkumar273/ProphetNet-Large-Summarization) |
| - [harsharajkumar273/Bart-Base-Story-Generation](https://huggingface.co/harsharajkumar273/Bart-Base-Story-Generation) (next stage of the pipeline) |
|
|