---
language: en
license: apache-2.0
base_model: microsoft/prophetnet-large-uncased
tags:
- summarization
- research-paper
- seq2seq
- prophetnet
- lora
- peft
datasets:
- custom
metrics:
- rouge
- bertscore
---

# ProphetNet-Large-Summarization

A fine-tuned version of [microsoft/prophetnet-large-uncased](https://huggingface.co/microsoft/prophetnet-large-uncased) that condenses research papers into concise summaries. It is the first stage of a two-step **Research Paper Simplifier** pipeline.

## Model Description

This model takes a section of a research paper as input and generates a plain-language summary. It was fine-tuned using LoRA (PEFT) with 4-bit quantization for efficient training.

## Pipeline

```
Research Paper ──► [ProphetNet-Large-Summarization] ──► Summary ──► [ProphetNet-Large-Story-Generation] ──► Story
```
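
The two stages can be chained with a thin wrapper. The sketch below is only an illustration of the data flow: `simplify_paper` is a hypothetical helper, and the two callables stand in for whatever code runs the summarization and story-generation models.

```python
from typing import Callable, Dict


def simplify_paper(
    section: str,
    summarize: Callable[[str], str],
    tell_story: Callable[[str], str],
) -> Dict[str, str]:
    """Run the two pipeline stages in order and keep both outputs."""
    summary = summarize(section)   # stage 1: paper section -> summary
    story = tell_story(summary)    # stage 2: summary -> story
    return {"summary": summary, "story": story}


# Placeholder stages for demonstration only; real stages would call the models.
result = simplify_paper(
    "Some research paper section.",
    summarize=lambda text: f"summary of: {text}",
    tell_story=lambda text: f"story from: {text}",
)
print(result["story"])
```

Passing the stages in as callables keeps the wrapper independent of how each model is loaded.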

## Training Details

| Parameter | Value |
|-----------|-------|
| Base model | microsoft/prophetnet-large-uncased |
| Task | Summarization |
| Max input length | 2048 tokens |
| Max target length | 256 tokens |
| Learning rate | 3e-5 |
| Batch size | 2 |
| Gradient accumulation steps | 4 |
| Warmup steps | 1500 |
| Weight decay | 0.01 |
| Fine-tuning method | LoRA (r=16, alpha=64, targets: query_proj, value_proj) |
| Quantization | 4-bit NF4 (bitsandbytes) |
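
The LoRA and quantization rows map onto configuration objects roughly as follows. This is a minimal sketch, not the author's training script: it assumes the `peft`, `transformers`, and `bitsandbytes` packages, and `build_configs` is a hypothetical helper name. Settings the table does not list (dropout, task type, compute dtype) are left at library defaults.

```python
# Values copied from the Training Details table.
LORA_R = 16
LORA_ALPHA = 64
TARGET_MODULES = ["query_proj", "value_proj"]
BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = LORA_ALPHA / LORA_R

# Examples contributing to each optimizer step, assuming a single device.
effective_batch_size = BATCH_SIZE * GRAD_ACCUM_STEPS


def build_configs():
    """Hypothetical helper: build 4-bit NF4 and LoRA configs from the values above."""
    # Imported lazily so the arithmetic above runs without the heavy dependencies.
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")
    lora_config = LoraConfig(
        r=LORA_R,
        lora_alpha=LORA_ALPHA,
        target_modules=TARGET_MODULES,
    )
    return quant_config, lora_config


print(lora_scaling, effective_batch_size)
```

With these values the update is scaled by a factor of 4, and each optimizer step sees 8 examples per device.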

## Usage

```python
# The repository is tagged as a PEFT (LoRA) adapter, so `peft` may need to be
# installed alongside `transformers` for loading to work.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("harsharajkumar273/ProphetNet-Large-Summarization")
model = AutoModelForSeq2SeqLM.from_pretrained("harsharajkumar273/ProphetNet-Large-Summarization")

text = "Your research paper section here..."

# Ask for a summary of about one tenth of the input's word count.
word_count = len(text.split())
prompt = f"Summarize this part of the research paper to less than {word_count // 10} words:\n{text}"

# Truncate the input to 2048 tokens and cap the output at 256 tokens.
inputs = tokenizer(prompt, return_tensors="pt", max_length=2048, truncation=True)
outputs = model.generate(**inputs, max_length=256, num_beams=4)

summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```
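
The prompt in the usage snippet asks for fewer than `word_count // 10` words, which evaluates to 0 for inputs shorter than ten words. A small helper can guard against that. This is a sketch under assumptions: `build_summary_prompt` and its `min_words` floor are hypothetical additions, not part of the original card; only the prompt wording is taken from the snippet.

```python
def build_summary_prompt(text: str, ratio: int = 10, min_words: int = 1) -> str:
    """Build the summarization prompt with a word limit of about 1/ratio of the input."""
    word_count = len(text.split())
    # Floor the limit so very short inputs never request "less than 0 words".
    limit = max(min_words, word_count // ratio)
    return f"Summarize this part of the research paper to less than {limit} words:\n{text}"


section = " ".join(["word"] * 25)
print(build_summary_prompt(section).splitlines()[0])
```

The returned string can be passed to the tokenizer in place of the inline `prompt`.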

## Evaluation Metrics

The model was evaluated with ROUGE and BERTScore on a held-out test split (10% of the data).
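
To make the ROUGE idea concrete, the sketch below computes a simplified ROUGE-1 F1 from unigram overlap. It is an illustration only, not the evaluation code used for this model: `rouge1_f1` is a hypothetical helper that lowercases and splits on whitespace, whereas established ROUGE packages apply their own tokenization and optional stemming, so their scores can differ.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1(
    "the model summarizes research papers",
    "the model summarizes papers",
)
print(round(score, 3))
```

BERTScore, by contrast, compares contextual embeddings rather than exact word matches, so it needs a pretrained model to compute.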

## Related Models

- [harsharajkumar273/Bart-Base-Summarization](https://huggingface.co/harsharajkumar273/Bart-Base-Summarization)
- [harsharajkumar273/T5-Base-Summarization](https://huggingface.co/harsharajkumar273/T5-Base-Summarization)
- [harsharajkumar273/ProphetNet-Large-Story-Generation](https://huggingface.co/harsharajkumar273/ProphetNet-Large-Story-Generation) (next stage)