---
license: cc-by-4.0
language:
- en
tags:
- summarization
- text-generation
- NLP
- transformers
datasets:
- your-dataset-name
---
# BART Fine-Tuned Summarization Model
This repository hosts a **BART-based model fine-tuned for text summarization** on a custom dataset of articles and highlights. The model is suitable for **generating concise summaries from long-form text**.
---
## Model Overview
- **Base Model:** `facebook/bart-large-cnn`
- **Task:** Text Summarization
- **Fine-Tuning Dataset:** Custom CSV dataset with `article` and `highlights` columns, renamed to `document` and `summary` during preprocessing
- **Dataset Size:** Varies depending on your CSV file
- **Framework:** Hugging Face Transformers
- **Language:** English
---
## Dataset Preparation
1. Load your CSV dataset, which should contain `article` and `highlights` columns, and rename them to `document` and `summary`.
2. Clean the dataset by removing rows with missing or non-string entries.
3. Split the dataset into **train** and **validation** sets (80/20 split).
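Steps 1 and 2 could be sketched with pandas roughly as follows. This is a minimal illustration, not the exact preprocessing used for this model: the helper name `load_and_clean` and the in-memory `sample_csv` rows are hypothetical stand-ins for your own file.

```python
import io

import pandas as pd


def load_and_clean(csv_source):
    """Load a CSV with `article`/`highlights` columns and return a cleaned DataFrame."""
    df = pd.read_csv(csv_source)
    # Step 1: rename the raw columns to the names used in the rest of this card.
    df = df.rename(columns={"article": "document", "highlights": "summary"})
    df = df[["document", "summary"]]
    # Step 2: drop rows with missing values, then rows whose cells are not strings.
    df = df.dropna(subset=["document", "summary"])
    is_text = df["document"].map(lambda v: isinstance(v, str)) & df["summary"].map(
        lambda v: isinstance(v, str)
    )
    # Reset the index so the old row labels are not carried into later steps.
    return df[is_text].reset_index(drop=True)


# Tiny in-memory CSV standing in for the real file (made-up rows, illustration only).
sample_csv = io.StringIO(
    "article,highlights\n"
    "Solar output rose sharply last year.,Solar output rose.\n"
    ",Summary with no article.\n"
    "Article with no summary.,\n"
)
df = load_and_clean(sample_csv)
print(df.shape)
```

For a real dataset, pass the path to your CSV file instead of `sample_csv`; the returned `df` is the DataFrame that the conversion and split step operates on.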
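```python
from datasets import Dataset
# `df` is the cleaned DataFrame; preserve_index=False avoids an extra index column
dataset = Dataset.from_pandas(df, preserve_index=False)
# 80/20 split; the validation portion is stored under the "test" key
dataset = dataset.train_test_split(test_size=0.2, seed=42)
```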