---
license: cc-by-4.0
language:
  - en
tags:
  - summarization
  - text-generation
  - NLP
  - transformers
datasets:
  - your-dataset-name
---

# BART Fine-Tuned Summarization Model

This repository hosts a **BART-based model fine-tuned for text summarization** on a custom dataset of articles and highlights. The model is suitable for **generating concise summaries from long-form text**.

---

## Model Overview

- **Base Model:** `facebook/bart-large-cnn`
- **Task:** Text Summarization
- **Fine-Tuning Dataset:** Custom CSV dataset with `article` and `highlights` columns, renamed to `document` and `summary` for training
- **Dataset Size:** Varies depending on your CSV file
- **Framework:** Hugging Face Transformers
- **Language:** English

---

## Dataset Preparation

1. Load your CSV dataset and rename its `article` column to `document` and its `highlights` column to `summary`.  
2. Clean the dataset by removing missing or non-string entries.  
3. Split the dataset into **train** and **validation** sets (80/20 split).  
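
Steps 1 and 2 can be sketched roughly as follows, assuming the CSV is read with pandas. The helper name `load_and_clean` and the in-memory sample are hypothetical, used here only for illustration; substitute the path to your own CSV file.

```python
import io

import pandas as pd


def load_and_clean(csv_source):
    """Load a CSV with `article`/`highlights` columns and return a cleaned DataFrame."""
    df = pd.read_csv(csv_source)
    # Step 1: rename the raw columns to the names used for training.
    df = df.rename(columns={"article": "document", "highlights": "summary"})
    df = df[["document", "summary"]]
    # Step 2: drop rows with missing values, then rows whose cells are not strings.
    df = df.dropna(subset=["document", "summary"])
    is_text = df["document"].map(lambda v: isinstance(v, str)) & df["summary"].map(
        lambda v: isinstance(v, str)
    )
    # Reset the index so the dropped rows leave no gaps in the row labels.
    return df[is_text].reset_index(drop=True)


# Hypothetical in-memory sample standing in for a real CSV file.
sample_csv = io.StringIO(
    "article,highlights\n"
    '"A long news article about a local election.","Election summary."\n'
    '"An article with no highlights.",\n'
    '"Another article about the weather.","Weather summary."\n'
)
df = load_and_clean(sample_csv)
print(df)
```

The resulting `df` holds only complete `document`/`summary` pairs and is ready to be converted into a `Dataset` for the split in step 3.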

```python
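from datasets import Dataset

# `df` is the cleaned pandas DataFrame with `document` and `summary` columns.
dataset = Dataset.from_pandas(df)
# 80/20 split; the held-out 20% is stored under the "test" key and used for validation.
dataset = dataset.train_test_split(test_size=0.2, seed=42)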