BasitAliii committed on
Commit
f8c839e
·
verified ·
1 Parent(s): f935d08

Update README.md

  1. README.md +40 -3
README.md CHANGED
@@ -1,3 +1,40 @@
1
- ---
2
- license: mit
3
- ---
1
+ ---
2
+ license: cc-by-4.0
3
+ language:
4
+ - en
5
+ tags:
6
+ - summarization
7
+ - text-generation
8
+ - NLP
9
+ - transformers
10
+ datasets:
11
+ - your-dataset-name
12
+ ---
13
+
14
+ # BART Fine-Tuned Summarization Model
15
+
16
+ This repository hosts a **BART-based model fine-tuned for text summarization** on a custom dataset of articles and highlights. The model is suitable for **generating concise summaries from long-form text**.
17
+
18
+ ---
19
+
20
+ ## Model Overview
21
+
22
+ - **Base Model:** `facebook/bart-large-cnn`
23
+ - **Task:** Text Summarization
24
+ - **Fine-Tuning Dataset:** Custom CSV dataset whose `article` and `highlights` columns are renamed to `document` and `summary`
25
+ - **Dataset Size:** Varies depending on your CSV file
26
+ - **Framework:** Hugging Face Transformers
27
+ - **Language:** English
28
+
29
+ ---
30
+
31
+ ## Dataset Preparation
32
+
33
+ 1. Load your CSV dataset and rename its columns: `article` becomes `document` and `highlights` becomes `summary`.
34
+ 2. Clean the dataset by removing missing or non-string entries.
35
+ 3. Split the dataset into **train** and **validation** sets (80/20 split).
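
Steps 1 and 2 might look roughly like the sketch below. It is not taken from the original training script: the helper name `prepare_dataframe` and the tiny in-memory frame are hypothetical stand-ins, and only the column names come from the list above. The resulting `df` is the kind of DataFrame that `Dataset.from_pandas` expects in step 3.

```python
import pandas as pd


def prepare_dataframe(df: pd.DataFrame) -> pd.DataFrame:
    """Rename the raw columns and drop rows that are missing or not strings."""
    # Step 1: rename the raw CSV columns to the names used for fine-tuning.
    df = df.rename(columns={"article": "document", "highlights": "summary"})
    df = df[["document", "summary"]]
    # Step 2a: drop rows with a missing value in either column.
    df = df.dropna(subset=["document", "summary"])
    # Step 2b: keep only rows where both fields are actual strings.
    is_text = df["document"].map(lambda v: isinstance(v, str)) & df["summary"].map(
        lambda v: isinstance(v, str)
    )
    return df[is_text].reset_index(drop=True)


# Hypothetical in-memory data standing in for the real CSV file.
raw = pd.DataFrame(
    {
        "article": ["A long news article.", None, 12345],
        "highlights": ["Short summary.", "Orphan summary.", "Numeric article."],
    }
)
df = prepare_dataframe(raw)
print(df)
```

With a real file, `raw` would instead come from `pd.read_csv(...)` pointed at your own CSV path.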
36
+
37
+ ```python
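38
+ from datasets import Dataset
39
+ dataset = Dataset.from_pandas(df, preserve_index=False)  # df: cleaned DataFrame from steps 1-2; do not carry the pandas index over as a column
40
+ dataset = dataset.train_test_split(test_size=0.2, seed=42)  # 80/20 split; the validation portion is stored under the "test" key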