Update README.md
README.md
CHANGED
@@ -6,4 +6,52 @@ language:
- en
base_model:
- google/flan-t5-small
tags:
- summarization
- research-papers
- arxiv
- t5
---

# Fine-Tuned Summarization Model (`fine-tuned-summarization-arxiv`)
This model is a version of [`google/flan-t5-small`](https://huggingface.co/google/flan-t5-small) fine-tuned on the arXiv subset of the `armanc/scientific_papers` dataset. It is optimized for **summarizing scientific abstracts**.

## Model Details
- **Base Model:** `google/flan-t5-small`
- **Training Data:** arXiv research papers (`article` → `abstract`)
- **Fine-Tuned Task:** Text summarization
- **Use Case:** Generate short summaries of long research papers
- **License:** Apache 2.0

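The `article` → `abstract` mapping listed above can be sketched as a small preprocessing step. This is a minimal illustration, not the training script used for this model: the `build_example` helper and the toy record are hypothetical, and reusing the `Summarize: ` prefix from the How to Use section is an assumption about how the training inputs were formatted.

```python
# Hypothetical helper: turns one dataset record into an (input, target) pair.
# The "Summarize: " prefix is an assumption borrowed from the usage example.
PREFIX = "Summarize: "


def build_example(record: dict) -> dict:
    """Map a record with `article` and `abstract` fields to model text fields."""
    # Collapse newlines and runs of spaces so each field is a single line.
    article = " ".join(record["article"].split())
    abstract = " ".join(record["abstract"].split())
    return {"input_text": PREFIX + article, "target_text": abstract}


# Toy record for illustration only; real records come from the dataset.
record = {
    "article": "Deep learning has transformed many fields.\nWe propose a new CNN.",
    "abstract": "A CNN model is proposed.",
}
example = build_example(record)
print(example["input_text"])
print(example["target_text"])
```

The article becomes the model input and the abstract becomes the target, which matches the direction of the arrow in the list above.
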
## How to Use
```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hub.
model = T5ForConditionalGeneration.from_pretrained("Talina06/arxiv-summarization")
tokenizer = T5Tokenizer.from_pretrained("Talina06/arxiv-summarization")

text = "Summarize: Deep learning is being used to advance medical research, particularly in cancer detection."

# Tokenize the prompt; truncation at 512 tokens is an illustrative choice for long inputs.
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# max_new_tokens=64 is an illustrative value; without a limit, generate() may stop at a short default length.
summary_ids = model.generate(**inputs, max_new_tokens=64)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

print("Generated Summary:", summary)
```

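Full papers are often longer than a T5-style model can comfortably take in one pass, so one common workaround is to split the text into chunks and summarize each chunk separately. The sketch below shows only the splitting step in plain Python. The `chunk_words` helper and the 300-word chunk size are illustrative assumptions, not part of this model or of the Transformers library.

```python
def chunk_words(text: str, max_words: int = 300) -> list[str]:
    """Split text into consecutive chunks of at most `max_words` words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


paper = "word " * 650  # stand-in for a long paper body
chunks = chunk_words(paper, max_words=300)
print([len(chunk.split()) for chunk in chunks])
```

Each chunk can then be prefixed with `Summarize: ` and passed through the `tokenizer` and `model.generate` calls shown above, with the partial summaries joined afterward. Word counts are only a rough proxy for token counts, so the chunk size may need tuning.
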
## Training Details
- **Training Data:** 100k+ arXiv research papers
- **Training Framework:** Hugging Face Transformers
- **Hyperparameters:**
  - Learning Rate: `5e-5`
  - Batch Size: `8`
  - Epochs: `10`
- **Hardware Used:** TPU & GPU

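As a rough sanity check on the scale of this run, the hyperparameters listed above imply the number of optimizer steps computed below. This is back-of-the-envelope arithmetic that assumes exactly 100,000 training examples (the card only says 100k+), a single device, and no gradient accumulation, none of which the card confirms.

```python
import math

num_examples = 100_000  # assumption: lower bound implied by "100k+"
batch_size = 8          # from the hyperparameter list
epochs = 10             # from the hyperparameter list

# One optimizer step per batch; the last partial batch still counts as a step.
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * epochs

print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps: {total_steps}")
```

Under these assumptions the run takes 12,500 steps per epoch and 125,000 steps in total; a larger dataset or a multi-device setup would change both figures.
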
## Limitations
- ❌ May struggle with **very technical** papers (e.g., papers dense with complex math formulas).

## Example Summaries
| **Original Abstract** | **Generated Summary** |
|----------------------|----------------------|
| "Deep learning has transformed many fields... We propose a new CNN for cancer detection..." | "A CNN model is proposed for cancer detection using deep learning." |
| "Quantum computing has shown potential for cryptographic applications..." | "Quantum computing can be used in cryptography." |