import streamlit as st
from transformers import pipeline
import huggingface_hub
import os
# Load Hugging Face API key from environment variable
HF_API_KEY = os.getenv("HF_API_KEY")
# Login to Hugging Face Hub (skip if no key is configured, e.g. for public models)
if HF_API_KEY:
    huggingface_hub.login(HF_API_KEY)
# Load the fine-tuned model from Hugging Face
@st.cache_resource
def load_model():
    return pipeline("summarization", model="Darshan03/t5-model-small")
summarizer = load_model()
# Main app interface
st.title("T5 Headline Generator")
st.write("Generate concise headlines from news articles using a fine-tuned T5 model.")
# Input text box
article = st.text_area("Enter your news article below:", height=300)
if st.button("Generate Headline"):
    if article.strip():
        with st.spinner("Generating headline..."):
            headline = summarizer(article, max_length=42, min_length=5, do_sample=False)[0]['summary_text']
        st.success("Generated Headline:")
        st.write(headline)
    else:
        st.warning("Please enter a news article to generate a headline.")
# Sidebar documentation
st.sidebar.title("📄 Project Documentation")
st.sidebar.markdown("""
### Dataset Description, Pre-processing, and Exploratory Data Analysis (EDA)
The dataset used in this project comprises labeled news articles with corresponding captions. The columns in the dataset include "News Article" and "Caption." The dataset is pre-processed by:
- Renaming columns for consistency
- Dropping missing values
**Key Statistics:**
- Average length of news articles: 997.53 words
- Average length of captions: 42.29 words
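The pre-processing steps and length statistics above can be reproduced with a short pandas sketch (the original column names are used; the sample rows below are purely illustrative):

```python
import pandas as pd

# Illustrative rows only; the real dataset has the same two columns.
df = pd.DataFrame({
    "News Article": ["markets rallied today as tech stocks surged", None],
    "Caption": ["markets rally on tech surge", None],
})

# Pre-processing: rename columns for consistency, drop missing values.
df = df.rename(columns={"News Article": "article", "Caption": "caption"}).dropna()

# Key statistics: average length in words.
avg_article_len = df["article"].str.split().str.len().mean()
avg_caption_len = df["caption"].str.split().str.len().mean()
```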
---
### Model Selection
#### I. Qwen-2.5 0.5B Model
- **Tuning Approach:** Instruction Tuning
- **Insights:** The model yielded precise answers but often produced long summaries even after instruction tuning. This behavior was unsuitable for the task's objective of generating concise captions.
#### II. T5-base Model
- **Tuning Approach:** Normal Fine-tuning
- **Rationale:** The T5-base model was ultimately chosen due to its ability to generate more concise and contextually relevant captions.
---
### Hyperparameters and Tuning Process
- **Model Name:** `t5-base`
- **Max Token Length:** 256
- **Batch Size:** 8
- **Learning Rate:** 2e-5
- **Number of Training Epochs:** 7
- **Gradient Accumulation Steps:** 2
---
### Training Process
#### Data Split
The dataset was not split into training, validation, and test sets, because separate datasets were already available for training and evaluation.
#### Techniques Used
- **Tokenization:** Used the T5 tokenizer to convert text into input IDs with padding and truncation.
- **Evaluation Strategy:** Evaluated the model using ROUGE scores at regular intervals during training.
- **Early Stopping:** Used to prevent overfitting.
- **Mixed Precision Training:** Enabled via `fp16` for faster training and reduced memory usage.
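As a sketch, the tokenization step above might look like the following (the function name, column names, and 64-token target length are assumptions, not the project's exact code; the tokenizer is passed in so any T5 tokenizer works):

```python
# Hypothetical sketch of the tokenization step described above.
# `batch` is a dict with "News Article" and "Caption" lists, as in the dataset.
def preprocess(batch, tokenizer, max_input_len=256, max_target_len=64):
    # Prefix inputs with T5's standard summarization prompt, then
    # pad/truncate to the fixed max token length.
    model_inputs = tokenizer(
        ["summarize: " + text for text in batch["News Article"]],
        max_length=max_input_len,
        padding="max_length",
        truncation=True,
    )
    # Tokenize the target captions and attach them as labels.
    labels = tokenizer(
        batch["Caption"],
        max_length=max_target_len,
        padding="max_length",
        truncation=True,
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```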
---
### Training Arguments
- `output_dir="./t5-headline-generator"`
- `per_device_train_batch_size=8`
- `per_device_eval_batch_size=8`
- `evaluation_strategy="steps"`
- `save_total_limit=2`
- `push_to_hub=True`
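Putting the hyperparameters and training arguments above together, the trainer setup might look roughly like this (a sketch, not the project's exact script; `model`, the dataset variables, and the early-stopping patience value are assumptions):

```python
from transformers import (
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
    EarlyStoppingCallback,
)

training_args = Seq2SeqTrainingArguments(
    output_dir="./t5-headline-generator",
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    learning_rate=2e-5,
    num_train_epochs=7,
    gradient_accumulation_steps=2,
    evaluation_strategy="steps",
    save_total_limit=2,
    fp16=True,                    # mixed precision, as noted above
    load_best_model_at_end=True,  # required for early stopping
    push_to_hub=True,
)

trainer = Seq2SeqTrainer(
    model=model,                  # the t5-base model being fine-tuned (assumed)
    args=training_args,
    train_dataset=train_dataset,  # tokenized train/eval datasets (assumed)
    eval_dataset=eval_dataset,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
```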
---
### Future Work
- Experiment with larger variants of the T5 model to improve performance.
- Explore the use of data augmentation to enhance the dataset.
---
### Conclusion
This project successfully fine-tuned a T5 model for generating captions from news articles. The T5-base model demonstrated strong performance, providing more concise and contextually relevant summaries than the Qwen-2.5 0.5B model. The project lays the foundation for further improvements and potential deployment.
---
### Live App Demonstration
The developed app allows users to input a news article and receive a generated headline. The app is hosted on Hugging Face Hub and can be accessed via the following link:
[**Darshan03/LogicLoom-app**](https://huggingface.co/spaces/Darshan03/LogicLoom-app)
""")