import os

import huggingface_hub
import streamlit as st
from transformers import pipeline

# Load the Hugging Face API key from an environment variable
HF_API_KEY = os.getenv("HF_API_KEY")

# Log in to the Hugging Face Hub
huggingface_hub.login(HF_API_KEY)

# Load the fine-tuned model from the Hugging Face Hub.
# Cached so the pipeline is built once, not on every Streamlit rerun.
@st.cache_resource
def load_model():
    return pipeline("summarization", model="Darshan03/t5-model-small")

summarizer = load_model()
# Main app interface
st.title("T5 Headline Generator")
st.write("Generate concise headlines from news articles using a fine-tuned T5 model.")

# Input text box
article = st.text_area("Enter your news article below:", height=300)

if st.button("Generate Headline"):
    if article.strip():
        with st.spinner("Generating headline..."):
            headline = summarizer(article, max_length=42, min_length=5, do_sample=False)[0]['summary_text']
        st.success("Generated Headline:")
        st.write(headline)
    else:
        st.warning("Please enter a news article to generate a headline.")
# Sidebar documentation
st.sidebar.title("📄 Project Documentation")
st.sidebar.markdown("""
### Dataset Description, Pre-processing, and Exploratory Data Analysis (EDA)

The dataset used in this project comprises labeled news articles with corresponding captions; its columns are "News Article" and "Caption". The dataset is pre-processed by:

- Renaming columns for consistency
- Dropping rows with missing values

**Key Statistics:**

- Average length of news articles: 997.53 words
- Average length of captions: 42.29 words
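The pre-processing and statistics above can be sketched in pandas as follows (the rows and the renamed column names are illustrative, not taken from the real dataset):

```python
import pandas as pd

# Toy frame standing in for the real dataset (hypothetical rows)
df = pd.DataFrame({
    "News Article": ["The city council approved the new budget today.", None],
    "Caption": ["Council approves budget", "Orphan caption"],
})

# Rename columns for consistency, then drop rows with missing values
df = df.rename(columns={"News Article": "article", "Caption": "caption"}).dropna()

# Average lengths in words, computed as in the statistics above
avg_article_len = df["article"].str.split().str.len().mean()
avg_caption_len = df["caption"].str.split().str.len().mean()
```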
---

### Model Selection

#### I. Qwen-2.5 0.5B Model

- **Tuning Approach:** Instruction tuning
- **Insights:** The model yielded precise answers but often produced long summaries even after instruction tuning, which was unsuitable for the task's objective of generating concise captions.

#### II. T5-base Model

- **Tuning Approach:** Standard fine-tuning
- **Rationale:** The T5-base model was ultimately chosen for its ability to generate more concise and contextually relevant captions.

---

### Hyperparameters and Tuning Process

- **Model Name:** `t5-base`
- **Max Token Length:** 256
- **Batch Size:** 8
- **Learning Rate:** 2e-5
- **Number of Training Epochs:** 7
- **Gradient Accumulation Steps:** 2

---

### Training Process

#### Data Split

The dataset was not split into training, validation, and test sets, since separate datasets were used for training.

#### Techniques Used

- **Tokenization:** Used the T5 tokenizer to convert text into input IDs with padding and truncation.
- **Evaluation Strategy:** Evaluated the model with ROUGE scores at regular intervals during training.
- **Early Stopping:** Used to prevent overfitting.
- **Mixed Precision Training:** Enabled via `fp16` for faster training and reduced memory usage.
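The early-stopping behavior can be illustrated with a plain-Python sketch (the class, its name, and the patience value are illustrative; in a transformers training loop this role is typically played by the library's `EarlyStoppingCallback`):

```python
class EarlyStopper:
    # Illustrative sketch of early-stopping logic: stop once the tracked
    # validation metric has failed to improve for `patience` evaluations.
    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_evals = 0

    def step(self, metric):
        # metric is the tracked validation score (e.g. ROUGE-L),
        # where higher is better; returns True when training should stop.
        if metric > self.best + self.min_delta:
            self.best = metric
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience
```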
---

### Training Arguments

- `output_dir="./t5-headline-generator"`
- `per_device_train_batch_size=8`
- `per_device_eval_batch_size=8`
- `evaluation_strategy="steps"`
- `save_total_limit=2`
- `push_to_hub=True`
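Put together, the hyperparameter and training-argument lists above roughly correspond to the following `Seq2SeqTrainingArguments` (a sketch; any argument not listed above is assumed to keep its default):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./t5-headline-generator",
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    learning_rate=2e-5,
    num_train_epochs=7,
    gradient_accumulation_steps=2,
    evaluation_strategy="steps",
    save_total_limit=2,
    fp16=True,  # mixed precision training, as noted above
    push_to_hub=True,
)
```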
---

### Future Work

- Experiment with larger variants of the T5 model to improve performance.
- Explore data augmentation to enhance the dataset.

---

### Conclusion

This project successfully fine-tuned a T5 model to generate captions from news articles. The T5-base model demonstrated strong performance, producing more concise and contextually relevant summaries than the Qwen-2.5 0.5B model. The project lays the foundation for further improvements and potential deployment.

---

### Live App Demonstration

The app allows users to input a news article and receive a generated headline. It is hosted on Hugging Face Spaces and can be accessed via the following link:

[**Darshan03/t5-model-small**](https://huggingface.co/spaces/Darshan03/LogicLoom-app)
""")