import streamlit as st
from transformers import pipeline
import huggingface_hub
import os
# Load Hugging Face API key from environment variable
HF_API_KEY = os.getenv("HF_API_KEY")
# Login to Hugging Face Hub (skip if no key is configured, e.g. for public models)
if HF_API_KEY:
    huggingface_hub.login(HF_API_KEY)
# Load the fine-tuned model from Hugging Face
@st.cache_resource
def load_model():
    return pipeline("summarization", model="Darshan03/t5-model-small")
summarizer = load_model()
# Main app interface
st.title("T5 Headline Generator")
st.write("Generate concise headlines from news articles using a fine-tuned T5 model.")
# Input text box
article = st.text_area("Enter your news article below:", height=300)
if st.button("Generate Headline"):
    if article.strip():
        with st.spinner("Generating headline..."):
            headline = summarizer(article, max_length=42, min_length=5, do_sample=False)[0]['summary_text']
        st.success("Generated Headline:")
        st.write(headline)
    else:
        st.warning("Please enter a news article to generate a headline.")
# Sidebar documentation
st.sidebar.title("📄 Project Documentation")
st.sidebar.markdown("""
### Dataset Description, Pre-processing, and Exploratory Data Analysis (EDA)
The dataset used in this project comprises labeled news articles with corresponding captions. The columns in the dataset include "News Article" and "Caption." The dataset is pre-processed by:
- Renaming columns for consistency
- Dropping missing values
**Key Statistics:**
- Average length of news articles: 997.53 words
- Average length of captions: 42.29 words
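The pre-processing steps and length statistics above can be reproduced with a short pandas sketch (the original column names are used; the sample rows below are purely illustrative):

```python
import pandas as pd

# Illustrative rows only; the real dataset has the same two columns.
df = pd.DataFrame({
    "News Article": ["markets rallied today as tech stocks surged", None],
    "Caption": ["markets rally on tech surge", None],
})

# Pre-processing: rename columns for consistency, drop missing values.
df = df.rename(columns={"News Article": "article", "Caption": "caption"}).dropna()

# Key statistics: average length in words.
avg_article_len = df["article"].str.split().str.len().mean()
avg_caption_len = df["caption"].str.split().str.len().mean()
```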
---
### Model Selection
#### I. Qwen-2.5 0.5B Model
- **Tuning Approach:** Instruction Tuning
- **Insights:** The model yielded precise answers but often produced long summaries even after instruction tuning. This behavior was unsuitable for the task's objective of generating concise captions.
#### II. T5-base Model
- **Tuning Approach:** Normal Fine-tuning
- **Rationale:** The T5-base model was ultimately chosen due to its ability to generate more concise and contextually relevant captions.
---
### Hyperparameters and Tuning Process
- **Model Name:** `t5-base`
- **Max Token Length:** 256
- **Batch Size:** 8
- **Learning Rate:** 2e-5
- **Number of Training Epochs:** 7
- **Gradient Accumulation Steps:** 2
---
### Training Process
#### Data Split
The dataset was not split into training, validation, and test sets, because separate datasets were already available for training and evaluation.
#### Techniques Used
- **Tokenization:** Used the T5 tokenizer to convert text into input IDs with padding and truncation.
- **Evaluation Strategy:** Evaluated the model using ROUGE scores at regular intervals during training.
- **Early Stopping:** Used to prevent overfitting.
- **Mixed Precision Training:** Enabled via `fp16` for faster training and reduced memory usage.
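As a sketch, the tokenization step above might look like the following (the function name, column names, and 64-token target length are assumptions, not the project's exact code; the tokenizer is passed in so any T5 tokenizer works):

```python
# Hypothetical sketch of the tokenization step described above.
# `batch` is a dict with "News Article" and "Caption" lists, as in the dataset.
def preprocess(batch, tokenizer, max_input_len=256, max_target_len=64):
    # Prefix inputs with T5's standard summarization prompt, then
    # pad/truncate to the fixed max token length.
    model_inputs = tokenizer(
        ["summarize: " + text for text in batch["News Article"]],
        max_length=max_input_len,
        padding="max_length",
        truncation=True,
    )
    # Tokenize the target captions and attach them as labels.
    labels = tokenizer(
        batch["Caption"],
        max_length=max_target_len,
        padding="max_length",
        truncation=True,
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```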
---
### Training Arguments
- `output_dir="./t5-headline-generator"`
- `per_device_train_batch_size=8`
- `per_device_eval_batch_size=8`
- `evaluation_strategy="steps"`
- `save_total_limit=2`
- `push_to_hub=True`
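Putting the hyperparameters and training arguments above together, the trainer setup might look roughly like this (a sketch, not the project's exact script; `model`, the dataset variables, and the early-stopping patience value are assumptions):

```python
from transformers import (
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
    EarlyStoppingCallback,
)

training_args = Seq2SeqTrainingArguments(
    output_dir="./t5-headline-generator",
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    learning_rate=2e-5,
    num_train_epochs=7,
    gradient_accumulation_steps=2,
    evaluation_strategy="steps",
    save_total_limit=2,
    fp16=True,                    # mixed precision, as noted above
    load_best_model_at_end=True,  # required for early stopping
    push_to_hub=True,
)

trainer = Seq2SeqTrainer(
    model=model,                  # the t5-base model being fine-tuned (assumed)
    args=training_args,
    train_dataset=train_dataset,  # tokenized train/eval datasets (assumed)
    eval_dataset=eval_dataset,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
```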
---
### Future Work
- Experiment with larger variants of the T5 model to improve performance.
- Explore the use of data augmentation to enhance the dataset.
---
### Conclusion
This project successfully fine-tuned a T5 model for generating captions from news articles. The T5-base model demonstrated strong performance, providing more concise and contextually relevant summaries than the Qwen-2.5 0.5B model. The project lays the foundation for further improvements and potential deployment.
---
### Live App Demonstration
The developed app allows users to input a news article and receive a generated headline. The app is hosted on Hugging Face Hub and can be accessed via the following link:
[**Darshan03/LogicLoom-app**](https://huggingface.co/spaces/Darshan03/LogicLoom-app)
""")