Upload folder using huggingface_hub
- .gitignore +28 -0
- Dockerfile +19 -0
- README.md +103 -8
- app/app.py +85 -0
- app/static/script.js +32 -0
- app/static/style.css +100 -0
- app/templates/index.html +28 -0
- requirements.txt +6 -0
- test-cases.md +508 -0
.gitignore
ADDED
@@ -0,0 +1,28 @@
+# Virtual environments
+.venv/
+venv/
+env/
+
+# Python bytecode
+__pycache__/
+*.pyc
+*.pyo
+
+# Jupyter
+.ipynb_checkpoints/
+
+# Environment variables
+.env
+
+# Model weights (too large for GitHub – load from Hugging Face Hub)
+final_model/
+
+# OS files
+.DS_Store
+Thumbs.db
+
+# IDE / editor
+.vscode/
+.idea/
+*.swp
+*.swo
Dockerfile
ADDED
@@ -0,0 +1,19 @@
+FROM python:3.9-slim
+
+# Set working directory
+WORKDIR /code
+
+# Copy requirements first for better caching
+COPY ./requirements.txt /code/requirements.txt
+
+# Install dependencies
+RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
+
+# Copy the app code
+COPY ./app /code/app
+
+# Set environment variables
+ENV MODEL_ID="unnat17/Text-Summarizer"
+
+# HF Spaces run on port 7860 by default
+CMD ["uvicorn", "app.app:app", "--host", "0.0.0.0", "--port", "7860"]
README.md
CHANGED
@@ -1,10 +1,105 @@
-
-
-
-
-
-
-
 ---

-
+# Text Summarizer
+
+## Project Overview
+
+A full-stack web application that generates concise summaries from user-provided text. The backend uses a T5-small transformer model served through a FastAPI endpoint, while the frontend provides a minimal interface for text input and summary display.
+
+## Features
+
+- Text input via a web-based interface
+- Server-side summarization using a T5-small model with beam search decoding
+- Automatic device selection (CUDA, MPS, or CPU)
+- Input preprocessing: whitespace normalization, HTML tag removal, case folding
+- Asynchronous request handling via FastAPI
+
+## Tech Stack
+
+| Layer      | Technology                         |
+|------------|------------------------------------|
+| Backend    | Python, FastAPI, Uvicorn           |
+| Frontend   | HTML, CSS, JavaScript              |
+| ML         | Hugging Face Transformers, PyTorch |
+| Model      | T5-small                           |
+| Templating | Jinja2                             |
+
+## Model Details
+
+- **Architecture:** T5-small (Text-to-Text Transfer Transformer)
+- **Input format:** Raw text prefixed with `summarize:`
+- **Tokenization:** `T5Tokenizer` with padding and truncation at 512 tokens
+- **Decoding:** Beam search (`num_beams=4`) with `max_length=150` and early stopping
+- **Inference:** Runs under `torch.no_grad()` for reduced memory usage
+
+## Project Structure
+
+```
+text-summarizer/
+├── app/
+│   ├── app.py              # FastAPI application and inference logic
+│   ├── static/
+│   │   ├── style.css       # Frontend styles
+│   │   └── script.js       # Client-side form handling and API calls
+│   └── templates/
+│       └── index.html      # Main UI template (Jinja2)
+├── requirements.txt        # Python dependencies
+├── test-cases.md           # Test scenarios and expected outputs
+└── .gitignore
+```
+
+## Installation and Setup
+
+**1. Clone the repository**
+
+```bash
+git clone https://github.com/unnat-git/Text-Summarizer.git
+cd Text-Summarizer
+```
+
+**2. Create and activate a virtual environment**
+
+```bash
+python -m venv venv
+source venv/bin/activate   # Linux/macOS
+venv\Scripts\activate      # Windows
+```
+
+**3. Install dependencies**
+
+```bash
+pip install -r requirements.txt
+```
+
+**4. Run the application**
+
+```bash
+uvicorn app.app:app --reload
+```
+
+The application will be available at `http://127.0.0.1:8000`.
+
+## Usage
+
+1. Open the application in a browser.
+2. Enter or paste text into the input field.
+3. Click **Summarize**.
+4. The generated summary appears below the input area.
+
+The `/summarize` endpoint also accepts direct POST requests:
+
+```bash
+curl -X POST http://127.0.0.1:8000/summarize \
+  -H "Content-Type: application/json" \
+  -d '{ "dialogue": "Your text here." }'
+```
+
+## Future Improvements
+
+- Support for additional input formats (PDF, URL extraction)
+- Adjustable summary length parameter
+- Response caching for repeated inputs
+- Containerized deployment (Docker)
+- Hosting on a cloud platform (Render, Railway)
+
 ---

+Built with ❤️ as a portfolio project showcasing NLP, deep learning, and full-stack Python development.
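The README lists three preprocessing steps: whitespace normalization, HTML tag removal, and case folding. A minimal sketch of those steps using only the standard library is given here. The function name `clean_text_reordered` and the two compiled patterns are illustrative and not part of the commit. Note that `clean_text` in `app/app.py` collapses whitespace before it strips tags, which can leave doubled spaces where a tag used to be; the sketch strips tags first to avoid that.

```python
import re

# Hypothetical helper names; the committed code uses inline re.sub calls.
_TAG_RE = re.compile(r"<[^>]*>")   # anything that looks like an HTML tag
_WS_RE = re.compile(r"\s+")        # runs of whitespace, including \r\n


def clean_text_reordered(text: str) -> str:
    """Strip tags, then collapse whitespace, then lowercase."""
    text = _TAG_RE.sub(" ", text)  # replace each tag with a single space
    text = _WS_RE.sub(" ", text)   # collapse the spaces the tags left behind
    return text.strip().lower()
```

For an input such as `"Hello <b>World</b>\r\n  again"` this yields `hello world again`, with single spaces throughout.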
app/app.py
ADDED
@@ -0,0 +1,85 @@
+import os
+import re
+
+import torch
+from fastapi import FastAPI, Request
+from fastapi.responses import HTMLResponse
+from fastapi.staticfiles import StaticFiles
+from fastapi.templating import Jinja2Templates
+from pydantic import BaseModel
+from transformers import T5ForConditionalGeneration, T5Tokenizer
+
+BASE_DIR = os.path.dirname(os.path.abspath(__file__))
+
+app = FastAPI(
+    title="Text Summarizer App",
+    description="Dialogue summarization powered by a fine-tuned T5 model",
+    version="1.0",
+)
+
+templates = Jinja2Templates(directory=os.path.join(BASE_DIR, "templates"))
+app.mount(
+    "/static",
+    StaticFiles(directory=os.path.join(BASE_DIR, "static")),
+    name="static",
+)
+
+MODEL_ID = os.getenv("MODEL_ID", "unnat17/Text-Summarizer")
+
+model = T5ForConditionalGeneration.from_pretrained(MODEL_ID)
+tokenizer = T5Tokenizer.from_pretrained(MODEL_ID)
+
+if torch.cuda.is_available():
+    device = torch.device("cuda")
+elif torch.backends.mps.is_available():
+    device = torch.device("mps")
+else:
+    device = torch.device("cpu")
+
+model.to(device)
+model.eval()
+
+
+class DialogueInput(BaseModel):
+    dialogue: str
+
+
+def clean_text(text: str) -> str:
+    text = re.sub(r"\r\n", " ", text)
+    text = re.sub(r"\s+", " ", text)
+    text = re.sub(r"<.*?>", " ", text)
+    return text.strip().lower()
+
+
+def summarize_dialogue(dialogue: str) -> str:
+    dialogue = "summarize: " + clean_text(dialogue)
+
+    inputs = tokenizer(
+        dialogue,
+        padding="max_length",
+        max_length=512,
+        truncation=True,
+        return_tensors="pt",
+    ).to(device)
+
+    with torch.no_grad():
+        output_ids = model.generate(
+            input_ids=inputs["input_ids"],
+            attention_mask=inputs["attention_mask"],
+            max_length=150,
+            num_beams=4,
+            early_stopping=True,
+        )
+
+    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
+
+
+@app.get("/", response_class=HTMLResponse)
+async def home(request: Request):
+    return templates.TemplateResponse(request=request, name="index.html")
+
+
+@app.post("/summarize")
+async def summarize(dialogue_input: DialogueInput):
+    summary = summarize_dialogue(dialogue_input.dialogue)
+    return {"summary": summary}
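The `/summarize` handler in `app/app.py` is declared `async def` but calls the synchronous `summarize_dialogue` directly, so the event loop is likely blocked while the model generates. One possible way around this is sketched below with the standard library only. `blocking_summarize` is a stand-in that sleeps instead of running a model, and `summarize_handler` and `demo` are hypothetical names. `asyncio.to_thread` requires Python 3.9 or later, which matches the `python:3.9-slim` base image in the Dockerfile.

```python
import asyncio
import time
from typing import Dict, List


def blocking_summarize(text: str) -> str:
    """Stand-in for summarize_dialogue(); sleeps to mimic slow inference."""
    time.sleep(0.2)
    return text[:20]


async def summarize_handler(text: str) -> Dict[str, str]:
    # Run the blocking call in a worker thread so the event loop can keep
    # accepting other requests while this one is being processed.
    summary = await asyncio.to_thread(blocking_summarize, text)
    return {"summary": summary}


async def demo() -> List[Dict[str, str]]:
    # Three requests issued together; gather() keeps results in call order.
    calls = (summarize_handler(f"request {i}") for i in range(3))
    return await asyncio.gather(*calls)
```

Calling `asyncio.run(demo())` returns the three summaries in order. In FastAPI specifically, declaring the endpoint with plain `def` instead of `async def` has a similar effect, since FastAPI runs such handlers in a thread pool.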
app/static/script.js
ADDED
@@ -0,0 +1,32 @@
+document.getElementById("summarization-form").addEventListener("submit", async (e) => {
+    e.preventDefault();
+
+    const dialogueInput = document.getElementById("dialogue-input");
+    const summaryText = document.getElementById("summary-text");
+    const submitButton = document.getElementById("summarize-btn");
+
+    const dialogue = dialogueInput.value.trim();
+    if (!dialogue) return;
+
+    summaryText.innerText = "Processing...";
+    submitButton.disabled = true;
+
+    try {
+        const response = await fetch("/summarize", {
+            method: "POST",
+            headers: { "Content-Type": "application/json" },
+            body: JSON.stringify({ dialogue }),
+        });
+
+        if (!response.ok) {
+            throw new Error(`Server error: ${response.status}`);
+        }
+
+        const data = await response.json();
+        summaryText.innerText = data.summary || "No summary returned.";
+    } catch (err) {
+        summaryText.innerText = `Error: ${err.message}`;
+    } finally {
+        submitButton.disabled = false;
+    }
+});
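The same request that `script.js` sends with `fetch` can be issued from Python, which may be handy for scripted checks against a running instance. The sketch below uses only the standard library. The function names `build_summarize_request` and `summarize_remote` are illustrative, and the default base URL assumes the local `uvicorn` address from the README (a deployed Space would listen elsewhere).

```python
import json
import urllib.request

# Assumption: the server from the README is running locally on port 8000.
DEFAULT_BASE_URL = "http://127.0.0.1:8000"


def build_summarize_request(text: str, base_url: str = DEFAULT_BASE_URL) -> urllib.request.Request:
    """Build the POST request with the JSON body the endpoint expects."""
    payload = json.dumps({"dialogue": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/summarize",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_remote(text: str, base_url: str = DEFAULT_BASE_URL, timeout: float = 60.0) -> str:
    """Send the request and return the 'summary' field of the JSON reply."""
    request = build_summarize_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["summary"]
```

Building the request separately from sending it keeps the payload shape checkable without a live server.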
app/static/style.css
ADDED
@@ -0,0 +1,100 @@
+* {
+    margin: 0;
+    padding: 0;
+    box-sizing: border-box;
+}
+
+body {
+    font-family: "Segoe UI", Roboto, sans-serif;
+    height: 100vh;
+    display: flex;
+    justify-content: center;
+    align-items: center;
+    background: linear-gradient(135deg, #1e293b, #0f172a);
+    color: #e2e8f0;
+}
+
+.container {
+    width: 100%;
+    max-width: 650px;
+    padding: 30px;
+    border-radius: 16px;
+    background: rgba(255, 255, 255, 0.05);
+    backdrop-filter: blur(12px);
+    -webkit-backdrop-filter: blur(12px);
+    box-shadow: 0 8px 32px rgba(0, 0, 0, 0.4);
+    border: 1px solid rgba(255, 255, 255, 0.1);
+}
+
+h1 {
+    text-align: center;
+    font-size: 2rem;
+    margin-bottom: 5px;
+    background: linear-gradient(90deg, #fb923c, #f97316);
+    -webkit-background-clip: text;
+    -webkit-text-fill-color: transparent;
+}
+
+h3 {
+    text-align: center;
+    margin-bottom: 30px;
+    font-weight: 400;
+    color: #cbd5f5;
+}
+
+form {
+    padding: 5px;
+    display: flex;
+    flex-direction: column;
+    gap: 15px;
+}
+
+textarea {
+    height: 160px;
+    padding: 12px;
+    border-radius: 10px;
+    border: none;
+    outline: none;
+    font-size: 15px;
+    background: rgba(255, 255, 255, 0.08);
+    color: #fff;
+    resize: none;
+    transition: 0.3s ease;
+}
+
+textarea:focus {
+    box-shadow: 0 0 0 2px #fb923c;
+}
+
+button {
+    padding: 12px;
+    border-radius: 10px;
+    border: none;
+    font-size: 16px;
+    cursor: pointer;
+    background: linear-gradient(135deg, #fb923c, #f97316);
+    color: white;
+    font-weight: 600;
+    transition: all 0.3s ease;
+    box-shadow: 0 4px 15px rgba(251, 146, 60, 0.4);
+}
+
+button:hover {
+    transform: translateY(-2px) scale(1.02);
+    box-shadow: 0 6px 25px rgba(249, 115, 22, 0.6);
+}
+
+#summary-output {
+    margin-top: 20px;
+    padding: 15px;
+    border-radius: 10px;
+    background: rgba(251, 146, 60, 0.08);
+    border: 1px solid rgba(251, 146, 60, 0.3);
+    backdrop-filter: blur(6px);
+}
+
+#summary-heading {
+    margin-bottom: 10px;
+    font-weight: 600;
+    color: #fb923c;
+}
app/templates/index.html
ADDED
@@ -0,0 +1,28 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <meta name="description" content="Summarize dialogue and conversations instantly using a fine-tuned T5 transformer model.">
+    <title>Text Summarizer – T5 + FastAPI</title>
+    <link rel="stylesheet" href="/static/style.css">
+</head>
+<body>
+    <div class="container">
+        <h1>Text Summarizer</h1>
+        <h3>using HuggingFace Transformer</h3>
+        <p>Enter or paste your dialogue below for quick summarization.</p>
+
+        <form id="summarization-form">
+            <textarea id="dialogue-input" placeholder="Enter your dialogue here..." required></textarea>
+            <button type="submit" id="summarize-btn">Summarize</button>
+        </form>
+
+        <div id="summary-output">
+            <h3 id="summary-heading">Content Summary</h3>
+            <p id="summary-text"></p>
+        </div>
+    </div>
+    <script src="/static/script.js"></script>
+</body>
+</html>
requirements.txt
ADDED
@@ -0,0 +1,6 @@
+fastapi
+uvicorn[standard]
+transformers
+torch
+pydantic
+jinja2
test-cases.md
ADDED
@@ -0,0 +1,508 @@
+# 🧠 Dialogue Summarization – Test Cases
+
+**15 Multi-Speaker Realistic Conversations for NLP Testing**
+
+Each test case contains a multi-speaker dialogue and an **expected summary** (approximate – model output may vary in phrasing but should capture the same key points).
+
+---
+
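Because the expected summaries are approximate, an exact string comparison against model output would fail almost every case. A rough word-overlap score is one way to check whether the key points survive. The sketch below computes a unigram F1 in the spirit of ROUGE-1; it is not the official ROUGE implementation (no stemming or stop-word handling), and the names `tokens` and `unigram_f1` are illustrative. What score counts as a pass is left open here and would need tuning against real outputs.

```python
import re
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def unigram_f1(expected: str, produced: str) -> float:
    """F1 over overlapping word counts between expected and produced text."""
    expected_counts = Counter(tokens(expected))
    produced_counts = Counter(tokens(produced))
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((expected_counts & produced_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(produced_counts.values())
    recall = overlap / sum(expected_counts.values())
    return 2 * precision * recall / (precision + recall)
```

Identical texts score 1.0 and texts with no shared words score 0.0, so each test case's expected summary can be compared with the `summary` field returned by `/summarize`.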
+## 🧪 Test Case 1 – Sports: The Glory of Virat Kohli in Test Cricket
+
+**Input Dialogue:**
+
+> **Arjun:** Sometimes I feel like we didn't fully appreciate Kohli's dominance in Test cricket while it was happening. Especially that 2016–2019 phase – it was unreal.
+>
+> **Meera:** Absolutely. The way he transformed himself overseas was the most impressive part. Before him, we always had that narrative that Indian batsmen struggled in places like Australia and England.
+>
+> **Rohit:** And he didn't just survive there – he dominated. The 2018 Australia tour alone is enough to define a career. Four centuries in a single series, and every one of them under pressure.
+>
+> **Priya:** What stood out to me was his intensity. He treated every Test match like it was a World Cup final. That kind of mindset changed how the entire Indian team approached red-ball cricket.
+>
+> **Arjun:** Exactly. Under his captaincy, India became a fast-bowling powerhouse. Winning series in Australia, competing aggressively in England – that wasn't common before his era.
+>
+> **Meera:** And Virat's fitness revolution. He set a completely new benchmark. Suddenly, being supremely fit wasn't optional anymore – it was expected.
+>
+> **Rohit:** His conversion rate was insane too. Once he crossed fifty, you just knew a hundred was coming. That hunger for big scores is what separated him from others.
+>
+> **Priya:** Plus, the way he handled pressure. Whether it was a collapse around him or a tough pitch, he seemed to thrive in those situations rather than shrink.
+>
+> **Arjun:** I still think Virat's 149 at Edgbaston in 2018 is one of the greatest Test innings by an Indian. That was pure skill and determination.
+>
+> **Meera:** And the Adelaide 2014 innings – announcing himself as a fearless Test captain right from the start. He wasn't playing for a draw, he was playing to win.
+>
+> **Rohit:** That's the legacy. He didn't just score runs – he changed the attitude of Indian Test cricket. Made aggression and belief the default.
+>
+> **Priya:** Which is why even years later, when we talk about modern Test greats, his name is always right at the top.
+
+**Expected Summary:**
+Arjun, Meera, Rohit, and Priya discuss Virat Kohli's dominance in Test cricket, highlighting his overseas performance, fitness revolution, captaincy that built India's fast-bowling attack, and his legacy of changing Indian cricket's aggressive mindset.
+
+---
+
+## 🧪 Test Case 2 – Entertainment: OTT Platforms and the Death of Cinema Culture
+
+**Input Dialogue:**
+
+> **Sana:** I genuinely cannot remember the last time I went to a multiplex.
+>
+> **Dev:** Same. Why leave my couch for overpriced popcorn?
+>
+> **Lena:** You're missing the point. Cinema is about shared experience.
+>
+> **Marcus:** Watching a kid react in a theater hits differently.
+>
+> **Sana:** New generation doesn't care about that.
+>
+> **Dev:** Studios know it too – theatrical windows are shrinking.
+>
+> **Lena:** We're losing mid-budget films entirely.
+>
+> **Marcus:** That's the real tragedy.
+>
+> **Dev:** But accessibility is better now.
+>
+> **Sana:** Still, something is lost in scrolling culture.
+>
+> **Lena:** Algorithms decide visibility now – that's dangerous.
+
+**Expected Summary:**
+Sana, Dev, Lena, and Marcus debate whether OTT platforms are killing cinema culture. While convenience and accessibility have improved, they agree that mid-budget films are disappearing and the shared theater experience is being lost to algorithm-driven scrolling.
+
+---
+
+## 🧪 Test Case 3 – Technology: AI Replacing Jobs vs Creating Opportunities
+
+**Input Dialogue:**
+
+> **Nadia:** My design team got replaced by AI. Twelve people gone.
+>
+> **Chris:** Companies were already planning cuts – AI just accelerated it.
+>
+> **James:** Doesn't matter. People still lost jobs.
+>
+> **Priya:** In research, AI actually created more jobs.
+>
+> **Nadia:** That's a class divide issue.
+>
+> **Chris:** Speed of change is the real problem.
+>
+> **James:** Retraining narrative is unrealistic.
+>
+> **Priya:** Policy response is missing.
+>
+> **Nadia:** Regulation isn't keeping up.
+>
+> **Chris:** Tech moves faster than law.
+>
+> **James:** UBI might be inevitable.
+>
+> **Priya:** Market won't solve this alone.
+
+**Expected Summary:**
+Nadia, Chris, James, and Priya discuss AI-driven job displacement. While AI creates some new roles, the speed of change, lack of policy response, and inadequate retraining programs make the transition painful, leading them to consider UBI and regulation as necessary interventions.
+
+---
+
+## 🧪 Test Case 4 – World Politics: Multipolar World Order
+
+**Input Dialogue:**
+
+> **Elena:** G7 feels symbolic now.
+>
+> **Omar:** Still holds institutional power.
+>
+> **Yuki:** But enforcement is weakening.
+>
+> **Elena:** Countries are adapting to sanctions.
+>
+> **Omar:** Dollar dominance still matters.
+>
+> **Carlos:** BRICS is growing.
+>
+> **Yuki:** World is becoming multipolar.
+>
+> **Elena:** Alliances are transactional now.
+>
+> **Omar:** That might increase flexibility.
+>
+> **Carlos:** Or increase conflict risk.
+>
+> **Yuki:** Nuclear deterrence changes everything.
+
+**Expected Summary:**
+Elena, Omar, Yuki, and Carlos discuss the shift from a unipolar to a multipolar world order. While the G7 retains institutional power and dollar dominance persists, BRICS growth, weakening enforcement, and transactional alliances signal a fundamental geopolitical transformation.
+
+---
+
+## 🧪 Test Case 5 – Social Issues: Social Media & Teen Mental Health
+
+**Input Dialogue:**
+
+> **Dr. Anita:** Social media is now central to teen mental health issues.
+>
+> **Ravi:** Removing Instagram helped my daughter.
+>
+> **Sophia:** Even non-users feel excluded.
+>
+> **Dr. Anita:** Passive scrolling is most harmful.
+>
+> **Marcus:** Other factors also matter.
+>
+> **Ravi:** Social media amplifies everything.
+>
+> **Sophia:** Popularity is now quantified.
+>
+> **Dr. Anita:** That has clinical impact.
+>
+> **Marcus:** Solutions are unclear.
+>
+> **Sophia:** Platforms need redesign.
+>
+> **Ravi:** They won't self-regulate.
+>
+> **Dr. Anita:** Regulation is necessary.
+
+**Expected Summary:**
+Dr. Anita, Ravi, Sophia, and Marcus discuss the impact of social media on teen mental health. Passive scrolling and quantified popularity cause clinical harm. They conclude that platforms won't self-regulate and external regulation is necessary.
+
+---
+
+## 🧪 Test Case 6 – Business: Startup Funding Winter
+
+**Input Dialogue:**
+
+> **Tara:** Couldn't raise Series C.
+>
+> **Joel:** VC expectations changed.
+>
+> **Zara:** That's healthy correction.
+>
+> **Tara:** It's existential for founders.
+>
+> **Joel:** What's your burn rate?
+>
+> **Tara:** We're cutting costs.
+>
+> **Zara:** Consider acquisition.
+>
+> **Tara:** Not viable yet.
+>
+> **Joel:** Bridge funding exists.
+>
+> **Tara:** Might pivot to enterprise.
+>
+> **Zara:** That's where money is.
+>
+> **Joel:** Timing is everything.
+
+**Expected Summary:**
+Tara struggles to raise Series C funding amid a VC correction. Joel and Zara suggest bridge funding, acquisition, or pivoting to enterprise. They acknowledge the funding winter is existential for founders but represents a healthy market correction.
+
+---
+
+## 🧪 Test Case 7 – Casual: Europe Trip Planning
+
+**Input Dialogue:**
+
+> **Nina:** Interrail isn't too expensive.
+>
+> **Sam:** But I need comfort now.
+>
+> **Lucia:** Hostels are traumatic.
+>
+> **Ben:** That's the fun part.
+>
+> **Nina:** Focus – where are we going?
+>
+> **Lucia:** Lisbon to Italy route.
+>
+> **Ben:** Can't skip Paris.
+>
+> **Nina:** Paris is overrated now.
+>
+> **Sam:** What about Croatia?
+>
+> **Lucia:** Adds time.
+>
+> **Ben:** Cut something else.
+>
+> **Nina:** Let's just start somewhere.
+>
+> **Sam:** Lisbon confirmed.
+
+**Expected Summary:**
+Nina, Sam, Lucia, and Ben plan a Europe trip via Interrail. After debating comfort, hostels, and whether to include Paris or Croatia, they settle on starting from Lisbon with a route toward Italy.
+
+---
+
+## 🧪 Test Case 8 – Workplace: Remote Work Debate
+
+**Input Dialogue:**
+
+> **David:** Return-to-office killed morale.
+>
+> **Amara:** What's leadership saying?
+>
+> **David:** "Culture" – but it's control.
+>
+> **Kevin:** Hybrid has value though.
+>
+> **Amara:** Needs better design, not mandates.
+>
+> **David:** Productivity was fine remotely.
+>
+> **Priya:** Data is mixed.
+>
+> **Amara:** Real estate plays a role.
+>
+> **Kevin:** Possibly.
+>
+> **Priya:** Impacts parents heavily.
+>
+> **David:** We lost talent because of it.
+
+**Expected Summary:**
+David, Amara, Kevin, and Priya debate return-to-office mandates. While leadership cites "culture," the group argues it's about control. Remote productivity was adequate, hybrid needs thoughtful design, and rigid mandates disproportionately impact parents and cause talent loss.
+
+---
+
| 269 |
+
## π§ͺ Test Case 9 β AI: Open Source vs Proprietary Models
|
| 270 |
+
|
| 271 |
+
**Input Dialogue:**
|
| 272 |
+
|
| 273 |
+
> **Felix:** Open-source models changed everything.
|
| 274 |
+
>
|
| 275 |
+
> **Hana:** Benchmarks aren't everything.
|
| 276 |
+
>
|
| 277 |
+
> **Tom:** Privacy drives adoption.
|
| 278 |
+
>
|
| 279 |
+
> **Felix:** On-premise is key.
|
| 280 |
+
>
|
| 281 |
+
> **Hana:** Security risks exist.
|
| 282 |
+
>
|
| 283 |
+
> **Tom:** Same as any open-source system.
|
| 284 |
+
>
|
| 285 |
+
> **Hana:** But less mature ecosystem.
|
| 286 |
+
>
|
| 287 |
+
> **Felix:** Community is fast-growing.
|
| 288 |
+
>
|
| 289 |
+
> **Raj:** Power concentration is risky.
|
| 290 |
+
>
|
| 291 |
+
> **Hana:** Compute still centralized.
|
| 292 |
+
>
|
| 293 |
+
> **Tom:** Still better than nothing.
|
| 294 |
+
|
| 295 |
+
**Expected Summary:**
|
| 296 |
+
Felix, Hana, Tom, and Raj discuss open-source vs proprietary AI models. Open-source offers privacy and on-premise deployment, but faces security concerns and a less mature ecosystem. Raj warns that power concentration is risky, and Hana notes that compute remains centralized.
|
| 297 |
+
|
| 298 |
+
---
|
| 299 |
+
|
| 300 |
+
## 🧪 Test Case 10 - Social: Redefining Success
|
| 301 |
+
|
| 302 |
+
**Input Dialogue:**
|
| 303 |
+
|
| 304 |
+
> **Maya:** I rejected a promotion.
|
| 305 |
+
>
|
| 306 |
+
> **Jordan:** Old generation wouldn't.
|
| 307 |
+
>
|
| 308 |
+
> **Aisha:** Burnout isn't success.
|
| 309 |
+
>
|
| 310 |
+
> **Maya:** Saw it happen to my father.
|
| 311 |
+
>
|
| 312 |
+
> **Dev:** Privilege matters here.
|
| 313 |
+
>
|
| 314 |
+
> **Jordan:** True.
|
| 315 |
+
>
|
| 316 |
+
> **Maya:** But mindset is shifting.
|
| 317 |
+
>
|
| 318 |
+
> **Aisha:** Pandemic changed priorities.
|
| 319 |
+
>
|
| 320 |
+
> **Dev:** Economics still constrain choices.
|
| 321 |
+
>
|
| 322 |
+
> **Jordan:** Meaningful work matters now.
|
| 323 |
+
>
|
| 324 |
+
> **Maya:** Not sacrificing life for work.
|
| 325 |
+
>
|
| 326 |
+
> **Aisha:** System still ties identity to jobs.
|
| 327 |
+
|
| 328 |
+
**Expected Summary:**
|
| 329 |
+
Maya, Jordan, Aisha, and Dev discuss redefining success beyond career advancement. Maya rejected a promotion to avoid burnout. While privilege affects who can make such choices, the pandemic shifted priorities toward meaningful work and work-life balance.
|
| 330 |
+
|
| 331 |
+
---
|
| 332 |
+
|
| 333 |
+
## 🧪 Test Case 11 - Sports: Women's Football Growth
|
| 334 |
+
|
| 335 |
+
**Input Dialogue:**
|
| 336 |
+
|
| 337 |
+
> **Fatima:** Viewership is massive now.
|
| 338 |
+
>
|
| 339 |
+
> **Chris:** Quality improved significantly.
|
| 340 |
+
>
|
| 341 |
+
> **Sara:** Infrastructure caught up.
|
| 342 |
+
>
|
| 343 |
+
> **Fatima:** Spain is dominant.
|
| 344 |
+
>
|
| 345 |
+
> **Mike:** Is dominance good?
|
| 346 |
+
>
|
| 347 |
+
> **Chris:** It builds narratives.
|
| 348 |
+
>
|
| 349 |
+
> **Sara:** US transition is interesting.
|
| 350 |
+
>
|
| 351 |
+
> **Fatima:** New stars emerging.
|
| 352 |
+
>
|
| 353 |
+
> **Mike:** Star power expands audience.
|
| 354 |
+
>
|
| 355 |
+
> **Chris:** Pay gap still exists.
|
| 356 |
+
>
|
| 357 |
+
> **Sara:** Access is unequal.
|
| 358 |
+
>
|
| 359 |
+
> **Fatima:** Academies must improve.
|
| 360 |
+
|
| 361 |
+
**Expected Summary:**
|
| 362 |
+
Fatima, Chris, Sara, and Mike discuss the growth of women's football. Viewership and quality have surged, Spain is dominant, and new stars are emerging. However, the pay gap persists and academy-level access remains unequal.
|
| 363 |
+
|
| 364 |
+
---
|
| 365 |
+
|
| 366 |
+
## 🧪 Test Case 12 - Climate: Individual vs Systemic Action
|
| 367 |
+
|
| 368 |
+
**Input Dialogue:**
|
| 369 |
+
|
| 370 |
+
> **Elena:** Do personal actions even matter?
|
| 371 |
+
>
|
| 372 |
+
> **Reem:** They shift norms.
|
| 373 |
+
>
|
| 374 |
+
> **Paul:** Carbon footprint idea was PR.
|
| 375 |
+
>
|
| 376 |
+
> **Elena:** Still not meaningless.
|
| 377 |
+
>
|
| 378 |
+
> **Reem:** Policy matters more.
|
| 379 |
+
>
|
| 380 |
+
> **Tomas:** Political will is key.
|
| 381 |
+
>
|
| 382 |
+
> **Paul:** Focus energy wisely.
|
| 383 |
+
>
|
| 384 |
+
> **Elena:** Both personal and political matter.
|
| 385 |
+
>
|
| 386 |
+
> **Reem:** It's about balance.
|
| 387 |
+
>
|
| 388 |
+
> **Tomas:** Renewables are scaling fast.
|
| 389 |
+
>
|
| 390 |
+
> **Paul:** But inequality persists.
|
| 391 |
+
>
|
| 392 |
+
> **Elena:** Climate justice matters.
|
| 393 |
+
|
| 394 |
+
**Expected Summary:**
|
| 395 |
+
Elena, Reem, Paul, and Tomas debate individual vs systemic climate action. While personal actions shift cultural norms, policy and political will are more impactful. They conclude that both approaches are needed: renewables are scaling fast, but inequality persists and climate justice matters.
|
| 396 |
+
|
| 397 |
+
---
|
| 398 |
+
|
| 399 |
+
## 🧪 Test Case 13 - Ethics: Deepfakes and Digital Safety
|
| 400 |
+
|
| 401 |
+
**Input Dialogue:**
|
| 402 |
+
|
| 403 |
+
> **Leah:** My colleague was deepfaked.
|
| 404 |
+
>
|
| 405 |
+
> **Marcus:** Detection is slow.
|
| 406 |
+
>
|
| 407 |
+
> **Soo:** Jurisdiction is messy.
|
| 408 |
+
>
|
| 409 |
+
> **Leah:** Platforms inconsistent.
|
| 410 |
+
>
|
| 411 |
+
> **Marcus:** There are good use cases too.
|
| 412 |
+
>
|
| 413 |
+
> **Soo:** Governance is lagging.
|
| 414 |
+
>
|
| 415 |
+
> **Theo:** What about watermarking?
|
| 416 |
+
>
|
| 417 |
+
> **Leah:** Bad actors won't comply.
|
| 418 |
+
>
|
| 419 |
+
> **Theo:** Make it mandatory.
|
| 420 |
+
>
|
| 421 |
+
> **Marcus:** Politically complex.
|
| 422 |
+
>
|
| 423 |
+
> **Soo:** Safety matters more.
|
| 424 |
+
>
|
| 425 |
+
> **Leah:** Damage is long-lasting.
|
| 426 |
+
|
| 427 |
+
**Expected Summary:**
|
| 428 |
+
Leah, Marcus, Soo, and Theo discuss deepfake threats after a colleague was targeted. Detection is slow, jurisdiction is messy, platforms respond inconsistently, and governance lags behind the technology. While watermarking is proposed, enforcement challenges remain and the long-lasting damage to victims is the core concern.
|
| 429 |
+
|
| 430 |
+
---
|
| 431 |
+
|
| 432 |
+
## 🧪 Test Case 14 - Travel: Over-Tourism and Local Impact
|
| 433 |
+
|
| 434 |
+
**Input Dialogue:**
|
| 435 |
+
|
| 436 |
+
> **Carmen:** My neighborhood is gone.
|
| 437 |
+
>
|
| 438 |
+
> **Finn:** Regulation can help.
|
| 439 |
+
>
|
| 440 |
+
> **Carmen:** Enforcement is weak.
|
| 441 |
+
>
|
| 442 |
+
> **Yuna:** Global issue now.
|
| 443 |
+
>
|
| 444 |
+
> **Finn:** Tourism brings money.
|
| 445 |
+
>
|
| 446 |
+
> **Carmen:** Locals are displaced.
|
| 447 |
+
>
|
| 448 |
+
> **Yuna:** Cultural loss is real.
|
| 449 |
+
>
|
| 450 |
+
> **Finn:** Authenticity is subjective.
|
| 451 |
+
>
|
| 452 |
+
> **Carmen:** It's about livability.
|
| 453 |
+
>
|
| 454 |
+
> **Yuna:** Visitor caps help.
|
| 455 |
+
>
|
| 456 |
+
> **Finn:** But elitist.
|
| 457 |
+
>
|
| 458 |
+
> **Carmen:** Current system isn't fair.
|
| 459 |
+
|
| 460 |
+
**Expected Summary:**
|
| 461 |
+
Carmen, Finn, and Yuna discuss over-tourism's impact on local communities. While tourism brings revenue, it displaces residents, erodes culture, and reduces livability. Visitor caps are suggested but debated as potentially elitist. The current system is seen as unfair to locals.
|
| 462 |
+
|
| 463 |
+
---
|
| 464 |
+
|
| 465 |
+
## 🧪 Test Case 15 - Mental Health in High-Pressure Professions
|
| 466 |
+
|
| 467 |
+
**Input Dialogue:**
|
| 468 |
+
|
| 469 |
+
> **Dr. James:** Doctors hide mental health issues.
|
| 470 |
+
>
|
| 471 |
+
> **Nora:** System reinforces stigma.
|
| 472 |
+
>
|
| 473 |
+
> **Sam:** Same in law.
|
| 474 |
+
>
|
| 475 |
+
> **Dr. James:** Wellness is performative.
|
| 476 |
+
>
|
| 477 |
+
> **Nora:** Can increase shame.
|
| 478 |
+
>
|
| 479 |
+
> **Maya:** Culture matters more.
|
| 480 |
+
>
|
| 481 |
+
> **Sam:** Not everyone can switch careers.
|
| 482 |
+
>
|
| 483 |
+
> **Dr. James:** Regulation needed.
|
| 484 |
+
>
|
| 485 |
+
> **Nora:** Some progress exists.
|
| 486 |
+
>
|
| 487 |
+
> **Maya:** Peer support helps.
|
| 488 |
+
>
|
| 489 |
+
> **Sam:** Colleagues are easier to talk to.
|
| 490 |
+
>
|
| 491 |
+
> **Dr. James:** Leadership drives change.
|
| 492 |
+
|
| 493 |
+
**Expected Summary:**
|
| 494 |
+
Dr. James, Nora, Sam, and Maya discuss mental health stigma in medicine and law. Wellness programs are often performative and can increase shame. They agree that peer support, cultural change led by leadership, and regulation are needed to make real progress.
|
| 495 |
+
|
| 496 |
+
---
|
| 497 |
+
|
| 498 |
+
# ✅ Usage Notes
|
| 499 |
+
|
| 500 |
+
- Designed for **dialogue summarization models** (fine-tuned T5, BART, etc.)
|
| 501 |
+
- Includes **multi-speaker complexity** across all test cases
|
| 502 |
+
- Covers **diverse domains**: sports, tech, politics, climate, entertainment, business, ethics, workplace, travel, social issues
|
| 503 |
+
- **Expected summaries** are approximate: model output should capture similar key points but may vary in phrasing
|
| 504 |
+
- Useful for:
|
| 505 |
+
- LLM evaluation and benchmarking
|
| 506 |
+
- NLP model comparison
|
| 507 |
+
- Prompt engineering experiments
|
| 508 |
+
- Regression testing during model updates
|
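Because the expected summaries are approximate, an exact string match is too strict for regression testing. One possible approach is a rough unigram-overlap F1 score (similar in spirit to ROUGE-1, but not the official metric). The sketch below is only an illustration: the function names `tokenize`, `unigram_f1`, and `passes`, and the `0.4` threshold, are assumptions and not part of this repository.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def unigram_f1(expected, generated):
    """Unigram-overlap F1 between an expected and a generated summary."""
    exp_counts = Counter(tokenize(expected))
    gen_counts = Counter(tokenize(generated))
    # Multiset intersection: each token counts at most as often as it
    # appears in both summaries.
    overlap = sum((exp_counts & gen_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(gen_counts.values())
    recall = overlap / sum(exp_counts.values())
    return 2 * precision * recall / (precision + recall)


def passes(expected, generated, threshold=0.4):
    """Hypothetical pass/fail check; the threshold should be tuned per model."""
    return unigram_f1(expected, generated) >= threshold
```

A score near 1.0 means the generated summary reuses most of the expected wording. Since phrasing may legitimately vary, a low score is better treated as a prompt for manual review than as a hard failure.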