unnat17 committed on
Commit 7e67197 · verified · 1 Parent(s): dafa3f9

Upload folder using huggingface_hub

Files changed (9)
  1. .gitignore +28 -0
  2. Dockerfile +19 -0
  3. README.md +103 -8
  4. app/app.py +85 -0
  5. app/static/script.js +32 -0
  6. app/static/style.css +100 -0
  7. app/templates/index.html +28 -0
  8. requirements.txt +6 -0
  9. test-cases.md +508 -0
.gitignore ADDED
@@ -0,0 +1,28 @@
+ # Virtual environments
+ .venv/
+ venv/
+ env/
+
+ # Python bytecode
+ __pycache__/
+ *.pyc
+ *.pyo
+
+ # Jupyter
+ .ipynb_checkpoints/
+
+ # Environment variables
+ .env
+
+ # Model weights (too large for GitHub; load from Hugging Face Hub)
+ final_model/
+
+ # OS files
+ .DS_Store
+ Thumbs.db
+
+ # IDE / editor
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
Dockerfile ADDED
@@ -0,0 +1,19 @@
+ FROM python:3.9-slim
+
+ # Set working directory
+ WORKDIR /code
+
+ # Copy requirements first for better caching
+ COPY ./requirements.txt /code/requirements.txt
+
+ # Install dependencies
+ RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
+
+ # Copy the app code
+ COPY ./app /code/app
+
+ # Set environment variables
+ ENV MODEL_ID="unnat17/Text-Summarizer"
+
+ # HF Spaces run on port 7860 by default
+ CMD ["uvicorn", "app.app:app", "--host", "0.0.0.0", "--port", "7860"]
README.md CHANGED
@@ -1,10 +1,105 @@
- ---
- title: Text Summarizer Demo
- emoji: 🏃
- colorFrom: purple
- colorTo: blue
- sdk: docker
- pinned: false
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Text Summarizer
+
+ ## Project Overview
+
+ A full-stack web application that generates concise summaries from user-provided text. The backend uses a T5-small transformer model served through a FastAPI endpoint, while the frontend provides a minimal interface for text input and summary display.
+
+ ## Features
+
+ - Text input via a web-based interface
+ - Server-side summarization using a T5-small model with beam search decoding
+ - Automatic device selection (CUDA, MPS, or CPU)
+ - Input preprocessing: whitespace normalization, HTML tag removal, case folding
+ - Asynchronous request handling via FastAPI
+
+ ## Tech Stack
+
+ | Layer | Technology |
+ |------------|------------------------------------|
+ | Backend | Python, FastAPI, Uvicorn |
+ | Frontend | HTML, CSS, JavaScript |
+ | ML | Hugging Face Transformers, PyTorch |
+ | Model | T5-small |
+ | Templating | Jinja2 |
+
+ ## Model Details
+
+ - **Architecture:** T5-small (Text-to-Text Transfer Transformer)
+ - **Input format:** Raw text prefixed with `summarize:`
+ - **Tokenization:** `T5Tokenizer` with padding and truncation at 512 tokens
+ - **Decoding:** Beam search (`num_beams=4`) with `max_length=150` and early stopping
+ - **Inference:** Runs under `torch.no_grad()` for reduced memory usage
+
+ ## Project Structure
+
+ ```
+ text-summarizer/
+ ├── app/
+ │   ├── app.py              # FastAPI application and inference logic
+ │   ├── static/
+ │   │   ├── style.css       # Frontend styles
+ │   │   └── script.js       # Client-side form handling and API calls
+ │   └── templates/
+ │       └── index.html      # Main UI template (Jinja2)
+ ├── requirements.txt        # Python dependencies
+ ├── test-cases.md           # Test scenarios and expected outputs
+ └── .gitignore
+ ```
+
+ ## Installation and Setup
+
+ **1. Clone the repository**
+
+ ```bash
+ git clone https://github.com/unnat-git/Text-Summarizer.git
+ cd Text-Summarizer
+ ```
+
+ **2. Create and activate a virtual environment**
+
+ ```bash
+ python -m venv venv
+ source venv/bin/activate  # Linux/macOS
+ venv\Scripts\activate  # Windows
+ ```
+
+ **3. Install dependencies**
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ **4. Run the application**
+
+ ```bash
+ uvicorn app.app:app --reload
+ ```
+
+ The application will be available at `http://127.0.0.1:8000`.
+
+ ## Usage
+
+ 1. Open the application in a browser.
+ 2. Enter or paste text into the input field.
+ 3. Click **Summarize**.
+ 4. The generated summary appears below the input area.
+
+ The `/summarize` endpoint also accepts direct POST requests:
+
+ ```bash
+ curl -X POST http://127.0.0.1:8000/summarize \
+   -H "Content-Type: application/json" \
+   -d '{ "dialogue": "Your text here." }'
+ ```
+
+ ## Future Improvements
+
+ - Support for additional input formats (PDF, URL extraction)
+ - Adjustable summary length parameter
+ - Response caching for repeated inputs
+ - Containerized deployment (Docker)
+ - Hosting on a cloud platform (Render, Railway)
+
  ---
 
+ Built with ❤️ as a portfolio project showcasing NLP, deep learning, and full-stack Python development.
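
The curl call in the README's Usage section can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_summarize_request` and `summarize_remote` and the default base URL are illustrative assumptions, not part of the repository. The sketch relies only on what this commit shows: `/summarize` accepts a POST with a JSON body containing a `dialogue` field and responds with a JSON object holding `summary`.

```python
import json
import urllib.request


def build_summarize_request(
    text: str, base_url: str = "http://127.0.0.1:8000"
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /summarize endpoint."""
    payload = json.dumps({"dialogue": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/summarize",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_remote(text: str, base_url: str = "http://127.0.0.1:8000") -> str:
    """Send the request and return the "summary" field of the JSON response."""
    request = build_summarize_request(text, base_url)
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["summary"]
```

Keeping request construction separate from sending makes the payload shape easy to inspect without a running server; `summarize_remote("Your text here.")` needs the app to be listening on the given base URL.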
app/app.py ADDED
@@ -0,0 +1,85 @@
+ import os
+ import re
+
+ import torch
+ from fastapi import FastAPI, Request
+ from fastapi.responses import HTMLResponse
+ from fastapi.staticfiles import StaticFiles
+ from fastapi.templating import Jinja2Templates
+ from pydantic import BaseModel
+ from transformers import T5ForConditionalGeneration, T5Tokenizer
+
+ BASE_DIR = os.path.dirname(os.path.abspath(__file__))
+
+ app = FastAPI(
+     title="Text Summarizer App",
+     description="Dialogue summarization powered by a fine-tuned T5 model",
+     version="1.0",
+ )
+
+ templates = Jinja2Templates(directory=os.path.join(BASE_DIR, "templates"))
+ app.mount(
+     "/static",
+     StaticFiles(directory=os.path.join(BASE_DIR, "static")),
+     name="static",
+ )
+
+ MODEL_ID = os.getenv("MODEL_ID", "unnat17/Text-Summarizer")
+
+ model = T5ForConditionalGeneration.from_pretrained(MODEL_ID)
+ tokenizer = T5Tokenizer.from_pretrained(MODEL_ID)
+
+ if torch.cuda.is_available():
+     device = torch.device("cuda")
+ elif torch.backends.mps.is_available():
+     device = torch.device("mps")
+ else:
+     device = torch.device("cpu")
+
+ model.to(device)
+ model.eval()
+
+
+ class DialogueInput(BaseModel):
+     dialogue: str
+
+
+ def clean_text(text: str) -> str:
+     text = re.sub(r"\r\n", " ", text)
+     text = re.sub(r"\s+", " ", text)
+     text = re.sub(r"<.*?>", " ", text)
+     return text.strip().lower()
+
+
+ def summarize_dialogue(dialogue: str) -> str:
+     dialogue = "summarize: " + clean_text(dialogue)
+
+     inputs = tokenizer(
+         dialogue,
+         padding="max_length",
+         max_length=512,
+         truncation=True,
+         return_tensors="pt",
+     ).to(device)
+
+     with torch.no_grad():
+         output_ids = model.generate(
+             input_ids=inputs["input_ids"],
+             attention_mask=inputs["attention_mask"],
+             max_length=150,
+             num_beams=4,
+             early_stopping=True,
+         )
+
+     return tokenizer.decode(output_ids[0], skip_special_tokens=True)
+
+
+ @app.get("/", response_class=HTMLResponse)
+ async def home(request: Request):
+     return templates.TemplateResponse(request=request, name="index.html")
+
+
+ @app.post("/summarize")
+ async def summarize(dialogue_input: DialogueInput):
+     summary = summarize_dialogue(dialogue_input.dialogue)
+     return {"summary": summary}
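
`clean_text` in app/app.py collapses whitespace before it strips HTML tags, so every removed tag can leave two consecutive spaces in the text passed to the tokenizer. Whether this affects summary quality is not established here. The sketch below only illustrates the difference: `clean_text_as_committed` copies the committed logic, while `clean_text_reordered` is a hypothetical variant (an assumption, not part of the commit) that strips tags first.

```python
import re


def clean_text_as_committed(text: str) -> str:
    """Copy of the committed logic: whitespace is collapsed before tags are removed."""
    text = re.sub(r"\r\n", " ", text)
    text = re.sub(r"\s+", " ", text)
    text = re.sub(r"<.*?>", " ", text)
    return text.strip().lower()


def clean_text_reordered(text: str) -> str:
    """Hypothetical variant: remove tags first, then collapse all whitespace runs."""
    text = re.sub(r"<.*?>", " ", text)
    # \s+ already matches \r\n, so no separate newline step is needed.
    text = re.sub(r"\s+", " ", text)
    return text.strip().lower()


sample = "Hi <b>there</b> friend"
print(repr(clean_text_as_committed(sample)))  # 'hi  there  friend' (double spaces)
print(repr(clean_text_reordered(sample)))     # 'hi there friend'
```

The reordered variant produces the same output as the committed function for input without tags, so it changes behavior only where markup is present.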
app/static/script.js ADDED
@@ -0,0 +1,32 @@
+ document.getElementById("summarization-form").addEventListener("submit", async (e) => {
+     e.preventDefault();
+
+     const dialogueInput = document.getElementById("dialogue-input");
+     const summaryText = document.getElementById("summary-text");
+     const submitButton = document.getElementById("summarize-btn");
+
+     const dialogue = dialogueInput.value.trim();
+     if (!dialogue) return;
+
+     summaryText.innerText = "Processing...";
+     submitButton.disabled = true;
+
+     try {
+         const response = await fetch("/summarize", {
+             method: "POST",
+             headers: { "Content-Type": "application/json" },
+             body: JSON.stringify({ dialogue }),
+         });
+
+         if (!response.ok) {
+             throw new Error(`Server error: ${response.status}`);
+         }
+
+         const data = await response.json();
+         summaryText.innerText = data.summary || "No summary returned.";
+     } catch (err) {
+         summaryText.innerText = `Error: ${err.message}`;
+     } finally {
+         submitButton.disabled = false;
+     }
+ });
app/static/style.css ADDED
@@ -0,0 +1,100 @@
+ * {
+     margin: 0;
+     padding: 0;
+     box-sizing: border-box;
+ }
+
+ body {
+     font-family: "Segoe UI", Roboto, sans-serif;
+     height: 100vh;
+     display: flex;
+     justify-content: center;
+     align-items: center;
+     background: linear-gradient(135deg, #1e293b, #0f172a);
+     color: #e2e8f0;
+ }
+
+ .container {
+     width: 100%;
+     max-width: 650px;
+     padding: 30px;
+     border-radius: 16px;
+     background: rgba(255, 255, 255, 0.05);
+     backdrop-filter: blur(12px);
+     -webkit-backdrop-filter: blur(12px);
+     box-shadow: 0 8px 32px rgba(0, 0, 0, 0.4);
+     border: 1px solid rgba(255, 255, 255, 0.1);
+ }
+
+ h1 {
+     text-align: center;
+     font-size: 2rem;
+     margin-bottom: 5px;
+     background: linear-gradient(90deg, #fb923c, #f97316);
+     -webkit-background-clip: text;
+     -webkit-text-fill-color: transparent;
+ }
+
+ h3 {
+     text-align: center;
+     margin-bottom: 30px;
+     font-weight: 400;
+     color: #cbd5f5;
+ }
+
+ form {
+     padding: 5px;
+     display: flex;
+     flex-direction: column;
+     gap: 15px;
+ }
+
+ textarea {
+     height: 160px;
+     padding: 12px;
+     border-radius: 10px;
+     border: none;
+     outline: none;
+     font-size: 15px;
+     background: rgba(255, 255, 255, 0.08);
+     color: #fff;
+     resize: none;
+     transition: 0.3s ease;
+ }
+
+ textarea:focus {
+     box-shadow: 0 0 0 2px #fb923c;
+ }
+
+ button {
+     padding: 12px;
+     border-radius: 10px;
+     border: none;
+     font-size: 16px;
+     cursor: pointer;
+     background: linear-gradient(135deg, #fb923c, #f97316);
+     color: white;
+     font-weight: 600;
+     transition: all 0.3s ease;
+     box-shadow: 0 4px 15px rgba(251, 146, 60, 0.4);
+ }
+
+ button:hover {
+     transform: translateY(-2px) scale(1.02);
+     box-shadow: 0 6px 25px rgba(249, 115, 22, 0.6);
+ }
+
+ #summary-output {
+     margin-top: 20px;
+     padding: 15px;
+     border-radius: 10px;
+     background: rgba(251, 146, 60, 0.08);
+     border: 1px solid rgba(251, 146, 60, 0.3);
+     backdrop-filter: blur(6px);
+ }
+
+ #summary-heading {
+     margin-bottom: 10px;
+     font-weight: 600;
+     color: #fb923c;
+ }
app/templates/index.html ADDED
@@ -0,0 +1,28 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+     <meta charset="UTF-8">
+     <meta name="viewport" content="width=device-width, initial-scale=1.0">
+     <meta name="description" content="Summarize dialogue and conversations instantly using a fine-tuned T5 transformer model.">
+     <title>Text Summarizer - T5 + FastAPI</title>
+     <link rel="stylesheet" href="/static/style.css">
+ </head>
+ <body>
+     <div class="container">
+         <h1>Text Summarizer</h1>
+         <h3>using HuggingFace Transformer</h3>
+         <p>Enter or paste your dialogue below for quick summarization.</p>
+
+         <form id="summarization-form">
+             <textarea id="dialogue-input" placeholder="Enter your dialogue here..." required></textarea>
+             <button type="submit" id="summarize-btn">Summarize</button>
+         </form>
+
+         <div id="summary-output">
+             <h3 id="summary-heading">Content Summary</h3>
+             <p id="summary-text"></p>
+         </div>
+     </div>
+     <script src="/static/script.js"></script>
+ </body>
+ </html>
requirements.txt ADDED
@@ -0,0 +1,6 @@
+ fastapi
+ uvicorn[standard]
+ transformers
+ torch
+ pydantic
+ jinja2
test-cases.md ADDED
@@ -0,0 +1,508 @@
+ # 🧠 Dialogue Summarization - Test Cases
+
+ **15 Multi-Speaker Realistic Conversations for NLP Testing**
+
+ Each test case contains a multi-speaker dialogue and an **expected summary** (approximate - model output may vary in phrasing but should capture the same key points).
+
+ ---
+
+ ## 🧪 Test Case 1 - Sports: The Glory of Virat Kohli in Test Cricket
+
+ **Input Dialogue:**
+
+ > **Arjun:** Sometimes I feel like we didn't fully appreciate Kohli's dominance in Test cricket while it was happening. Especially that 2016–2019 phase - it was unreal.
+ >
+ > **Meera:** Absolutely. The way he transformed himself overseas was the most impressive part. Before him, we always had that narrative that Indian batsmen struggled in places like Australia and England.
+ >
+ > **Rohit:** And he didn't just survive there - he dominated. The 2018 Australia tour alone is enough to define a career. Four centuries in a single series, and every one of them under pressure.
+ >
+ > **Priya:** What stood out to me was his intensity. He treated every Test match like it was a World Cup final. That kind of mindset changed how the entire Indian team approached red-ball cricket.
+ >
+ > **Arjun:** Exactly. Under his captaincy, India became a fast-bowling powerhouse. Winning series in Australia, competing aggressively in England - that wasn't common before his era.
+ >
+ > **Meera:** And Virat's fitness revolution. He set a completely new benchmark. Suddenly, being supremely fit wasn't optional anymore - it was expected.
+ >
+ > **Rohit:** His conversion rate was insane too. Once he crossed fifty, you just knew a hundred was coming. That hunger for big scores is what separated him from others.
+ >
+ > **Priya:** Plus, the way he handled pressure. Whether it was a collapse around him or a tough pitch, he seemed to thrive in those situations rather than shrink.
+ >
+ > **Arjun:** I still think Virat's 149 at Edgbaston in 2018 is one of the greatest Test innings by an Indian. That was pure skill and determination.
+ >
+ > **Meera:** And the Adelaide 2014 innings - announcing himself as a fearless Test captain right from the start. He wasn't playing for a draw, he was playing to win.
+ >
+ > **Rohit:** That's the legacy. He didn't just score runs - he changed the attitude of Indian Test cricket. Made aggression and belief the default.
+ >
+ > **Priya:** Which is why even years later, when we talk about modern Test greats, his name is always right at the top.
+
+ **Expected Summary:**
+ Arjun, Meera, Rohit, and Priya discuss Virat Kohli's dominance in Test cricket, highlighting his overseas performance, fitness revolution, captaincy that built India's fast-bowling attack, and his legacy of changing Indian cricket's aggressive mindset.
+
+ ---
+
+ ## 🧪 Test Case 2 - Entertainment: OTT Platforms and the Death of Cinema Culture
+
+ **Input Dialogue:**
+
+ > **Sana:** I genuinely cannot remember the last time I went to a multiplex.
+ >
+ > **Dev:** Same. Why leave my couch for overpriced popcorn?
+ >
+ > **Lena:** You're missing the point. Cinema is about shared experience.
+ >
+ > **Marcus:** Watching a kid react in a theater hits differently.
+ >
+ > **Sana:** New generation doesn't care about that.
+ >
+ > **Dev:** Studios know it too - theatrical windows are shrinking.
+ >
+ > **Lena:** We're losing mid-budget films entirely.
+ >
+ > **Marcus:** That's the real tragedy.
+ >
+ > **Dev:** But accessibility is better now.
+ >
+ > **Sana:** Still, something is lost in scrolling culture.
+ >
+ > **Lena:** Algorithms decide visibility now - that's dangerous.
+
+ **Expected Summary:**
+ Sana, Dev, Lena, and Marcus debate whether OTT platforms are killing cinema culture. While convenience and accessibility have improved, they agree that mid-budget films are disappearing and the shared theater experience is being lost to algorithm-driven scrolling.
+
+ ---
+
+ ## 🧪 Test Case 3 - Technology: AI Replacing Jobs vs Creating Opportunities
+
+ **Input Dialogue:**
+
+ > **Nadia:** My design team got replaced by AI. Twelve people gone.
+ >
+ > **Chris:** Companies were already planning cuts - AI just accelerated it.
+ >
+ > **James:** Doesn't matter. People still lost jobs.
+ >
+ > **Priya:** In research, AI actually created more jobs.
+ >
+ > **Nadia:** That's a class divide issue.
+ >
+ > **Chris:** Speed of change is the real problem.
+ >
+ > **James:** Retraining narrative is unrealistic.
+ >
+ > **Priya:** Policy response is missing.
+ >
+ > **Nadia:** Regulation isn't keeping up.
+ >
+ > **Chris:** Tech moves faster than law.
+ >
+ > **James:** UBI might be inevitable.
+ >
+ > **Priya:** Market won't solve this alone.
+
+ **Expected Summary:**
+ Nadia, Chris, James, and Priya discuss AI-driven job displacement. While AI creates some new roles, the speed of change, lack of policy response, and inadequate retraining programs make the transition painful, leading them to consider UBI and regulation as necessary interventions.
+
+ ---
+
+ ## 🧪 Test Case 4 - World Politics: Multipolar World Order
+
+ **Input Dialogue:**
+
+ > **Elena:** G7 feels symbolic now.
+ >
+ > **Omar:** Still holds institutional power.
+ >
+ > **Yuki:** But enforcement is weakening.
+ >
+ > **Elena:** Countries are adapting to sanctions.
+ >
+ > **Omar:** Dollar dominance still matters.
+ >
+ > **Carlos:** BRICS is growing.
+ >
+ > **Yuki:** World is becoming multipolar.
+ >
+ > **Elena:** Alliances are transactional now.
+ >
+ > **Omar:** That might increase flexibility.
+ >
+ > **Carlos:** Or increase conflict risk.
+ >
+ > **Yuki:** Nuclear deterrence changes everything.
+
+ **Expected Summary:**
+ Elena, Omar, Yuki, and Carlos discuss the shift from a unipolar to a multipolar world order. While the G7 retains institutional power and dollar dominance persists, BRICS growth, weakening enforcement, and transactional alliances signal a fundamental geopolitical transformation.
+
+ ---
+
+ ## 🧪 Test Case 5 - Social Issues: Social Media & Teen Mental Health
+
+ **Input Dialogue:**
+
+ > **Dr. Anita:** Social media is now central to teen mental health issues.
+ >
+ > **Ravi:** Removing Instagram helped my daughter.
+ >
+ > **Sophia:** Even non-users feel excluded.
+ >
+ > **Dr. Anita:** Passive scrolling is most harmful.
+ >
+ > **Marcus:** Other factors also matter.
+ >
+ > **Ravi:** Social media amplifies everything.
+ >
+ > **Sophia:** Popularity is now quantified.
+ >
+ > **Dr. Anita:** That has clinical impact.
+ >
+ > **Marcus:** Solutions are unclear.
+ >
+ > **Sophia:** Platforms need redesign.
+ >
+ > **Ravi:** They won't self-regulate.
+ >
+ > **Dr. Anita:** Regulation is necessary.
+
+ **Expected Summary:**
+ Dr. Anita, Ravi, Sophia, and Marcus discuss the impact of social media on teen mental health. Passive scrolling and quantified popularity cause clinical harm. They conclude that platforms won't self-regulate and external regulation is necessary.
+
+ ---
+
+ ## 🧪 Test Case 6 - Business: Startup Funding Winter
+
+ **Input Dialogue:**
+
+ > **Tara:** Couldn't raise Series C.
+ >
+ > **Joel:** VC expectations changed.
+ >
+ > **Zara:** That's healthy correction.
+ >
+ > **Tara:** It's existential for founders.
+ >
+ > **Joel:** What's your burn rate?
+ >
+ > **Tara:** We're cutting costs.
+ >
+ > **Zara:** Consider acquisition.
+ >
+ > **Tara:** Not viable yet.
+ >
+ > **Joel:** Bridge funding exists.
+ >
+ > **Tara:** Might pivot to enterprise.
+ >
+ > **Zara:** That's where money is.
+ >
+ > **Joel:** Timing is everything.
+
+ **Expected Summary:**
+ Tara struggles to raise Series C funding amid a VC correction. Joel and Zara suggest bridge funding, acquisition, or pivoting to enterprise. They acknowledge the funding winter is existential for founders but represents a healthy market correction.
+
+ ---
+
+ ## 🧪 Test Case 7 - Casual: Europe Trip Planning
+
+ **Input Dialogue:**
+
+ > **Nina:** Interrail isn't too expensive.
+ >
+ > **Sam:** But I need comfort now.
+ >
+ > **Lucia:** Hostels are traumatic.
+ >
+ > **Ben:** That's the fun part.
+ >
+ > **Nina:** Focus - where are we going?
+ >
+ > **Lucia:** Lisbon to Italy route.
+ >
+ > **Ben:** Can't skip Paris.
+ >
+ > **Nina:** Paris is overrated now.
+ >
+ > **Sam:** What about Croatia?
+ >
+ > **Lucia:** Adds time.
+ >
+ > **Ben:** Cut something else.
+ >
+ > **Nina:** Let's just start somewhere.
+ >
+ > **Sam:** Lisbon confirmed.
+
+ **Expected Summary:**
+ Nina, Sam, Lucia, and Ben plan a Europe trip via Interrail. After debating comfort, hostels, and whether to include Paris or Croatia, they settle on starting from Lisbon with a route toward Italy.
+
+ ---
+
+ ## 🧪 Test Case 8 - Workplace: Remote Work Debate
+
+ **Input Dialogue:**
+
+ > **David:** Return-to-office killed morale.
+ >
+ > **Amara:** What's leadership saying?
+ >
+ > **David:** "Culture" - but it's control.
+ >
+ > **Kevin:** Hybrid has value though.
+ >
+ > **Amara:** Needs better design, not mandates.
+ >
+ > **David:** Productivity was fine remotely.
+ >
+ > **Priya:** Data is mixed.
+ >
+ > **Amara:** Real estate plays a role.
+ >
+ > **Kevin:** Possibly.
+ >
+ > **Priya:** Impacts parents heavily.
+ >
+ > **David:** We lost talent because of it.
+
+ **Expected Summary:**
+ David, Amara, Kevin, and Priya debate return-to-office mandates. While leadership cites "culture," the group argues it's about control. Remote productivity was adequate, hybrid needs thoughtful design, and rigid mandates disproportionately impact parents and cause talent loss.
+
+ ---
+
+ ## 🧪 Test Case 9 - AI: Open Source vs Proprietary Models
+
+ **Input Dialogue:**
+
+ > **Felix:** Open-source models changed everything.
+ >
+ > **Hana:** Benchmarks aren't everything.
+ >
+ > **Tom:** Privacy drives adoption.
+ >
+ > **Felix:** On-premise is key.
+ >
+ > **Hana:** Security risks exist.
+ >
+ > **Tom:** Same as any open-source system.
+ >
+ > **Hana:** But less mature ecosystem.
+ >
+ > **Felix:** Community is fast-growing.
+ >
+ > **Raj:** Power concentration is risky.
+ >
+ > **Hana:** Compute still centralized.
+ >
+ > **Tom:** Still better than nothing.
+
+ **Expected Summary:**
+ Felix, Hana, Tom, and Raj discuss open-source vs proprietary AI models. Open-source offers privacy and on-premise deployment, but faces security concerns and ecosystem immaturity. They agree that power concentration in proprietary AI is a significant risk.
+
+ ---
+
+ ## 🧪 Test Case 10 - Social: Redefining Success
+
+ **Input Dialogue:**
+
+ > **Maya:** I rejected a promotion.
+ >
+ > **Jordan:** Old generation wouldn't.
+ >
+ > **Aisha:** Burnout isn't success.
+ >
+ > **Maya:** Saw it happen to my father.
+ >
+ > **Dev:** Privilege matters here.
+ >
+ > **Jordan:** True.
+ >
+ > **Maya:** But mindset is shifting.
+ >
+ > **Aisha:** Pandemic changed priorities.
+ >
+ > **Dev:** Economics still constrain choices.
+ >
+ > **Jordan:** Meaningful work matters now.
+ >
+ > **Maya:** Not sacrificing life for work.
+ >
+ > **Aisha:** System still ties identity to jobs.
+
+ **Expected Summary:**
+ Maya, Jordan, Aisha, and Dev discuss redefining success beyond career advancement. Maya rejected a promotion to avoid burnout. While privilege affects who can make such choices, the pandemic shifted priorities toward meaningful work and work-life balance.
+
+ ---
+
+ ## 🧪 Test Case 11 - Sports: Women's Football Growth
+
+ **Input Dialogue:**
+
+ > **Fatima:** Viewership is massive now.
+ >
+ > **Chris:** Quality improved significantly.
+ >
+ > **Sara:** Infrastructure caught up.
+ >
+ > **Fatima:** Spain is dominant.
+ >
+ > **Mike:** Is dominance good?
+ >
+ > **Chris:** It builds narratives.
+ >
+ > **Sara:** US transition is interesting.
+ >
+ > **Fatima:** New stars emerging.
+ >
+ > **Mike:** Star power expands audience.
+ >
+ > **Chris:** Pay gap still exists.
+ >
+ > **Sara:** Access is unequal.
+ >
+ > **Fatima:** Academies must improve.
+
+ **Expected Summary:**
+ Fatima, Chris, Sara, and Mike discuss the growth of women's football. Viewership and quality have surged, Spain is dominant, and new stars are emerging. However, the pay gap persists and academy-level access remains unequal.
+
+ ---
+
+ ## 🧪 Test Case 12 - Climate: Individual vs Systemic Action
+
+ **Input Dialogue:**
+
+ > **Elena:** Do personal actions even matter?
+ >
+ > **Reem:** They shift norms.
+ >
+ > **Paul:** Carbon footprint idea was PR.
+ >
+ > **Elena:** Still not meaningless.
+ >
+ > **Reem:** Policy matters more.
+ >
+ > **Tomas:** Political will is key.
+ >
+ > **Paul:** Focus energy wisely.
+ >
+ > **Elena:** Both personal and political matter.
+ >
+ > **Reem:** It's about balance.
+ >
+ > **Tomas:** Renewables are scaling fast.
+ >
+ > **Paul:** But inequality persists.
+ >
+ > **Elena:** Climate justice matters.
+
+ **Expected Summary:**
+ Elena, Reem, Paul, and Tomas debate individual vs systemic climate action. While personal actions shift cultural norms, policy and political will are more impactful. They agree both approaches are needed, with climate justice and renewable energy scaling being top priorities.
+
+ ---
+
+ ## 🧪 Test Case 13 - Ethics: Deepfakes and Digital Safety
+
+ **Input Dialogue:**
+
+ > **Leah:** My colleague was deepfaked.
+ >
+ > **Marcus:** Detection is slow.
+ >
+ > **Soo:** Jurisdiction is messy.
+ >
+ > **Leah:** Platforms inconsistent.
+ >
+ > **Marcus:** There are good use cases too.
+ >
+ > **Soo:** Governance is lagging.
+ >
+ > **Theo:** What about watermarking?
+ >
+ > **Leah:** Bad actors won't comply.
+ >
+ > **Theo:** Make it mandatory.
+ >
+ > **Marcus:** Politically complex.
+ >
+ > **Soo:** Safety matters more.
+ >
+ > **Leah:** Damage is long-lasting.
+
+ **Expected Summary:**
+ Leah, Marcus, Soo, and Theo discuss deepfake threats after a colleague was targeted. Detection is slow, jurisdictions are inconsistent, and governance lags behind the technology. While watermarking is proposed, enforcement challenges remain and the long-lasting damage to victims is the core concern.
+
+ ---
+
+ ## 🧪 Test Case 14 - Travel: Over-Tourism and Local Impact
+
+ **Input Dialogue:**
+
+ > **Carmen:** My neighborhood is gone.
+ >
+ > **Finn:** Regulation can help.
+ >
+ > **Carmen:** Enforcement is weak.
+ >
+ > **Yuna:** Global issue now.
+ >
+ > **Finn:** Tourism brings money.
+ >
+ > **Carmen:** Locals are displaced.
+ >
+ > **Yuna:** Cultural loss is real.
+ >
+ > **Finn:** Authenticity is subjective.
+ >
+ > **Carmen:** It's about livability.
+ >
+ > **Yuna:** Visitor caps help.
+ >
+ > **Finn:** But elitist.
+ >
+ > **Carmen:** Current system isn't fair.
+
+ **Expected Summary:**
+ Carmen, Finn, and Yuna discuss over-tourism's impact on local communities. While tourism brings revenue, it displaces residents, erodes culture, and reduces livability. Visitor caps are suggested but debated as potentially elitist. The current system is seen as unfair to locals.
+
+ ---
+
+ ## 🧪 Test Case 15 - Mental Health in High-Pressure Professions
+
+ **Input Dialogue:**
+
+ > **Dr. James:** Doctors hide mental health issues.
+ >
+ > **Nora:** System reinforces stigma.
+ >
+ > **Sam:** Same in law.
+ >
+ > **Dr. James:** Wellness is performative.
+ >
+ > **Nora:** Can increase shame.
+ >
+ > **Maya:** Culture matters more.
+ >
+ > **Sam:** Not everyone can switch careers.
+ >
+ > **Dr. James:** Regulation needed.
+ >
+ > **Nora:** Some progress exists.
+ >
+ > **Maya:** Peer support helps.
+ >
+ > **Sam:** Colleagues are easier to talk to.
+ >
+ > **Dr. James:** Leadership drives change.
+
+ **Expected Summary:**
+ Dr. James, Nora, Sam, and Maya discuss mental health stigma in medicine and law. Wellness programs are often performative and can increase shame. They agree that peer support, cultural change led by leadership, and regulation are needed to make real progress.
+
+ ---
+
+ # ✅ Usage Notes
+
+ - Designed for **dialogue summarization models** (fine-tuned T5, BART, etc.)
+ - Includes **multi-speaker complexity** across all test cases
+ - Covers **diverse domains**: sports, tech, politics, climate, entertainment, business, ethics, workplace, travel, social issues
+ - **Expected summaries** are approximate - model output should capture similar key points but may vary in phrasing
+ - Useful for:
+   - LLM evaluation and benchmarking
+   - NLP model comparison
+   - Prompt engineering experiments
+   - Regression testing during model updates
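
Because the expected summaries are approximate, an exact string comparison would not be a workable regression check. One simple option, shown here as a sketch and not as the method used by this repository, is unigram recall: the fraction of distinct words in the expected summary that also appear in the model output. The function names and the 0.5 threshold are illustrative assumptions.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of word tokens (letters, digits, apostrophes)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def unigram_recall(expected: str, actual: str) -> float:
    """Fraction of distinct expected-summary words that also appear in the model output."""
    expected_tokens = tokenize(expected)
    if not expected_tokens:
        return 0.0
    return len(expected_tokens & tokenize(actual)) / len(expected_tokens)


def passes(expected: str, actual: str, threshold: float = 0.5) -> bool:
    """Hypothetical pass/fail rule: recall must reach the chosen threshold."""
    return unigram_recall(expected, actual) >= threshold
```

Word overlap rewards shared vocabulary rather than shared meaning, so a check like this is best treated as a coarse signal for spotting large regressions, with manual review of any case near the threshold.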