Plasmoxy committed on
Commit 5dda5dc · 1 Parent(s): 4477bd7

Add dcsum code

Files changed (1): app.py (+222 -4)
app.py CHANGED
@@ -1,7 +1,225 @@
-def greet(name):
-    return "Hello " + name + "!!"
-
-demo = gr.Interface(fn=greet, inputs="text", outputs="text")
-demo.launch()
 
+"""
+DiscordSum - Hugging Face Space Gradio App
+Conversation summarization using Qwen3-0.6B-DiscordSum-mini-v1
+"""
+
 import gradio as gr
+import torch
+import time
+import re
+from typing import Dict, Any
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+# Model configuration
+MODEL_NAME = "Plasmoxy/Qwen3-0.6B-DiscordSum-mini-v1"
+
+# Sample conversation for demo
+SAMPLE_CONVERSATION = """[TechLead_Sarah]: Good morning team! We need to discuss the upcoming Q1 release. There are some critical issues that came up during yesterday's sprint review.
+[Backend_Mike]: Morning! Yeah, I noticed the authentication service is having intermittent failures in staging. We're seeing about 5% of login attempts timing out.
+[DevOps_Chen]: I can confirm that. The logs show connection pool exhaustion during peak load. We might need to increase the max connections or implement better connection recycling.
+[Frontend_Emma]: That explains the user complaints we've been getting. Is this affecting the password reset flow too?
+[Backend_Mike]: Good question. Let me check... yes, it looks like any endpoint that touches the auth service is affected. Password resets, token refreshes, and social login callbacks.
+[TechLead_Sarah]: This is a P0 issue then. Mike, can you take the lead on fixing this? We need it resolved before the release.
+[Backend_Mike]: Absolutely. I'll start by profiling the connection usage patterns. Chen, can you help me analyze the infrastructure metrics?
+[DevOps_Chen]: Sure thing. I'll pull the CloudWatch data and set up a dashboard. Should have something by end of day.
+[QA_Alex]: While we're on critical issues, I found a data corruption bug in the export feature. When users export large datasets (>10k rows), some columns are getting scrambled.
+[Backend_Mike]: Oh no, that sounds serious. Do you have reproduction steps?
+[QA_Alex]: Yes, I documented everything in JIRA ticket ENG-2847. Happens consistently with the customer data export when you select more than 5 columns and filter by date range.
+[Frontend_Emma]: I worked on that feature last month. Let me take a look at the ticket. It might be related to how we're chunking the data before sending it to the backend.
+[TechLead_Sarah]: Emma, pair with Alex on this one. We can't ship with data corruption issues. What's the ETA on a fix?
+[Frontend_Emma]: Give me a few hours to investigate. If it's what I think it is, should be a quick fix in the data serialization logic.
+[Product_Manager_Lisa]: Just joining - are these issues going to delay our release? We have customer commitments for next Friday.
+[TechLead_Sarah]: Too early to say definitively, but we're treating both as blockers. Lisa, can you give us until tomorrow morning to assess the scope?
+[Product_Manager_Lisa]: Tomorrow morning works. I'll prepare a communication plan for customers in case we need to push back the date.
+[DevOps_Chen]: One more thing - our staging environment is going to undergo scheduled maintenance tonight from 11 PM to 2 AM EST. Just a heads up for anyone planning to work late.
+[Backend_Mike]: Thanks for the notice. I'll do my connection pool testing before then.
+[Security_James]: Hey folks, not to pile on, but I need to mention that our security audit identified some concerns with how we're handling API keys in the logging system. We're potentially exposing sensitive tokens in debug logs.
+[TechLead_Sarah]: James, is this something that needs immediate attention or can it wait until after the release?
+[Security_James]: It's not being actively exploited, but it's a significant vulnerability. I'd recommend we fix it this sprint. I can prepare a PR that redacts sensitive data from logs.
+[Backend_Mike]: I can review that PR. Should be straightforward - we just need to update our logging middleware.
+[Frontend_Emma]: Sarah, should we schedule a follow-up meeting to go through all these items in detail?
+[TechLead_Sarah]: Yes, let's do a quick sync at 2 PM today. I'll send out a calendar invite. Priority items: auth service failures, data export corruption, and security logging issue.
+[QA_Alex]: I'll prepare a full regression test plan for the auth service fix. We need to make sure we don't break anything else.
+[DevOps_Chen]: I'll also set up automated load testing for the auth service so we can catch these issues earlier in the future.
+[Product_Manager_Lisa]: Appreciate everyone jumping on this. I'll be in the 2 PM meeting with updates from the customer success team.
+[Backend_Mike]: Quick question - do we have any insight into when the auth issues started? Was it after the last deployment?
+[DevOps_Chen]: Looking at the metrics now... it started appearing about 4 days ago, which coincides with our database migration to the new instance type.
+[Backend_Mike]: Ah! That's a crucial data point. The new instance might have different connection limits or network characteristics.
+[DevOps_Chen]: Exactly what I was thinking. I'll check the RDS configuration and compare it with our old setup.
+[TechLead_Sarah]: Great detective work. Let's keep this thread updated with findings. Mike and Chen, prioritize the auth issue. Emma and Alex, focus on the export bug. James, get that security PR ready for review.
+[Security_James]: Will do. I'll have it ready by noon.
+[Frontend_Emma]: Alex, I'm looking at your ticket now. Can you jump on a quick call to walk me through the reproduction?
+[QA_Alex]: Sure, sending you a Zoom link now.
+[TechLead_Sarah]: Thanks everyone for the quick response. Let's crush these bugs and get back on track for the release!"""
+
+# Global model and tokenizer
+model = None
+tokenizer = None
+
+
+def load_model():
+    """Load model and tokenizer"""
+    global model, tokenizer
+
+    print(f"Loading model: {MODEL_NAME}")
+
+    tokenizer = AutoTokenizer.from_pretrained(
+        MODEL_NAME,
+        trust_remote_code=True,
+        padding_side="right"
+    )
+
+    if tokenizer.pad_token is None:
+        tokenizer.pad_token = tokenizer.eos_token
+        tokenizer.pad_token_id = tokenizer.eos_token_id
+
+    model = AutoModelForCausalLM.from_pretrained(
+        MODEL_NAME,
+        device_map="auto",
+        torch_dtype=torch.float32,
+        trust_remote_code=True,
+    )
+
+    model.eval()
+
+    print("Model loaded successfully!")
+
+
+def format_inference_prompt(conversation: str) -> str:
+    """Format inference prompt using chat template"""
+    messages = [
+        {
+            "role": "system",
+            "content": "Summarize Discord conversations into a paragraph capturing key points, decisions, and action items."
+        },
+        {
+            "role": "user",
+            "content": f"Summarize the following conversation:\n\n{conversation}"
+        }
+    ]
+
+    formatted = tokenizer.apply_chat_template(
+        messages,
+        tokenize=False,
+        add_generation_prompt=True,
+        enable_thinking=False
+    )
+
+    # Clean up chat template output
+    formatted = re.sub(r'<think>[\s\S]*?</think>', '', formatted)
+    formatted = re.sub(r'(<\|im_end\|>)(?=<\|im_start\|>)', r'\1\n', formatted)
+    formatted = re.sub(r'(<\|im_start\|>[^<>\n]+)\s*\n\s*\n', r'\1\n', formatted)
+    formatted = re.sub(r'\n{3,}', '\n\n', formatted)
+    formatted = formatted.strip()
+
+    return formatted
+
+
+def extract_summary(response: str) -> str:
+    """Extract summary from model response"""
+    match = re.search(r'Summary:\s*(.*?)(?:<\|im_end\|>|$)', response, re.DOTALL)
+    if match:
+        return match.group(1).strip()
+    return response.strip()
+
+
+def summarize_conversation(conversation: str):
+    """Summarize conversation using the model"""
+    if not conversation or not conversation.strip():
+        return "Error: Conversation cannot be empty", None
+
+    try:
+        start_time = time.time()
+
+        # Format prompt
+        prompt = format_inference_prompt(conversation)
+
+        # Tokenize
+        inputs = tokenizer(
+            prompt,
+            return_tensors="pt",
+            truncation=True,
+            max_length=2048
+        ).to(model.device)
+
+        input_tokens = inputs["input_ids"].shape[1]
+        warmup_time = time.time() - start_time
+
+        # Generate
+        generation_start = time.time()
+
+        with torch.no_grad():
+            outputs = model.generate(
+                **inputs,
+                max_new_tokens=200,
+                temperature=0.7,
+                top_p=0.9,
+                do_sample=True,
+                pad_token_id=tokenizer.pad_token_id,
+                eos_token_id=tokenizer.eos_token_id,
+            )
+
+        inference_time = time.time() - generation_start
+
+        # Decode
+        response = tokenizer.decode(
+            outputs[0][input_tokens:],
+            skip_special_tokens=True
+        )
+
+        # Extract summary
+        summary = extract_summary(response)
+
+        # Calculate stats
+        output_tokens = outputs.shape[1] - input_tokens
+        total_time = time.time() - start_time
+        tokens_per_second = output_tokens / inference_time if inference_time > 0 else 0
+
+        # Create stats table data
+        stats_data = [
+            ["Inference Time", f"{inference_time:.2f}s"],
+            ["Warmup Time", f"{warmup_time:.2f}s"],
+            ["Total Time", f"{total_time:.2f}s"],
+            ["Tokens/Second", f"{tokens_per_second:.1f}"],
+            ["Input Tokens", str(input_tokens)],
+            ["Output Tokens", str(output_tokens)],
+            ["Total Tokens", str(outputs.shape[1])],
+        ]
+
+        return summary, stats_data
+    except Exception as e:
+        return f"Error: {str(e)}", None
+
+
+# Load model on startup
+load_model()
 
+# Create Gradio interface
+demo = gr.Interface(
+    fn=summarize_conversation,
+    inputs=gr.Textbox(
+        label="Discord Conversation",
+        placeholder="Paste your Discord conversation here...",
+        lines=15,
+        value=SAMPLE_CONVERSATION
+    ),
+    outputs=[
+        gr.Textbox(
+            label="Summary",
+            lines=10
+        ),
+        gr.Dataframe(
+            label="Statistics",
+            headers=["Metric", "Value"],
+            datatype=["str", "str"],
+            row_count=7,
+            column_count=2,
+        )
+    ],
+    title="DiscordSum - Conversation Summarizer",
+    description="Summarize Discord conversations into short paragraphs. Runs [Plasmoxy/Qwen3-0.6B-DiscordSum-mini-v1](https://huggingface.co/Plasmoxy/Qwen3-0.6B-DiscordSum-mini-v1).",
+    examples=[[SAMPLE_CONVERSATION]],
+)
 
+if __name__ == "__main__":
+    demo.launch()
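Note: the `extract_summary` helper added in this commit is pure string processing, so it can be exercised in isolation without loading the model. The sketch below copies the regex verbatim from the diff; the sample inputs are made up for illustration:

```python
import re


def extract_summary(response: str) -> str:
    """Extract the text after a 'Summary:' marker, dropping a trailing <|im_end|> token."""
    match = re.search(r'Summary:\s*(.*?)(?:<\|im_end\|>|$)', response, re.DOTALL)
    if match:
        return match.group(1).strip()
    return response.strip()


# The marker and the end-of-turn token are both stripped:
print(extract_summary("Summary: The team triaged three blockers.<|im_end|>"))
# → "The team triaged three blockers."

# A response without the marker falls through and is returned stripped:
print(extract_summary("  no marker here  "))
# → "no marker here"
```

Because the pattern allows `$` as an alternative terminator, responses where generation stopped before emitting `<|im_end|>` are still captured.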