# (removed non-Python scrape artifacts: Space status lines, file size, git blame hashes, line-number gutter)
"""
DiscordSum - Hugging Face Space Gradio App
Conversation summarization using Qwen3-0.6B-DiscordSum-mini-v1
"""
import gradio as gr
import torch
import time
import re
from typing import Dict, Any
from transformers import AutoModelForCausalLM, AutoTokenizer
# Model configuration
MODEL_NAME = "Plasmoxy/Qwen3-0.6B-DiscordSum-mini-v1"  # Hugging Face Hub repo id
# Sample conversations for demo. Keys are human-readable titles used in the
# Gradio examples; values are multi-line transcripts in the
# "[speaker]: message" line format the summarizer consumes.
SAMPLE_CONVERSATIONS: Dict[str, str] = {
    "Quick Bug Fix": """[Sarah]: Hey team, found a bug in the login form
[Mike]: What's the issue?
[Sarah]: The password field isn't validating minimum length
[Mike]: I'll fix that today, should be a quick patch
[Sarah]: Thanks! Let me know when it's ready for testing
[Tom]: Is this affecting production users?
[Sarah]: Not yet, caught it in staging
[Tom]: Good catch! We should add more validation tests
[Mike]: Just pushed the fix to dev branch
[Sarah]: Testing now... looks good!
[Tom]: Can you also check the email validation while you're at it?
[Sarah]: Sure, testing that too... email validation works fine
[Mike]: Great, I'll merge it to main
[Tom]: Don't forget to update the changelog
[Mike]: Already done, also added unit tests for this
[Sarah]: Perfect! This should prevent similar issues in the future""",
    "Gaming Session": """[Alex]: Anyone up for some Valorant tonight?
[Jake]: I'm down! What time?
[Alex]: Around 8 PM?
[Emma]: Can I join? I'm still learning though
[Jake]: Of course! We can do unrated matches
[Alex]: Perfect, see you all at 8!
[Emma]: What agents should I practice?
[Jake]: Try Sage or Brimstone, they're beginner friendly
[Emma]: Thanks for the tips!
[Alex]: We'll help you learn the maps too
[Jake]: Emma, have you played any tactical shooters before?
[Emma]: A bit of CS:GO, but Valorant feels different
[Alex]: The abilities make it unique, but the shooting mechanics are similar
[Emma]: Should I focus on aim or learning abilities first?
[Jake]: Both, but start with crosshair placement and basic abilities
[Alex]: We can do some custom games first to practice
[Emma]: That sounds great! I don't want to drag the team down
[Jake]: Don't worry, we all started somewhere
[Alex]: Plus unrated is perfect for learning, no pressure""",
    "Anime Discussion": """[Yuki]: Just finished the latest Attack on Titan episode!
[Marcus]: No spoilers! I'm still on season 3
[Yuki]: My lips are sealed 🤐
[Lisa]: The animation this season is incredible
[Yuki]: Right?! The studio really outdid themselves
[Marcus]: Okay now I'm excited to catch up
[Lisa]: The soundtrack is also amazing this season
[Yuki]: Agreed! That final scene gave me chills
[Marcus]: Stop, you're making me want to binge it all tonight
[Lisa]: Do it! You won't regret it
[Yuki]: The character development has been insane too
[Lisa]: I know! I didn't expect that plot twist at all
[Marcus]: OKAY STOPPING HERE, no more hints!
[Yuki]: Sorry sorry! But seriously, catch up soon so we can discuss
[Lisa]: Have you guys seen the new Demon Slayer movie yet?
[Yuki]: Not yet, is it good?
[Lisa]: Amazing! The fight scenes are breathtaking
[Marcus]: I heard the animation budget was huge
[Lisa]: You can tell, every frame is gorgeous
[Yuki]: We should all go watch it together next weekend""",
    "Movie Night Planning": """[Chris]: Movie night this Friday?
[Sam]: Yes! What should we watch?
[Chris]: How about the new Dune movie?
[Taylor]: I'm in, but can we do Saturday instead?
[Sam]: Saturday works better for me too
[Chris]: Saturday it is then, 7 PM at my place
[Taylor]: Should we bring snacks?
[Chris]: I'll handle popcorn, you guys bring drinks
[Sam]: I'll bring some sodas and chips
[Taylor]: I'll make my famous nachos!
[Chris]: Perfect, this is going to be awesome
[Sam]: Should we watch the first Dune movie before?
[Chris]: Good idea! We could start at 4 PM with the first one
[Taylor]: That's a long movie marathon, I'm so in!
[Sam]: Do we need to read the books first?
[Chris]: Nah, the movies are great on their own
[Taylor]: I'll bring my projector if you want a bigger screen
[Chris]: Oh that would be amazing! My TV is only 55 inches
[Sam]: This is turning into a proper cinema experience
[Taylor]: Should I bring my surround sound speakers too?
[Chris]: Yes please! Let's go all out
[Sam]: I'll make sure to bring extra snacks then
[Taylor]: Can't wait! This is going to be epic
[Chris]: Best movie night ever incoming!
[Sam]: Should we invite anyone else?
[Chris]: Let's keep it small, my living room isn't huge
[Taylor]: Fair enough, cozy movie night with the crew
[Sam]: Perfect! See you all Saturday at 4 PM""",
    "Recipe Sharing": """[Mom_Chef]: Just made the best chocolate chip cookies!
[FoodieFan]: Recipe please! 🍪
[Mom_Chef]: 2 cups flour, 1 cup butter, 1 cup sugar, chocolate chips
[FoodieFan]: What temperature?
[Mom_Chef]: 350°F for 12 minutes
[FoodieFan]: Making these tonight, thanks!
[Mom_Chef]: Don't forget to cream the butter and sugar first
[FoodieFan]: How long should I cream them?
[Mom_Chef]: About 3-4 minutes until fluffy
[FoodieFan]: Got it! Any other tips?
[Mom_Chef]: Let the dough chill for 30 minutes before baking
[FoodieFan]: You're the best!
[BakingQueen]: Can I get that recipe too?
[Mom_Chef]: Of course! Also add 2 eggs and 1 tsp vanilla extract
[BakingQueen]: What kind of chocolate chips work best?
[Mom_Chef]: I use semi-sweet, but dark chocolate works great too
[FoodieFan]: Can I add nuts?
[Mom_Chef]: Absolutely! Walnuts or pecans are perfect
[BakingQueen]: How many cookies does this make?
[Mom_Chef]: About 36 medium-sized cookies
[FoodieFan]: Can I freeze the dough?
[Mom_Chef]: Yes! It freezes beautifully for up to 3 months
[BakingQueen]: Do you use salted or unsalted butter?
[Mom_Chef]: Unsalted, then add 1 tsp of salt to the dry ingredients
[FoodieFan]: Should the butter be room temperature?
[Mom_Chef]: Yes, soft but not melted
[BakingQueen]: Thanks for all the details! Making these this weekend
[FoodieFan]: Just put mine in the oven, house smells amazing already!
[Mom_Chef]: Let me know how they turn out!
[FoodieFan]: They're perfect! Crispy edges, soft center
[BakingQueen]: Now I'm even more excited to try them
[Mom_Chef]: So happy they worked out! Enjoy everyone!""",
    "Study Group": """[Student1]: Can someone explain the calculus homework?
[Student2]: Which problem are you stuck on?
[Student1]: Problem 5, the integration one
[Student2]: You need to use substitution method there
[Student1]: Oh! That makes sense now
[Student2]: Want to meet up tomorrow to go over the rest?
[Student1]: That would be great, thanks!
[Student3]: Can I join too? I'm struggling with problem 7
[Student2]: Sure! Let's meet at the library at 3 PM
[Student1]: Perfect, see you there
[Student3]: Thanks guys, really appreciate it
[Student2]: No problem, we're all in this together
[Student4]: Is this for Professor Johnson's class?
[Student1]: Yes! Are you in that class too?
[Student4]: Yeah, I'm also confused about problem 8
[Student2]: We can go over that one tomorrow as well
[Student3]: Should we bring our textbooks?
[Student2]: Definitely, and your notes too
[Student1]: I'll bring my laptop in case we need to look anything up
[Student4]: Can we also review the previous chapter? I'm still shaky on that
[Student2]: Good idea, the concepts build on each other
[Student3]: What about problem 10? That one seems really hard
[Student1]: I haven't even attempted that one yet
[Student2]: It's tricky but we'll work through it together
[Student4]: Should we plan for more than an hour?
[Student2]: Probably, let's say 2-3 hours to be safe
[Student3]: I can stay until 6 PM if needed
[Student1]: Same here, I really want to understand this material
[Student4]: This study group is a lifesaver
[Student2]: We should make this a regular thing
[Student3]: Agreed! Maybe every week before assignments are due?
[Student1]: I'm in! Let's create a group chat
[Student4]: Great idea, I'll set one up
[Student2]: Perfect! See everyone tomorrow at 3 PM, library second floor
[Student1]: Thanks everyone, feeling much better about this now
[Student3]: Same! Can't wait to finally understand these problems
[Student4]: This is why study groups are the best!"""
}
# Global model and tokenizer, populated once by load_model() at startup
model = None
tokenizer = None
def load_model():
    """Load the summarization model and tokenizer into module globals.

    Idempotent: if both globals are already populated this is a no-op, so
    repeated calls (e.g. on Space restarts or accidental double-import)
    don't re-download or re-instantiate the model.
    """
    global model, tokenizer
    if model is not None and tokenizer is not None:
        return
    print(f"Loading model: {MODEL_NAME}")
    tokenizer = AutoTokenizer.from_pretrained(
        MODEL_NAME,
        trust_remote_code=True,
        padding_side="right",
    )
    # Some tokenizers ship without a pad token; reuse EOS so that padding
    # and generation have a valid pad id.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
        tokenizer.pad_token_id = tokenizer.eos_token_id
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME,
        device_map="auto",
        torch_dtype=torch.float32,  # full precision: Spaces CPU has no bf16 guarantee
        trust_remote_code=True,
    )
    model.eval()  # inference only — disables dropout
    print("Model loaded successfully!")
def format_inference_prompt(conversation: str) -> str:
    """Build the chat-templated prompt for one summarization request.

    Wraps *conversation* in a system + user message pair, renders it via the
    tokenizer's chat template with thinking disabled, then normalizes the
    rendered text's whitespace and special-token spacing.
    """
    messages = [
        {
            "role": "system",
            "content": "Summarize Discord conversations into a paragraph capturing key points, decisions, and action items."
        },
        {
            "role": "user",
            "content": f"Summarize the following conversation:\n\n{conversation}"
        }
    ]
    prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False
    )
    # Post-process the rendered template. Applied in order:
    #   1. drop any <think>...</think> spans the template emitted
    #   2. put a newline between back-to-back turn delimiters
    #   3. collapse the blank line after a role header
    #   4. squeeze runs of 3+ newlines down to a blank line
    cleanup_steps = (
        (r'<think>[\s\S]*?</think>', ''),
        (r'(<\|im_end\|>)(?=<\|im_start\|>)', r'\1\n'),
        (r'(<\|im_start\|>[^<>\n]+)\s*\n\s*\n', r'\1\n'),
        (r'\n{3,}', '\n\n'),
    )
    for pattern, replacement in cleanup_steps:
        prompt = re.sub(pattern, replacement, prompt)
    return prompt.strip()
def extract_summary(response: str) -> str:
    """Pull the summary text out of a raw model response.

    If a "Summary:" marker is present, returns the text after it — cut at
    the first <|im_end|> token, if any — stripped of surrounding
    whitespace. Otherwise returns the whole response stripped.
    """
    marker = "Summary:"
    start = response.find(marker)
    if start == -1:
        return response.strip()
    tail = response[start + len(marker):]
    end = tail.find("<|im_end|>")
    if end != -1:
        tail = tail[:end]
    return tail.strip()
def summarize_conversation(conversation: str):
    """Summarize a Discord conversation with the loaded model.

    Returns a ``(summary, stats_data)`` tuple: the summary string plus a
    list of ``[metric, value]`` rows for the stats Dataframe. On empty
    input or any runtime failure, returns an error message and ``None``.
    Relies on the module globals ``model`` and ``tokenizer`` — load_model()
    must have run first.
    """
    if not conversation or not conversation.strip():
        return "Error: Conversation cannot be empty", None
    try:
        start_time = time.time()
        # Format prompt
        prompt = format_inference_prompt(conversation)
        # Tokenize; long chats are truncated to the 2048-token window
        inputs = tokenizer(
            prompt,
            return_tensors="pt",
            truncation=True,
            max_length=2048
        ).to(model.device)
        input_tokens = inputs["input_ids"].shape[1]
        # NOTE(review): "warmup" here actually measures prompt formatting +
        # tokenization time, not a model warmup pass.
        warmup_time = time.time() - start_time
        # Generate — sampling is enabled, so output is non-deterministic
        generation_start = time.time()
        with torch.no_grad():
            outputs = model.generate(
                **inputs,
                max_new_tokens=200,
                temperature=0.7,
                top_p=0.9,
                do_sample=True,
                pad_token_id=tokenizer.pad_token_id,
                eos_token_id=tokenizer.eos_token_id,
            )
        inference_time = time.time() - generation_start
        # Decode only the newly generated tail (skip the echoed prompt)
        response = tokenizer.decode(
            outputs[0][input_tokens:],
            skip_special_tokens=True
        )
        # Extract summary text from the raw response
        summary = extract_summary(response)
        # Calculate stats (outputs.shape[1] is the full prompt+generation length)
        output_tokens = outputs.shape[1] - input_tokens
        total_time = time.time() - start_time
        tokens_per_second = output_tokens / inference_time if inference_time > 0 else 0
        # Create stats table data for the gr.Dataframe output
        stats_data = [
            ["Inference Time", f"{inference_time:.2f}s"],
            ["Warmup Time", f"{warmup_time:.2f}s"],
            ["Total Time", f"{total_time:.2f}s"],
            ["Tokens/Second", f"{tokens_per_second:.1f}"],
            ["Input Tokens", str(input_tokens)],
            ["Output Tokens", str(output_tokens)],
            ["Total Tokens", str(outputs.shape[1])],
        ]
        return summary, stats_data
    except Exception as e:
        # UI boundary: surface the failure in the textbox instead of crashing
        return f"Error: {str(e)}", None
# Load model on startup (runs at import time, before the UI is served)
load_model()
# Create Gradio interface: one text input, two outputs (summary + stats table)
demo = gr.Interface(
    fn=summarize_conversation,
    inputs=gr.Textbox(
        label="Discord Conversation",
        placeholder="Paste your Discord conversation here...",
        lines=15,
        # Pre-fill with a sample so the demo works with a single click
        value=SAMPLE_CONVERSATIONS["Movie Night Planning"]
    ),
    outputs=[
        gr.Textbox(
            label="Summary",
            lines=10
        ),
        # Receives the [metric, value] rows built in summarize_conversation
        gr.Dataframe(
            label="Statistics",
            headers=["Metric", "Value"],
            datatype=["str", "str"],
            row_count=7,
            column_count=2,
        )
    ],
    title="DiscordSum - Conversation Summarizer",
    description="Summarize Discord conversations into short paragraphs. Runs [Plasmoxy/Qwen3-0.6B-DiscordSum-mini-v1](https://huggingface.co/Plasmoxy/Qwen3-0.6B-DiscordSum-mini-v1).",
    examples=[
        [SAMPLE_CONVERSATIONS["Movie Night Planning"]],
        [SAMPLE_CONVERSATIONS["Quick Bug Fix"]],
        [SAMPLE_CONVERSATIONS["Gaming Session"]],
        [SAMPLE_CONVERSATIONS["Anime Discussion"]],
        [SAMPLE_CONVERSATIONS["Recipe Sharing"]],
        [SAMPLE_CONVERSATIONS["Study Group"]],
    ],
)
# Launch only when run as a script; Spaces also imports this module directly
if __name__ == "__main__":
    demo.launch()