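import gradio as gr
from openai import OpenAI


# Load the system prompt from disk (the README text is defined inline below)
def load_system_prompt():
    """Read the system prompt text from system-prompt.md."""
    with open("system-prompt.md", "r", encoding="utf-8") as f:
        return f.read()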

SYSTEM_PROMPT_CONTENT = load_system_prompt()

# README content for About tab
README_CONTENT = """# Basic STT Transcript Cleanup Tool (Version 3)

A foundational speech-to-text transcript remediation tool that provides purpose-agnostic text cleanup instructions. This is the **daily workhorse** for cleaning up raw speech-to-text transcripts, which typically contain filler words, missing punctuation, and mistranscriptions.

## Purpose & Philosophy

This tool implements **Version 3** of the Basic Speech-to-Text Cleanup prompt - a carefully crafted system prompt that provides sufficiently deterministic guidance without overstepping into actual content editing. The challenge in developing this prompt was ensuring it cleans up technical artifacts of speech-to-text conversion while preserving the authentic voice and intent of the original speaker.

## Foundational Design

This basic cleanup prompt serves as a **foundation layer** that can be combined with specialized text transformation prompts:

- **Standalone Use**: Perfect for general transcript cleanup
- **Modular Design**: Can be concatenated with purpose-specific prompts from extensive libraries
- **Purpose-Agnostic**: Works across all content types and domains
- **Extensible**: Hundreds of specialized transformation prompts can be layered on top
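
As a rough illustration of the modular design, concatenation might look like the sketch below. The helper name `build_layered_prompt` and the example prompt strings are hypothetical and are not part of this tool; the sketch only assumes that prompts are plain text joined with a blank line between them.

```python
# Hypothetical helper: layers purpose-specific prompts on top of a base cleanup prompt.
SEPARATOR = chr(10) * 2  # two newline characters, i.e. a blank line between sections


def build_layered_prompt(base_prompt, *extra_prompts):
    # Strip each part and skip empty ones so no stray blank sections appear.
    parts = [p.strip() for p in (base_prompt, *extra_prompts) if p and p.strip()]
    return SEPARATOR.join(parts)


# Example: a base cleanup prompt followed by a hypothetical email-formatting prompt.
layered = build_layered_prompt(
    "Clean up the transcript.",
    "Then format the result as a short email.",
)
print(layered)
```

The combined text would then be sent as the system message in place of the base prompt alone.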

## Features

- **AI-Powered Cleanup**: Uses OpenAI's GPT models with a refined system prompt
- **BYOK (Bring Your Own Key)**: Secure - uses your own OpenAI API key
- **Copy to Clipboard**: Easy copying of cleaned text
- **Re-run Capability**: Quickly re-process the same text
- **System Prompt Viewer**: Transparent - see exactly how the AI processes your text
- **Deterministic Processing**: Consistent, predictable cleanup results

## How to Use

1. **Enter API Key**: Provide your OpenAI API key (required for processing)
2. **Paste Transcript**: Add your raw speech-to-text transcript
3. **Process**: Click "Clean Up Transcript" to apply remediation
4. **Copy Results**: Use the cleaned output or re-run if needed

## What It Does

The tool applies these **foundational improvements** to your transcripts:

### Core Remediations
- **Removes filler words** (like "um")
- **Adds punctuation, sentence structure, and paragraph spacing**
- **Fixes obvious STT hallucinations and mistranscriptions** (e.g., "McDonuts" → "McDonald's")
- **Removes repetitive or run-on thoughts** that would not be helpful to readers
- **Follows inferred instructions** to omit certain clauses (e.g., "wait .. scratch that from the note")

### What It Preserves
- **All important content** and meaning
- **Original speaker's voice** and intent
- **Factual accuracy** and details
- **Natural flow** of conversation

## Design Principles

1. **Light Touch Editing**: Minimal intervention while maximizing clarity
2. **Content Preservation**: Never removes or alters important information
3. **Deterministic Guidance**: Consistent, predictable results
4. **Purpose Agnostic**: Works across all content domains
5. **Modular Foundation**: Ready for specialized prompt layering

## Extended Ecosystem

This basic cleanup prompt is part of a larger ecosystem:
- **Hundreds of specialized prompts** available in shared libraries
- **Domain-specific transformations** for various use cases
- **Concatenation-ready design** for complex workflows
- **Shared on Hugging Face** and other platforms

## System Prompt

The tool uses a carefully crafted system prompt (Version 3, September 2025) that balances cleanup effectiveness with content preservation. View the complete prompt using the "Show System Prompt" feature in the interface.
"""

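def cleanup_transcript(text, api_key):
    """Clean up an STT transcript using the OpenAI API.

    Returns the cleaned text, or a readable message if an input is
    missing or the API call fails.
    """
    # Guard against empty or missing inputs before making a network call
    if not text or not text.strip():
        return "Please provide text to clean up."

    if not api_key or not api_key.strip():
        return "Please provide your OpenAI API key."

    try:
        # The key is used only for this request; stray whitespace is trimmed
        client = OpenAI(api_key=api_key.strip())

        response = client.chat.completions.create(
            model="gpt-4o-mini",  # Using cost-effective model
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT_CONTENT},
                {"role": "user", "content": text}
            ],
            temperature=0.3,
            max_tokens=4000
        )

        return response.choices[0].message.content

    except Exception as e:
        # Show the error in the output box instead of crashing the UI
        return f"Error: {e}"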

def copy_to_clipboard_js():
    """Return a JavaScript snippet that copies text to the clipboard.

    Currently unused: the copy button below passes its own inline handler.
    """
    return """
    function copyToClipboard(text) {
        navigator.clipboard.writeText(text).then(function() {
            // You could add a toast notification here
        });
    }
    """

# Create the Gradio interface
with gr.Blocks(
    title="STT Transcript Cleanup Tool",
    theme=gr.themes.Soft(),
    css="""
    .main-header {
        text-align: center;
        margin-bottom: 2rem;
    }
    .attribution {
        text-align: center;
        margin-top: 2rem;
        padding: 1rem;
        background-color: #f8f9fa;
        border-radius: 8px;
    }
    """
) as demo:
    
    gr.HTML("""
    <div class="main-header">
        <h1>🎤 STT Transcript Cleanup Tool</h1>
        <p>Clean up speech-to-text transcripts by removing filler words, adding punctuation, and improving readability.</p>
        <p><strong>Note:</strong> This tool requires your own OpenAI API key (BYOK - Bring Your Own Key)</p>
    </div>
    """)
    
    with gr.Tabs():
        with gr.TabItem("Cleanup Tool"):
            with gr.Row():
                with gr.Column():
                    api_key_input = gr.Textbox(
                        label="OpenAI API Key",
                        placeholder="sk-...",
                        type="password",
                        info="Your API key is not stored and is only used for this session"
                    )
                    
                    input_text = gr.Textbox(
                        label="Raw STT Transcript",
                        placeholder="Paste your speech-to-text transcript here...",
                        lines=10,
                        max_lines=20
                    )
                    
                    with gr.Row():
                        cleanup_btn = gr.Button("Clean Up Transcript", variant="primary")
                        clear_btn = gr.Button("Clear", variant="secondary")
                
                with gr.Column():
                    output_text = gr.Textbox(
                        label="Cleaned Transcript",
                        lines=10,
                        max_lines=20,
                        interactive=False
                    )
                    
                    with gr.Row():
                        copy_btn = gr.Button("Copy to Clipboard", variant="secondary")
                        rerun_btn = gr.Button("Run Again", variant="secondary")
        
        with gr.TabItem("System Prompt"):
            gr.Markdown("## Current System Prompt")
            gr.Markdown("This is the prompt used to instruct the AI on how to clean up your transcripts:")
            gr.Code(SYSTEM_PROMPT_CONTENT, language="markdown", label="system-prompt.md")
        
        with gr.TabItem("About"):
            gr.Markdown(README_CONTENT)
    
    # Event handlers
    cleanup_btn.click(
        fn=cleanup_transcript,
        inputs=[input_text, api_key_input],
        outputs=output_text
    )
    
    rerun_btn.click(
        fn=cleanup_transcript,
        inputs=[input_text, api_key_input],
        outputs=output_text
    )
    
    clear_btn.click(
        fn=lambda: ("", ""),
        outputs=[input_text, output_text]
    )
    
    # Copy to clipboard functionality
    copy_btn.click(
        fn=None,
        inputs=output_text,
        outputs=None,
        js="(text) => navigator.clipboard.writeText(text)"
    )
    
    # Attribution
    gr.HTML("""
    <div class="attribution">
        <p><strong>Created by:</strong> <a href="https://danielrosehill.com" target="_blank">Daniel Rosehill</a></p>
        <p>This tool helps clean up speech-to-text transcripts by removing filler words, adding proper punctuation, 
        and improving overall readability while preserving the original meaning and important details.</p>
    </div>
    """)

if __name__ == "__main__":
    demo.launch()