Spaces: LEMAS-Project / LEMAS-Edit

Approximetal committed on Jan 9

Commit 1cec52c (verified) · 1 parent: b68fd46

Update gradio_mix.py

Files changed (1)

gradio_mix.py +43 -0

@@ -1073,6 +1073,49 @@ def get_app():
                 output_audio = gr.Audio(label="Output Audio", type="numpy")
                 with gr.Accordion("Inference transcript", open=True):
                     inference_transcript = gr.Textbox(label="Inference transcript", lines=5, interactive=False, info="Inference was performed on this transcript.")
+                # Simple in-app README to guide users through the editing workflow.
+                # Use HTML so we can cap the height at 12em and enable vertical scrolling.
+                readme_help = gr.HTML(
+                    value=(
+                        '<div style="max-height: 12em; overflow-y: auto; white-space: pre-wrap;">'
+                        "<h4>README: How to Use This Tool</h4>"
+                        "<p><b>1. Load models</b><br>"
+                        "Click <b>&ldquo;Load Models&rdquo;</b> and wait for all models to finish loading. "
+                        "Note that <b>WhisperX</b> takes the longest to initialize, so please be patient.</p>"
+                        "<p><b>2. Upload input audio</b><br>"
+                        "Click <b>&ldquo;Input Audio&rdquo;</b> and upload the audio file you want to edit.</p>"
+                        "<p><b>3. Transcribe and correct text</b><br>"
+                        "Click <b>&ldquo;Transcribe&rdquo;</b> to perform speech recognition. If the transcription is inaccurate, "
+                        "edit the text in <b>&ldquo;Original transcript&rdquo;</b>, then click <b>&ldquo;ReAlign&rdquo;</b> to recompute "
+                        "word-level timestamps.</p>"
+                        "<p><b>4. (Optional) Denoise noisy audio</b><br>"
+                        "If noise in the input audio degrades recognition or synthesis quality, click "
+                        "<b>&ldquo;Denoise&rdquo;</b> to apply noise reduction. If you are not satisfied with the denoised result, "
+                        "click <b>&ldquo;Cancel Denoise&rdquo;</b> to restore the original audio, or switch to a different denoiser "
+                        "under <b>&ldquo;Select models&rdquo;</b> and reload.</p>"
+                        "<p><b>5. Select the edit span</b><br>"
+                        "Use <b>&ldquo;First word to edit&rdquo;</b> and <b>&ldquo;Last word to edit&rdquo;</b> to specify the region to modify, "
+                        "then click <b>&ldquo;Check edit words&rdquo;</b> to preview the selection. For finer control, you may also adjust "
+                        "<b>&ldquo;Edit from time&rdquo;</b> and <b>&ldquo;Edit to time&rdquo;</b>.</p>"
+                        "<p><b>6. Enter the new text</b><br>"
+                        "In the <b>&ldquo;Text&rdquo;</b> box, enter the text that should replace the selected segment.</p>"
+                        "<p><b>7. Run the edit</b><br>"
+                        "Click <b>&ldquo;Run&rdquo;</b> and wait for the model to generate the edited audio.</p>"
+                        "<p><b>8. Inspect the result</b><br>"
+                        "The edited waveform will appear in <b>&ldquo;Output Audio&rdquo;</b>, and the corresponding edited text will be "
+                        "shown under <b>&ldquo;Inference transcript&rdquo;</b>.</p>"
+                        "<p><b>9. Refine or change models</b><br>"
+                        "If the result is not satisfactory, try adjusting the <b>&ldquo;Generation Parameters&rdquo;</b> or selecting a "
+                        "different <b>&ldquo;Edit Model&rdquo;</b> under <b>&ldquo;Select models&rdquo;</b>, then run again.</p>"
+                        "<p><b>10. Feedback</b><br>"
+                        "For bug reports or feature requests, feel free to:<br>"
+                        "1) Open a GitHub issue<br>"
+                        "2) Post on the Hugging Face community page<br>"
+                        "3) Contact us via email at <code>approximetal@gmail.com</code></p>"
+                        "</div>"
+                    )
+                )
                 with gr.Group(visible=False) as long_tts_sentence_editor:
                     sentence_selector = gr.Dropdown(label="Sentence", value=None,
                                                     info="Select sentence you want to regenerate")
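
Note on the added block: the help text is assembled as one long run of adjacent string literals, which makes renumbering or inserting a step error-prone. A possible alternative, sketched below, keeps the steps as data and generates the same scrollable markup from them. This is not part of the commit. `STEPS` and `build_readme_html` are hypothetical names, the step bodies are abbreviated from the README text in the diff, and only the first three steps are shown.

```python
from html import escape

# Hypothetical step data, abbreviated from the README in the diff above.
# Bodies are trusted, hand-written HTML, so they are inserted unescaped.
STEPS = [
    ("Load models",
     "Click <b>Load Models</b> and wait for all models to finish loading."),
    ("Upload input audio",
     "Click <b>Input Audio</b> and upload the audio file you want to edit."),
    ("Transcribe and correct text",
     "Click <b>Transcribe</b>, fix the text if needed, then click <b>ReAlign</b>."),
]


def build_readme_html(steps, max_height_em=12):
    """Return a scrollable HTML block with one numbered paragraph per step."""
    parts = [
        f'<div style="max-height: {max_height_em}em; overflow-y: auto;">',
        "<h4>README: How to Use This Tool</h4>",
    ]
    for number, (title, body) in enumerate(steps, start=1):
        # Titles are plain text, so escape them; numbering is derived, not typed.
        parts.append(f"<p><b>{number}. {escape(title)}</b><br>{body}</p>")
    parts.append("</div>")
    return "".join(parts)


readme_html = build_readme_html(STEPS)
# In the app, this string would be passed as gr.HTML(value=readme_html).
```

With this shape, adding or reordering a step is a one-line change to `STEPS`, and the step numbers stay consistent without manual edits.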