Spaces: thepatch/stable-melodyflow

thecollabagepatch committed on Jul 12, 2025

Commit 694aa5b

1 Parent(s): 59f0768

decapitalizing

Files changed (1)

app.py +82 -81

@@ -339,86 +339,87 @@ def calculate_optimal_bars(bpm):
 # ========== GRADIO INTERFACE ==========
-with gr.Blocks(title="🎵 Stable Audio Loop Generator") as iface:
-    gr.Markdown("# 🎵 Stable Audio Loop Generator")
-    gr.Markdown("**Generate synchronized drum and instrument loops with stable-audio-open-small, then transform with MelodyFlow!**")
     # ========== MODELS & PROJECT INFO ==========
-    with gr.Accordion("📚 About the Models & Project", open=False):
         with gr.Accordion("🚀 stable-audio-open-small", open=False):
             gr.Markdown("""
             **stable-audio-open-small** is an incredibly fast model from the zachs and friends at Stability AI. It's capable of generating 12 seconds of audio in under a second, which gives rise to a lot of very interesting kinds of UX.
-            **Note about generation speed in this zerogpu space:** You'll notice generation times are a little slower here than if you were to use the model on a local GPU. That's just a result of the way zerogpu spaces work I think... let me know if there's a way to keep the model loaded in a zerogpu space!
-            **Links:**
-            - 🤗 [Model on HuggingFace](https://huggingface.co/stabilityai/stable-audio-open-small)
-            - 🐳 [Docker API Implementation](https://github.com/betweentwomidnights/stable-audio-api)
             """)
-        with gr.Accordion("🎛️ MelodyFlow", open=False):
             gr.Markdown("""
-            **MelodyFlow** is a model by Meta that can use regularized latent inversion to do transformations of input audio.
-            It's not officially a part of the audiocraft repo yet, but we use it as a docker container in the backend for gary4live.
-            **Links:**
-            - 🤗 [MelodyFlow Space](https://huggingface.co/spaces/Facebook/MelodyFlow)
-            - 🐳 [Standalone API Implementation](https://github.com/betweentwomidnights/melodyflow)
             """)
-        with gr.Accordion("🎹 gary4live Project", open=False):
             gr.Markdown("""
-            **gary4live** is a free/open source project that uses these models, along with MusicGen, inside of Ableton Live. I run a backend myself so that we can all experiment with it, but you can also spin the backend up locally using docker-compose with our repo.
-            **Project Links:**
-            - 🎵 [Main Project Repo](https://github.com/betweentwomidnights/gary4live)
-            - 🖥️ [Backend Implementation](https://github.com/betweentwomidnights/gary-backend-combined)
-            **Installers:**
-            - 💿 [PC & Mac Installers on Gumroad](https://thepatch.gumroad.com/l/gary4live)
             """)
-    with gr.Accordion("How This Works", open=False):
         gr.Markdown("""
-        **Workflow:**
-        1. **Set global BPM and bars** - affects both drum and instrument generation
-        2. **Generate drum loop** - creates BPM-aware percussion
-        3. **Generate instrument loop** - creates melodic/harmonic content
-        4. **Combine loops** - layer them together with repetitions (up to 30s)
-        5. **Transform** - use MelodyFlow to stylistically transform the combined result
-        **Features:**
-        - BPM-aware generation ensures perfect sync between loops
-        - Negative prompting separates drums from instruments cleanly
-        - Smart bar calculation optimizes loop length for the BPM
-        - MelodyFlow integration for advanced style transfer
         """)
     # ========== GLOBAL CONTROLS ==========
-    gr.Markdown("## 🎛️ Global Settings")
     with gr.Row():
         global_bpm = gr.Dropdown(
-            label="Global BPM",
             choices=[90, 100, 110, 120, 130, 140, 150],
             value=120,
-            info="BPM applied to both drum and instrument generation"
         )
         global_bars = gr.Dropdown(
-            label="Loop Length (Bars)",
-            choices=[1, 2, 4, 8],
             value=4,
-            info="Number of bars for each loop"
         )
         base_prompt = gr.Textbox(
-            label="Base Prompt",
-            value="techno",
-            placeholder="e.g., 'techno', 'jazz', 'ambient', 'hip-hop'",
-            info="Style applied to both loops"
         )
     # Auto-suggest optimal bars based on BPM
@@ -429,64 +430,64 @@ with gr.Blocks(title="🎵 Stable Audio Loop Generator") as iface:
     global_bpm.change(update_suggested_bars, inputs=[global_bpm], outputs=[global_bars])
     # ========== LOOP GENERATION ==========
-    gr.Markdown("## 🥁 Step 1: Generate Individual Loops")
     with gr.Row():
         with gr.Column():
-            gr.Markdown("### 🥁 Drum Loop")
-            generate_drums_btn = gr.Button("Generate Drums", variant="primary", size="lg")
-            drums_audio = gr.Audio(label="Drum Loop", type="filepath")
-            drums_status = gr.Textbox(label="Drums Status", value="Ready to generate")
         with gr.Column():
-            gr.Markdown("### 🎹 Instrument Loop")
-            generate_instruments_btn = gr.Button("Generate Instruments", variant="secondary", size="lg")
-            instruments_audio = gr.Audio(label="Instrument Loop", type="filepath")
-            instruments_status = gr.Textbox(label="Instruments Status", value="Ready to generate")
     # Seed controls
     with gr.Row():
-        drums_seed = gr.Number(label="Drums Seed", value=-1, info="-1 for random")
-        instruments_seed = gr.Number(label="Instruments Seed", value=-1, info="-1 for random")
     # ========== COMBINATION ==========
-    gr.Markdown("## 🎛️ Step 2: Combine Loops")
     with gr.Row():
         num_repeats = gr.Slider(
-            label="Number of Repetitions",
             minimum=1,
             maximum=5,
             step=1,
             value=2,
-            info="How many times to repeat each loop (creates longer audio)"
         )
-        combine_btn = gr.Button("🎛️ Combine Loops", variant="primary", size="lg")
-    combined_audio = gr.Audio(label="Combined Loops", type="filepath")
-    combine_status = gr.Textbox(label="Combine Status", value="Generate loops first")
     # ========== MELODYFLOW TRANSFORMATION ==========
-    gr.Markdown("## 🎨 Step 3: Transform with MelodyFlow")
     with gr.Row():
         with gr.Column():
             transform_prompt = gr.Textbox(
-                label="Transformation Prompt",
                 value="aggressive industrial techno with distorted sounds",
-                placeholder="Describe the style transformation",
                 lines=2
             )
         with gr.Column():
             transform_solver = gr.Dropdown(
-                label="Solver",
                 choices=["euler", "midpoint"],
                 value="euler",
                 info="EULER: faster (25 steps), MIDPOINT: slower (64 steps)"
             )
             transform_flowstep = gr.Slider(
-                label="Transform Intensity",
                 minimum=0.0,
                 maximum=0.15,
                 step=0.01,
@@ -494,9 +495,9 @@ with gr.Blocks(title="🎵 Stable Audio Loop Generator") as iface:
                 info="Lower = more dramatic transformation"
             )
-    transform_btn = gr.Button("🎨 Transform Audio", variant="secondary", size="lg")
-    transformed_audio = gr.Audio(label="Transformed Audio", type="filepath")
-    transform_status = gr.Textbox(label="Transform Status", value="Combine audio first")
     # ========== EVENT HANDLERS ==========
@@ -528,19 +529,19 @@ with gr.Blocks(title="🎵 Stable Audio Loop Generator") as iface:
         outputs=[transformed_audio, transform_status]
     )
-    # ========== EXAMPLES ==========
-    gr.Markdown("## 🎯 Example Workflows")
-    examples = gr.Examples(
-        examples=[
-            ["techno", 128, 4, "aggressive industrial techno"],
-            ["jazz", 110, 2, "smooth lo-fi jazz with vinyl crackle"],
-            ["ambient", 90, 8, "ethereal ambient soundscape"],
-            ["hip-hop", 100, 4, "classic boom bap hip-hop"],
-            ["drum and bass", 140, 4, "liquid drum and bass"],
-        ],
-        inputs=[base_prompt, global_bpm, global_bars, transform_prompt],
-    )
 if __name__ == "__main__":
     iface.launch()

 # ========== GRADIO INTERFACE ==========
+with gr.Blocks(title="stable-melodyflow") as iface:
+    gr.Markdown("# stable-melodyflow (aka jerry and terry)")
+    gr.Markdown("**generate synchronized drum and instrument loops with stable-audio-open-small (jerry), then transform with melodyflow (terry)!**")
     # ========== MODELS & PROJECT INFO ==========
+    with gr.Accordion(" some info about these models", open=False):
         with gr.Accordion("🚀 stable-audio-open-small", open=False):
             gr.Markdown("""
             **stable-audio-open-small** is an incredibly fast model from the zachs and friends at Stability AI. It's capable of generating 12 seconds of audio in under a second, which gives rise to a lot of very interesting kinds of UX.
+            **note about generation speed in this zerogpu space:** you'll notice generation times are a little slower here than if you were to use the model on a local gpu. that's just a result of the way zerogpu spaces work i think... let me know if there's a way to keep the model loaded in a zerogpu space!
+            **links:**
+            - 🤗 [model on HuggingFace](https://huggingface.co/stabilityai/stable-audio-open-small)
+                        there's a docker container at this repo that can be spun up as a standalone api specifically for stable-audio-open-small:
+            -  [stable-audio-api](https://github.com/betweentwomidnights/stable-audio-api)
             """)
+        with gr.Accordion("🎛️ melodyflow", open=False):
             gr.Markdown("""
+            **MelodyFlow** is a model by meta that can use regularized latent inversion to do transformations of input audio.
+            It's not officially a part of the audiocraft repo yet, but we use it as a docker container in the backend for gary4live. i really enjoy turning my guitar riffs into orchestra.
+            **links:**
+            - 🤗 [Official MelodyFlow Space](https://huggingface.co/spaces/Facebook/MelodyFlow)
+            -  [our melodyflow api](https://github.com/betweentwomidnights/melodyflow)
             """)
+        with gr.Accordion("gary4live Project", open=False):
             gr.Markdown("""
+            **gary4live** is a free/open source project that uses these models, along with musicGen, inside of ableton live to iterate on your projects with you. i run a backend myself so that we can all experiment with it, but you can also spin the backend up locally using docker-compose with our repo.
+            **project Links:**
+            -  [frontend repo](https://github.com/betweentwomidnights/gary4live)
+            -  [backend repo](https://github.com/betweentwomidnights/gary-backend-combined)
+            **installers:**
+            -  [p.c. & mac installers on gumroad](https://thepatch.gumroad.com/l/gary4live)
             """)
+    with gr.Accordion("how this works", open=False):
         gr.Markdown("""
+        **workflow:**
+        1. **set global bpm and bars** - affects both drum and instrument generation
+        2. **generate drum loop** - creates BPM-aware percussion with negative prompting to attempt to get rid of instruments
+        3. **generate instrument loop** - creates melodic/harmonic content with negative prompting to attempt to get rid of drums
+        4. **combine loops** - layer them together with repetitions (up to 30s)
+        5. **transform** - use melodyflow to stylistically transform the combined result
+        **features:**
+        - bpm-aware generation ensures perfect sync between loops (most the time lol)
+        - negative prompting separates drums from instruments (most the time)
+        - smart bar calculation optimizes loop length for the BPM
         """)
     # ========== GLOBAL CONTROLS ==========
+    gr.Markdown("## 🎛️ global settings")
     with gr.Row():
         global_bpm = gr.Dropdown(
+            label="global bpm",
             choices=[90, 100, 110, 120, 130, 140, 150],
             value=120,
+            info="bpm applied to both drum and instrument generation. keep this the same for the combine step to work correctly"
         )
         global_bars = gr.Dropdown(
+            label="loop length (bars)",
+            choices=[1, 2, 4],
             value=4,
+            info="number of bars for each loop. keep this the same for both pieces of audio"
         )
         base_prompt = gr.Textbox(
+            label="base prompt",
+            value="lofi hiphop with pianos",
+            placeholder="e.g., 'aggressive techno', 'lofi hiphop', 'chillwave', 'liquid drum and bass'",
+            info="prompt applied to either loop. make it more drum/instrument specific for best results"
         )
     # Auto-suggest optimal bars based on BPM
     global_bpm.change(update_suggested_bars, inputs=[global_bpm], outputs=[global_bars])
     # ========== LOOP GENERATION ==========
+    gr.Markdown("## step one: generate individual loops")
     with gr.Row():
         with gr.Column():
+            gr.Markdown("### drums")
+            generate_drums_btn = gr.Button("generate drums", variant="primary", size="lg")
+            drums_audio = gr.Audio(label="drum loop", type="filepath")
+            drums_status = gr.Textbox(label="status", value="ready to generate")
         with gr.Column():
+            gr.Markdown("### instruments")
+            generate_instruments_btn = gr.Button("generate instruments", variant="secondary", size="lg")
+            instruments_audio = gr.Audio(label="instrument loop", type="filepath")
+            instruments_status = gr.Textbox(label="status", value="Ready to generate")
     # Seed controls
     with gr.Row():
+        drums_seed = gr.Number(label="drums seed", value=-1, info="-1 for random")
+        instruments_seed = gr.Number(label="instruments seed", value=-1, info="-1 for random")
     # ========== COMBINATION ==========
+    gr.Markdown("## step two: combine loops")
     with gr.Row():
         num_repeats = gr.Slider(
+            label="number of repetitions",
             minimum=1,
             maximum=5,
             step=1,
             value=2,
+            info="how many times to repeat each loop (creates longer audio). aim for 30 seconds max"
         )
+        combine_btn = gr.Button("combine", variant="primary", size="lg")
+    combined_audio = gr.Audio(label="combined loops", type="filepath")
+    combine_status = gr.Textbox(label="status", value="Generate loops first")
     # ========== MELODYFLOW TRANSFORMATION ==========
+    gr.Markdown("## step three: transform with melodyflow")
     with gr.Row():
         with gr.Column():
             transform_prompt = gr.Textbox(
+                label="transformation prompt",
                 value="aggressive industrial techno with distorted sounds",
+                placeholder="describe the style of transformation",
                 lines=2
             )
         with gr.Column():
             transform_solver = gr.Dropdown(
+                label="solver",
                 choices=["euler", "midpoint"],
                 value="euler",
                 info="EULER: faster (25 steps), MIDPOINT: slower (64 steps)"
             )
             transform_flowstep = gr.Slider(
+                label="transform intensity",
                 minimum=0.0,
                 maximum=0.15,
                 step=0.01,
                 info="Lower = more dramatic transformation"
             )
+    transform_btn = gr.Button("transform audio", variant="secondary", size="lg")
+    transformed_audio = gr.Audio(label="transformed audio", type="filepath")
+    transform_status = gr.Textbox(label="status", value="Combine audio first")
     # ========== EVENT HANDLERS ==========
         outputs=[transformed_audio, transform_status]
     )
+    # # ========== EXAMPLES ==========
+    # gr.Markdown("## 🎯 Example Workflows")
+    # examples = gr.Examples(
+    #     examples=[
+    #         ["techno", 128, 4, "aggressive industrial techno"],
+    #         ["jazz", 110, 2, "smooth lo-fi jazz with vinyl crackle"],
+    #         ["ambient", 90, 8, "ethereal ambient soundscape"],
+    #         ["hip-hop", 100, 4, "classic boom bap hip-hop"],
+    #         ["drum and bass", 140, 4, "liquid drum and bass"],
+    #     ],
+    #     inputs=[base_prompt, global_bpm, global_bars, transform_prompt],
+    # )
 if __name__ == "__main__":
     iface.launch()
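
The UI text in this diff ties loop length to the global bpm and bar count and asks users to "aim for 30 seconds max" when combining. The arithmetic behind that can be sketched as follows. This is a minimal illustration, not code from app.py: it assumes 4/4 time, and the names `loop_seconds` and `max_repeats` are hypothetical helpers introduced here.

```python
# Sketch only: assumes 4/4 time (4 beats per bar); not taken from app.py.
BEATS_PER_BAR = 4          # assumption: 4/4 time
MAX_COMBINED_SECONDS = 30  # the "aim for 30 seconds max" hint in the UI


def loop_seconds(bpm: float, bars: int) -> float:
    """Length of one loop in seconds: bars * beats per bar * seconds per beat."""
    return bars * BEATS_PER_BAR * 60.0 / bpm


def max_repeats(bpm: float, bars: int, limit: float = MAX_COMBINED_SECONDS) -> int:
    """Largest whole number of repetitions that stays within `limit` seconds."""
    return int(limit // loop_seconds(bpm, bars))


if __name__ == "__main__":
    # Walk a few of the dropdown values, including the removed 8-bar option.
    for bpm in (90, 120, 150):
        for bars in (1, 2, 4, 8):
            print(f"{bpm} bpm, {bars} bars: {loop_seconds(bpm, bars):.2f} s, "
                  f"up to {max_repeats(bpm, bars)} repeats")
```

Under these assumptions a 4-bar loop at 120 bpm lasts 8 s, so three repetitions (24 s) fit under 30 s, while the slider's maximum of five would give 40 s. An 8-bar loop exceeds 12 s at every bpm in the dropdown (12.8 s at 150 bpm), which is longer than the 12 seconds of audio the model description mentions. That may be related to this commit dropping 8 from the bar choices, though the diff does not state the reason.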