Space: yoyolicoris/diffvox

yoyolicoris committed on May 10

Commit 15fe46b (1 parent: eebf84c)

Add detailed documentation for Random Vocal Effects Generator

Files changed (1)

app.py +29 -6

@@ -12,6 +12,32 @@ from functools import partial
 from modules.utils import chain_functions, vec2statedict, get_chunks
 from modules.fx import clip_delay_eq_Q
 SLIDER_MAX = 3
 SLIDER_MIN = -3
 NUMBER_OF_PCS = 10
@@ -116,14 +142,11 @@ def get_important_pcs(n=10, **kwargs):
 with gr.Blocks() as demo:
-    gr.Markdown(
-        """
-        # Hadamard Transform
-        This is a demo of the Hadamard transform.
-        """
-    )
     with gr.Row():
         with gr.Column():
             audio_input = gr.Audio(type="numpy", sources="upload", label="Input Audio")
             with gr.Row():
                 random_button = gr.Button(

 from modules.utils import chain_functions, vec2statedict, get_chunks
 from modules.fx import clip_delay_eq_Q
+space_md = """
+# Random Vocal Effects Generator
+This is a demo of the paper [DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions](https://arxiv.org/abs/2504.14735), accepted at DAFx 2025.
+In this demo, you can upload a raw vocal audio file and apply random effects to make it sound better!
+The effects consist of a series of EQ, compressor, delay, and reverb.
+The generator is a PCA model derived from 365 vocal effects presets fitted with the same effects chain.
+This interface allows you to control the first 10 principal components (PCs) of the generator, randomise them, and render the audio.
+For the rest of the PCs, you can choose to randomise them or set them to zero.
+To give you some idea, we empirically found that the first PC controls the amount of reverb and the second PC controls the amount of brightness.
+Note that adding these PCs together does not necessarily mean that their effects are additive in the final audio.
+We found that the effects of the less important PCs are sometimes more perceptible.
+Try to play around with the sliders and buttons and see what you can come up with!
+Currently only a portion of the PCs is tweakable, but in the future we will add more controls and visualisation tools.
+For example:
+- Exposing all the PCs
+- Directly controlling the parameters of the effects
+- Visualising the PCA space
+- Visualising the frequency responses/dynamic curves of the effects
+- Exporting the effects settings as JSON files
+"""
 SLIDER_MAX = 3
 SLIDER_MIN = -3
 NUMBER_OF_PCS = 10
 with gr.Blocks() as demo:
     with gr.Row():
         with gr.Column():
+            gr.Markdown(
+                space_md,
+            )
             audio_input = gr.Audio(type="numpy", sources="upload", label="Input Audio")
             with gr.Row():
                 random_button = gr.Button(
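
Note on the generator described in `space_md`: the text says presets are drawn from a PCA model, with the first 10 PCs on sliders and the remaining PCs either randomised or set to zero. The following is a minimal sketch of that idea, not the code in `app.py`. The helper names (`fit_pca`, `random_weights`, `weights_to_params`), the random stand-in preset matrix, its width of 32 parameters, and the reading of slider values as standard deviations along each PC are all assumptions made for illustration.

```python
import numpy as np

SLIDER_MIN, SLIDER_MAX = -3, 3   # same constants as in app.py
NUMBER_OF_PCS = 10


def fit_pca(presets):
    """Fit PCA on a (n_presets, n_params) matrix via SVD (hypothetical helper)."""
    mean = presets.mean(axis=0)
    _, s, vt = np.linalg.svd(presets - mean, full_matrices=False)
    std = s / np.sqrt(len(presets) - 1)  # standard deviation along each PC
    return mean, vt, std


def random_weights(n_total, rng, randomise_rest=True):
    """Draw PC weights: the first NUMBER_OF_PCS are clipped to the slider range."""
    z = np.zeros(n_total)
    z[:NUMBER_OF_PCS] = np.clip(
        rng.standard_normal(NUMBER_OF_PCS), SLIDER_MIN, SLIDER_MAX
    )
    if randomise_rest:
        # Remaining PCs are randomised; otherwise they stay at zero.
        z[NUMBER_OF_PCS:] = rng.standard_normal(n_total - NUMBER_OF_PCS)
    return z


def weights_to_params(z, mean, components, std):
    """Map PC weights (assumed to be in units of std) back to a parameter vector."""
    return mean + (z * std) @ components


rng = np.random.default_rng(0)
presets = rng.standard_normal((365, 32))  # stand-in for the 365 fitted presets
mean, components, std = fit_pca(presets)
z = random_weights(len(std), rng, randomise_rest=False)
params = weights_to_params(z, mean, components, std)
```

In the real app the resulting vector would presumably be converted into effect parameters (the diff imports `vec2statedict` from `modules.utils`), but that step is not shown in this commit and is left out here.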