Spaces: FreedomIntelligence/S2S-Arena

KurtDu committed on Nov 21, 2024

Commit 373bffb · verified · 1 Parent(s): 09c824e

Update templates/index.html

Files changed (1)

templates/index.html +66 -6

@@ -57,16 +57,76 @@
     <div class="container py-5">
         <h3>Welcome to the Speech-to-Speech Model Evaluation</h3>
-        <div id="evaluation-info" class="mb-4">
-            <p>
                 <strong>Welcome to the Speech-to-Speech (S2S) Model Evaluation!</strong>
                 <br><br>
-                In this evaluation, you will assess the performance of various S2S models, such as
                 <strong>ChatGPT-4o</strong>, <strong>FunAudioLLM</strong>, <strong>SpeechGPT</strong>, and
-                <strong>Mini-Omni</strong>. The goal is to evaluate how well these models handle various speech tasks across different domains.
                 <br><br>
-                You will listen to audio inputs and evaluate the models' outputs based on their ability to follow instructions.
-                Get ready to explore the cutting-edge of speech technology!
             </p>
         </div>

     <div class="container py-5">
         <h3>Welcome to the Speech-to-Speech Model Evaluation</h3>
+        <div id="evaluation-info" class="mb-5">
+            <p class="text-start">
                 <strong>Welcome to the Speech-to-Speech (S2S) Model Evaluation!</strong>
                 <br><br>
+                In this evaluation, you will assess the performance of 4 S2S models:
                 <strong>ChatGPT-4o</strong>, <strong>FunAudioLLM</strong>, <strong>SpeechGPT</strong>, and
+                <strong>Mini-Omni</strong>.
+                The goal is to evaluate how well these models handle various speech tasks across different domains.
                 <br><br>
+                Once you select a specific domain and task (e.g., <em>Educational Tutoring</em> and <em>Rhythm Control</em>),
+                you will proceed to the evaluation stage. In each round, you will be presented with an audio input.
+                For example:
+                <br><br>
+                <!-- Left-aligned Audio Sample and Audio Control -->
+                <span style="vertical-align: middle; line-height: 1.2; display: inline-block;"><strong>Audio Sample:</strong></span>
+                <audio controls style="vertical-align: middle;">
+                    <source src="/static/audio/sample/input_audio.wav" type="audio/wav">
+                </audio>
+                <br><br>
+                The corresponding text is:
+                <em>"Say the following sentence at my speed first, then say it again very slowly:
+                    'Artificial intelligence is changing the world in many ways.'" </em>
+                <small>(Note: the audio plays at 1.5x the normal speed.)</small>
+                <br><br>
+                The responses of different S2S models will be provided, and your task is to choose which response best follows
+                the instructions. For example <small>(Note: During the evaluation process, you will be provided with responses from only the two models that have the most comparative significance.)</small>:
+                <br><br>
+                <!-- ChatGPT-4o Output -->
+                <span><strong>ChatGPT-4o:</strong></span>
+                <audio controls style="vertical-align: middle;">
+                    <source src="/static/audio/sample/4o_audio.wav" type="audio/wav">
+                </audio>
+                <p class="text-start" style="margin-left: 20px;">
+                    <strong>Performance:</strong> Speech: Partially followed the instruction on speed. Semantics: Accurately followed the instruction, with no semantic deviation or missing information.
+                </p>
+                <!-- FunAudioLLM Output -->
+                <span><strong>FunAudioLLM:</strong></span>
+                <audio controls style="vertical-align: middle;">
+                    <source src="/static/audio/sample/FunAudio_audio.wav" type="audio/wav">
+                </audio>
+                <p class="text-start" style="margin-left: 20px;">
+                    <strong>Performance:</strong> Speech: Partially followed the instruction on speed. Semantics: Accurately followed the instruction, with no semantic deviation or missing information.
+                </p>
+                <!-- SpeechGPT Output -->
+                <span><strong>SpeechGPT:</strong></span>
+                <audio controls style="vertical-align: middle;">
+                    <source src="/static/audio/sample/SpeechGPT.wav" type="audio/wav">
+                </audio>
+                <p class="text-start" style="margin-left: 20px;">
+                    <strong>Performance:</strong> Speech: Did not follow the instruction on speed. Semantics: Partially followed the instruction, with minor semantic deviation and missing information.
+                </p>
+                <!-- Mini-Omni Output -->
+                <span><strong>Mini-Omni:</strong></span>
+                <audio controls style="vertical-align: middle;">
+                    <source src="/static/audio/sample/mini-omni.wav" type="audio/wav">
+                </audio>
+                <p class="text-start" style="margin-left: 20px;">
+                    <strong>Performance:</strong> Speech: Did not follow the instruction on speed. Semantics: Did not follow the instruction, with significant semantic deviation and missing information.
+                </p>
+                <p class="text-start">
+                    After making your choice, you'll proceed to the next round.
+                </p>
+                <strong>Please enter your username and start the evaluation!</strong>
             </p>
         </div>