laion
/

Empathic-Insight-Voice-Small

Model card Files Files and versions

ChristophSchuhmann commited on May 21, 2025

Commit

67074fe

·

verified ·

1 Parent(s): 1fda629

Update README.md

Files changed (1) hide show

README.md +2 -2

README.md CHANGED Viewed

@@ -7,7 +7,7 @@ license: cc-by-4.0
 ## Example Video Analyses (Top 3 Emotions)
 <!-- This section will be populated by the HTML from Cell 0 -->
-{{<div style='display: flex; flex-wrap: wrap; justify-content: flex-start; gap: 15px;'>
             <div style='flex: 0 1 auto; margin-bottom: 15px; text-align: center; width: 480px; max-width: 480px;'>
                 <a href='https://www.youtube.com/watch?v=TsTVKCmqHhk' target='_blank' title='Watch video TsTVKCmqHhk'>
                     <img src='https://img.youtube.com/vi/TsTVKCmqHhk/hqdefault.jpg' alt='YouTube Thumbnail for TsTVKCmqHhk' style='width: 100%; height: auto; border: 1px solid #ccc; border-radius: 4px; display: block;'>
@@ -32,7 +32,7 @@ license: cc-by-4.0
                 </a>
                 <p style='font-size: 0.8em; margin-top: 5px; word-break: break-all;'>ID: dDrmjcUq8W4</p>
             </div>
-            </div>}}
 **Empathic-Insight-Voice-Small** is a suite of 40+ emotion and attribute regression models trained on the EMONET-VOICE benchmark dataset, which is derived from the large-scale, multilingual synthetic voice-acting dataset LAION'S GOT TALENT. Each model is designed to predict the intensity of a specific fine-grained emotion or attribute from speech audio. These models leverage embeddings from a fine-tuned Whisper model (mkrausio/EmoWhisper-AnS-Small-v0.1) followed by dedicated MLP regression heads for each dimension.

 ## Example Video Analyses (Top 3 Emotions)
 <!-- This section will be populated by the HTML from Cell 0 -->
+<div style='display: flex; flex-wrap: wrap; justify-content: flex-start; gap: 15px;'>
             <div style='flex: 0 1 auto; margin-bottom: 15px; text-align: center; width: 480px; max-width: 480px;'>
                 <a href='https://www.youtube.com/watch?v=TsTVKCmqHhk' target='_blank' title='Watch video TsTVKCmqHhk'>
                     <img src='https://img.youtube.com/vi/TsTVKCmqHhk/hqdefault.jpg' alt='YouTube Thumbnail for TsTVKCmqHhk' style='width: 100%; height: auto; border: 1px solid #ccc; border-radius: 4px; display: block;'>
                 </a>
                 <p style='font-size: 0.8em; margin-top: 5px; word-break: break-all;'>ID: dDrmjcUq8W4</p>
             </div>
+            </div>
 **Empathic-Insight-Voice-Small** is a suite of 40+ emotion and attribute regression models trained on the EMONET-VOICE benchmark dataset, which is derived from the large-scale, multilingual synthetic voice-acting dataset LAION'S GOT TALENT. Each model is designed to predict the intensity of a specific fine-grained emotion or attribute from speech audio. These models leverage embeddings from a fine-tuned Whisper model (mkrausio/EmoWhisper-AnS-Small-v0.1) followed by dedicated MLP regression heads for each dimension.