matteospanio (Matteo Spanio)

liked a model 9 days ago

yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF

Text Generation • 12B • Updated 7 days ago • 496k • 2.36k

upvoted a changelog 17 days ago

Hugging Face Changelog

Publish models from CI without HF_TOKEN

17 days ago

• 107

reacted to their post with 🚀 19 days ago

Post

7052

🎶 Released mule-torch — an unofficial PyTorch port of MULE (SF-NFNet-F0), SiriusXM/Pandora's music-audio embedding model (McCallum et al., ISMIR 2022).

No retraining: I re-implemented the architecture in pure PyTorch and transferred the original TensorFlow weights, then checked it layer by layer against the genuine TF pipeline.

✅ End-to-end clip-embedding cosine 0.9999999 vs the original
✅ ONNX backbone parity < 1e-6
✅ 62.35M params (paper: ~62.4M)
✅ Batched, GPU-native, ONNX-exportable — none of which the original Analysis pipeline does
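
The two parity numbers above (cosine similarity and maximum absolute difference) can be sketched in a few lines. This is only an illustration in plain Python with toy vectors: the helper names and the sample values are assumptions, not code or data from mule-torch, and a real check would compare the actual TensorFlow and PyTorch outputs.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))

# Toy stand-ins for a reference embedding and a ported embedding (made-up values).
reference = [0.12, -0.50, 0.33, 0.91]
ported = [v + 1e-7 for v in reference]  # tiny numerical drift

cos = cosine_similarity(reference, ported)
diff = max_abs_diff(reference, ported)
print(f"cosine={cos:.9f}  max_abs_diff={diff:.2e}")
```

Cosine similarity ignores overall scale, so pairing it with an absolute-difference bound catches errors that a cosine check alone would miss.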

pip install mule-torch

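import torch
from mule_torch import MuleModel
waveform = torch.randn(2, 16000 * 10)  # placeholder noise: 2 clips of 10 s at 16 kHz, shape (B, T)
emb = MuleModel.from_pretrained()(waveform)   # (B, T)@16kHz -> (B, 1728)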

🤗 Weights: matteospanio/mule
💻 Code: https://github.com/matteospanio/mule-torch
📦 PyPI: https://pypi.org/project/mule-torch/

The fun bug: parity was perfect through every conv but the block output was anti-correlated (cos = −1). Cause: the learnable skip-init gains couldn't be mapped by layer name (Keras scrambles the order) — they had to be recovered from the graph.
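
A toy illustration of how this kind of symptom can arise. It assumes a residual branch whose output is multiplied by a learnable scalar gain; the block names, gain values, and activations below are invented for the example and are not taken from MULE or mule-torch. If a gain from the wrong block is applied and it happens to have the opposite sign, the branch activations still match exactly while the scaled output points the opposite way.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def scale(vec, gain):
    """Apply a scalar gain to every element of a vector."""
    return [gain * v for v in vec]

# Toy residual-branch activations, assumed identical in both frameworks.
branch = [0.4, -1.2, 0.7, 2.0]

# Hypothetical learned gains, keyed by made-up block names.
gains_by_block = {"block_a": -0.8, "block_b": 0.5}

reference = scale(branch, gains_by_block["block_a"])   # correct pairing
mismatched = scale(branch, gains_by_block["block_b"])  # gain taken from the wrong block

conv_cos = cosine_similarity(branch, branch)          # parity before the gain
block_cos = cosine_similarity(reference, mismatched)  # parity after the gain
print(f"before gain: {conv_cos:+.6f}  after gain: {block_cos:+.6f}")
```

Because the gain is a single scalar, a wrong value with the right sign would leave the cosine at +1 and only show up in a magnitude check; a sign flip is what drives it to −1.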

⚠️ Unofficial, community port — not affiliated with or endorsed by the original authors. All credit to them; please cite the paper. Weights inherit CC-BY-NC-4.0.

liked a model 19 days ago

matteospanio/mule

Feature Extraction • 66.6M • Updated 19 days ago • 117 • 3

posted an update 19 days ago


updated a model 20 days ago

matteospanio/mule

Feature Extraction • 66.6M • Updated 19 days ago • 117 • 3

published a model 20 days ago

matteospanio/mule

Feature Extraction • 66.6M • Updated 19 days ago • 117 • 3

upvoted a paper about 1 month ago

Stable Audio 3

Paper • 2605.17991 • Published May 18 • 20

liked a Space about 1 month ago

Stable Audio 3

🎵

114

Text-to-audio with SA3 Medium / Small Music / Small SFX.

liked a model about 1 month ago

liujiafeng/Khala-MusicGeneration-v1.0

Updated May 3 • 24

authored a paper 2 months ago

BMdataset: A Musicologically Curated LilyPond Dataset

Paper • 2604.10628 • Published Apr 12 • 2

upvoted a paper 2 months ago

Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music

Paper • 2604.10905 • Published Apr 13 • 29

liked a model 2 months ago

nvidia/audio-flamingo-next-hf

Audio-Text-to-Text • 8B • Updated May 13 • 8.52k • 56

upvoted a paper 2 months ago

BMdataset: A Musicologically Curated LilyPond Dataset

Paper • 2604.10628 • Published Apr 12 • 2

submitted a paper to Daily Papers 2 months ago

BMdataset: A Musicologically Curated LilyPond Dataset

Paper • 2604.10628 • Published Apr 12 • 2

updated a model 2 months ago

csc-unipd/lilybert

Fill-Mask • 0.1B • Updated Apr 14 • 3 • 1

liked a model 3 months ago

csc-unipd/lilybert

Fill-Mask • 0.1B • Updated Apr 14 • 3 • 1

published a model 3 months ago

csc-unipd/lilybert

Fill-Mask • 0.1B • Updated Apr 14 • 3 • 1

liked a dataset 3 months ago

projectlosangeles/Discover-MIDI-Dataset

Updated Dec 28, 2025 • 108 • 68

posted an update 4 months ago

Post

156

🎧 Help us evaluate AI-generated music across cultures

We are running a new online survey to collect data for a follow-up study on AI-generated music and sonic seasoning: the phenomenon where sound can influence the perception of taste.

This study builds on our previous work:
📄 Paper: A Multimodal Symphony: Integrating Taste and Sound through Generative AI (2503.02823)
🤗 Model release: csc-unipd/tasty-musicgen-small

In the previous study we introduced Tasty MusicGen, a model designed to generate music associated with taste descriptors, and showed that it can effectively produce music that evokes specific taste-related sensations.

With this new survey, we aim to expand the participant pool to create a more inclusive and cross-cultural evaluation, helping us understand how these musical cues are perceived across different linguistic and cultural backgrounds.

Participants will listen to short AI-generated music clips and evaluate the sensations they evoke.

⏱️ Takes about 10 minutes

👉 Participate here:
https://matteospanio.github.io/tasty-music-survey/

Your participation will help us better understand how generative music models interact with human perception across cultures. Thanks to everyone who participates or shares!
