🎶 Released mule-torch: an unofficial PyTorch port of MULE (SF-NFNet-F0), SiriusXM/Pandora's music-audio embedding model (McCallum et al., ISMIR 2022).
No retraining: I re-implemented the architecture in pure PyTorch and transferred the original TensorFlow weights, then checked it layer by layer against the genuine TF pipeline.
✅ End-to-end clip-embedding cosine 0.9999999 vs the original
✅ ONNX backbone parity < 1e-6
✅ 62.35M params (paper: ~62.4M)
✅ Batched, GPU-native, ONNX-exportable; none of which the original Analysis pipeline does
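For anyone wanting to run a similar parity check on their own port, here is a minimal sketch of the two kinds of metric quoted above: cosine similarity between embeddings and a maximum absolute difference. It is not taken from mule-torch. The function names and the sample vectors are hypothetical, and the post does not say which metric sits behind the "< 1e-6" figure, so max-abs difference is an assumption. Inputs are plain lists of floats, e.g. from `tensor.flatten().tolist()`.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical stand-ins for a reference embedding and a ported one.
ref = [0.12, -0.40, 0.88, 0.05]
port = [0.12, -0.40, 0.88, 0.05000001]

print("cosine:", cosine(ref, port))
print("max abs diff:", max_abs_diff(ref, port))
```

Cosine ignores overall scale, so pairing it with an absolute-difference check catches a port that points in the right direction but is off by a constant factor.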
The fun bug: parity was perfect through every conv, but the block output was anti-correlated (cos = −1). Cause: the learnable skip-init gains couldn't be mapped by layer name (Keras scrambles the order), so they had to be recovered from the graph.
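A rough sketch of how this kind of bug can be localized: walk the recorded activations in forward order and report the first stage whose cosine against the reference drops. Everything here is illustrative. The stage names, the `first_divergence` helper, and the sign-flipped gain are hypothetical toy choices that reproduce a cos = −1 symptom; they are not the actual mule-torch values or code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def first_divergence(ref_acts, port_acts, min_cos=0.999):
    """Return (stage, cosine) for the first stage below min_cos, else None.

    Both arguments map stage name -> flat list of floats. Stages are
    visited in the insertion order of ref_acts (dicts keep that order).
    """
    for name, ref in ref_acts.items():
        c = cosine(ref, port_acts[name])
        if c < min_cos:
            return name, c
    return None


# Toy data: the conv output matches exactly, but the port applies a gain
# of the wrong sign (hypothetical values, chosen only to give cos = -1).
branch = [0.3, -1.2, 0.7]
ref_gain, port_gain = 0.5, -0.5
ref_acts = {
    "conv_out": branch,
    "gain_scaled": [ref_gain * v for v in branch],
}
port_acts = {
    "conv_out": branch,
    "gain_scaled": [port_gain * v for v in branch],
}

result = first_divergence(ref_acts, port_acts)
print(result)
```

Because the conv stage passes and only the gain-scaled stage fails, the search points at the scalar gain rather than the convolution weights, which matches the symptom described in the post.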
⚠️ Unofficial community port: not affiliated with or endorsed by the original authors. All credit to them; please cite the paper. Weights inherit CC-BY-NC-4.0.
🎧 Help us evaluate AI-generated music across cultures
We are running a new online survey to collect data for a follow-up study on AI-generated music and sonic seasoning: the phenomenon where sound can influence the perception of taste.
In the previous study we introduced Tasty MusicGen, a model designed to generate music associated with taste descriptors, and showed that it can effectively produce music that evokes specific taste-related sensations.
With this new survey, we aim to expand the participant pool to create a more inclusive and cross-cultural evaluation, helping us understand how these musical cues are perceived across different linguistic and cultural backgrounds.
Participants will listen to short AI-generated music clips and evaluate the sensations they evoke.
Your participation will help us better understand how generative music models interact with human perception across cultures. Thanks to everyone who participates or shares!