SIREN-FX

Neural Audio Effects Processing with State Space Models

SIREN-FX is part of the SIREN Audio Suite - a family of neural audio processing models designed for professional music production workflows.

Model Description

SIREN-FX learns to model and apply complex audio effects using an S4 (Structured State Space Sequence Model) architecture. Unlike neural audio effect models that are limited to simple nonlinearities, SIREN-FX can capture time-dependent effects because its receptive field is, in theory, unbounded. This makes it well suited to:

  • Reverb - Room acoustics and spatial effects
  • Delay - Time-based echo and repetition
  • Chorus/Flanger/Phaser - Modulation effects
  • Complex effect chains - Multi-effect processing
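
For reference, the standard S4 formulation can be written out as below. This comes from the general state space model literature, not from SIREN-FX specifics; the symbols (input u, state x, output y, matrices A, B, C, D and their discretized forms) are the usual ones and are assumptions here, since this card does not define them.

```latex
\begin{aligned}
x'(t) &= A\,x(t) + B\,u(t), & y(t) &= C\,x(t) + D\,u(t) \\
x_k &= \bar{A}\,x_{k-1} + \bar{B}\,u_k, & y_k &= C\,x_k + D\,u_k \\
y_k &= \sum_{j=0}^{k} C\,\bar{A}^{\,j}\,\bar{B}\,u_{k-j} + D\,u_k
\end{aligned}
```

The last line unrolls the recurrence (with zero initial state) into a convolution. The kernel has a term for every lag j, so the output can depend on arbitrarily old input, which is the sense in which the receptive field is unbounded. How much of that history matters in practice depends on how quickly the powers of the discretized state matrix decay.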

Architecture

| Component | Details |
|---|---|
| Base Architecture | S4 (Structured State Space Sequence Model) |
| Model Dimension | 128 |
| Number of Layers | 8 |
| State Size | 64 |
| Parameters | ~1.5M |
| Sample Rate | 44.1 kHz |

The S4 architecture provides:

  • Unbounded receptive field - Can, in principle, model effects with arbitrarily long time dependencies
  • Linear time complexity - Inference cost grows linearly with the number of samples
  • Stable training - Proper initialization for long sequences
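
To make the receptive-field and linear-time points concrete, here is a minimal sketch of a discrete diagonal state-space recurrence in plain Python. It is not the SIREN-FX implementation: the function name `ssm_scan`, the diagonal parameterization, and the toy coefficients are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def ssm_scan(u: Sequence[float],
             a: Sequence[float],
             b: Sequence[float],
             c: Sequence[float]) -> List[float]:
    """Run a diagonal discrete state-space recurrence over a signal.

    x_k = a * x_{k-1} + b * u_k   (elementwise over the state)
    y_k = sum(c * x_k)
    """
    state = [0.0] * len(a)
    out = []
    for sample in u:  # one constant-cost update per sample, so O(len(u)) overall
        state = [a_i * x_i + b_i * sample for a_i, x_i, b_i in zip(a, state, b)]
        out.append(sum(c_i * x_i for c_i, x_i in zip(c, state)))
    return out


# Toy one-dimensional state (hypothetical values): |a| < 1 gives a decaying tail.
impulse = [1.0] + [0.0] * 7
response = ssm_scan(impulse, a=[0.5], b=[1.0], c=[1.0])
print(response)
```

The impulse response halves at every step but never becomes exactly zero, so every past sample still contributes something to the current output. The single loop over samples is where the linear cost comes from.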

The SIREN Family

The SIREN Audio Suite includes the following models:

| Model | Purpose |
|---|---|
| SIREN-FX | Neural audio effects (this model) |
| SIREN-FIX | Audio restoration and repair |
| SIREN-MASTER | Audio enhancement and mastering |
| SIREN-STEER | Steerable audio transformations |
| SIREN-SEPARATE | Source separation |
| SIREN-TRANSCRIBE | Music analysis (key, tempo, beats) |

Usage

```python
import torch
import torchaudio

# Load the checkpoint on CPU and extract the model weights
checkpoint = torch.load('siren_fx.pt', map_location='cpu')
model_state = checkpoint['model_state_dict']

# Model expects mono audio at 44.1 kHz
# Input shape: (batch, 1, samples)
# Output shape: (batch, 1, samples)
```
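
The input requirements above (mono, 44.1 kHz, shape `(batch, 1, samples)`) can be met with a small preprocessing helper. The sketch below is an assumption about how one might prepare audio, not code shipped with the model; `prepare_input` and `TARGET_SR` are hypothetical names, and downmixing by channel averaging is one common choice among several.

```python
import torch

TARGET_SR = 44_100  # sample rate stated in this model card


def prepare_input(waveform: torch.Tensor, sample_rate: int) -> torch.Tensor:
    """Convert a (channels, samples) waveform to a (1, 1, samples) mono batch."""
    if waveform.dim() != 2:
        raise ValueError("expected a (channels, samples) tensor")
    # Downmix to mono by averaging channels; keepdim keeps the channel axis.
    mono = waveform.mean(dim=0, keepdim=True)
    if sample_rate != TARGET_SR:
        # Imported lazily so torchaudio is only required when resampling.
        import torchaudio.functional as AF
        mono = AF.resample(mono, orig_freq=sample_rate, new_freq=TARGET_SR)
    # Add the batch axis: (1, samples) -> (1, 1, samples).
    return mono.unsqueeze(0)


# Example with a synthetic one-second stereo signal instead of a file on disk.
stereo = torch.randn(2, TARGET_SR)
batch = prepare_input(stereo, TARGET_SR)
print(batch.shape)  # torch.Size([1, 1, 44100])
```

In practice, `waveform` and `sample_rate` would typically come from `torchaudio.load(path)`, which returns a `(channels, samples)` tensor together with the file's sample rate.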

Training Details

  • Training Data: Large-scale audio effects dataset
  • Training Duration: 200 epochs
  • Hardware: NVIDIA B200 GPUs
  • Final Validation Loss: 1.0482

Intended Use

SIREN-FX is designed for:

  • Music production and post-production
  • Audio effect modeling and emulation
  • Creative sound design
  • Research in neural audio processing

Limitations

  • Optimized for a 44.1 kHz sample rate
  • Best results with mono input
  • Can only be expected to reproduce effects represented in the training data

License

Apache 2.0

Citation

If you use SIREN-FX in your research, please cite:

@software{siren_fx_2026,
  title={SIREN-FX: Neural Audio Effects with State Space Models},
  author={SIREN Team},
  year={2026},
  url={https://huggingface.co/hilarl/siren-fx}
}

Contact

For questions and feedback, please open an issue on the model repository.
