SIREN-STEER

Steerable Neural Audio Effects with Temporal Convolutional Networks

SIREN-STEER is part of the SIREN Audio Suite - a family of neural audio processing models designed for professional music production workflows.

Model Description

SIREN-STEER provides controllable audio effect transformations using a Temporal Convolutional Network (TCN) architecture. The model learns to apply effects in a steerable manner, allowing fine-grained control over effect intensity and character.

Key capabilities:

Steerable intensity - Control effect strength from 0% to 100%
Style transfer - Transfer sonic characteristics between tracks
Effect modeling - Learn audio effect transformations from paired training examples
Real-time capable - Efficient architecture for low-latency processing

Architecture

Component	Details
Base Architecture	TCN (Temporal Convolutional Network)
Receptive Field	Exponentially growing with depth
Parameters	~33K (lightweight)
Sample Rate	44.1 kHz
Latency	< 5 ms
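
As a rough illustration of "exponentially growing with depth", the sketch below computes the receptive field of a stack of dilated convolutions and converts it to milliseconds at 44.1 kHz. The kernel size and layer count are assumptions chosen for the example, not the published SIREN-STEER configuration, and the helper names are hypothetical.

```python
from typing import Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Samples of input context seen by one output sample of a stacked dilated conv."""
    # Each layer adds (kernel_size - 1) * dilation samples of context.
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def samples_to_ms(num_samples: int, sample_rate: int = 44_100) -> float:
    """Duration of num_samples at the given sample rate, in milliseconds."""
    return 1000.0 * num_samples / sample_rate


# Assumed example: kernel size 3, eight layers with dilations 1, 2, 4, ..., 128.
dilations = [2 ** i for i in range(8)]
rf = receptive_field(3, dilations)
print(rf, "samples,", round(samples_to_ms(rf), 2), "ms of past context")
```

Note that the receptive field measures how much past audio the network sees. In a causal model it does not add lookahead, so it is a separate quantity from the processing latency listed in the table.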

The TCN architecture provides:

Causal convolutions - No future information leakage
Dilated convolutions - Large receptive field with few parameters
Residual connections - Stable gradient flow
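
The three properties above can be sketched in a few lines of PyTorch. This is a minimal illustration of a causal, dilated, residual TCN, not the SIREN-STEER implementation: the class names, channel width, kernel size, block count and activation are all assumptions, and intensity conditioning is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CausalResidualBlock(nn.Module):
    """Hypothetical TCN block: causal dilated conv, activation, 1x1 conv, skip."""

    def __init__(self, channels: int, kernel_size: int, dilation: int):
        super().__init__()
        # Pad on the left only, so output[t] depends on input[<= t].
        self.left_pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.act = nn.PReLU()
        self.mix = nn.Conv1d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, samples); left padding keeps the length unchanged.
        y = F.pad(x, (self.left_pad, 0))
        y = self.mix(self.act(self.conv(y)))
        return x + y  # residual connection


class TinyTCN(nn.Module):
    """Hypothetical stack of blocks with dilations 1, 2, 4, ..."""

    def __init__(self, channels: int = 16, kernel_size: int = 3, num_blocks: int = 6):
        super().__init__()
        self.inp = nn.Conv1d(1, channels, kernel_size=1)
        self.blocks = nn.Sequential(
            *[CausalResidualBlock(channels, kernel_size, 2 ** i) for i in range(num_blocks)]
        )
        self.out = nn.Conv1d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, samples) -> (batch, samples)
        h = self.inp(x.unsqueeze(1))
        return self.out(self.blocks(h)).squeeze(1)


model = TinyTCN().eval()
x = torch.randn(1, 1024)
x_future = x.clone()
x_future[:, 512:] += 1.0  # change only the second half of the signal

with torch.no_grad():
    y, y_future = model(x), model(x_future)

# Causality check: outputs before sample 512 are unaffected by the later change.
print(torch.allclose(y[:, :512], y_future[:, :512], atol=1e-6))
```

Padding only on the left is what makes the convolution causal. Symmetric padding would keep the same output length but let each output sample see future input.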

Supported Effects

SIREN-STEER was trained on the IDMT-SMT-AUDIO-EFFECTS dataset covering:

Distortion / Overdrive
Reverb
Delay / Echo
Chorus
Flanger
Phaser
Tremolo
Vibrato
Compression
EQ

The SIREN Family

The SIREN Audio Suite comprises the following models:

Model	Purpose
SIREN-FX	Neural audio effects
SIREN-FIX	Audio restoration and repair
SIREN-MASTER	Audio enhancement and mastering
SIREN-STEER	Steerable audio transformations (this model)
SIREN-SEPARATE	Source separation
SIREN-TRANSCRIBE	Music analysis (key, tempo, beats)

Usage

import torch
import torchaudio

# Load model
checkpoint = torch.load('siren_steer.pt', map_location='cpu')

# Model expects mono audio at 44.1 kHz
# Input shape: (batch, samples)
# Conditioning: effect intensity 0.0-1.0
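
The snippet above stops at loading the checkpoint. As a hedged sketch of the input preparation it describes, the helpers below downmix audio to mono, resample to 44.1 kHz when needed, and map a percentage to the 0.0-1.0 intensity range. The function names are hypothetical, and how the prepared tensors are passed to the model is not specified in this card, so that step is left out.

```python
import torch

TARGET_SR = 44_100  # sample rate stated in this model card


def prepare_input(waveform: torch.Tensor, sample_rate: int) -> torch.Tensor:
    """Turn (channels, samples) audio into a (1, samples) mono batch at 44.1 kHz.

    `waveform` and `sample_rate` are what `torchaudio.load(path)` returns.
    """
    if waveform.dim() == 1:
        waveform = waveform.unsqueeze(0)
    mono = waveform.mean(dim=0, keepdim=True)  # average channels -> (1, samples)
    if sample_rate != TARGET_SR:
        # Imported lazily so the helper only needs torchaudio when resampling.
        import torchaudio.functional as AF
        mono = AF.resample(mono, sample_rate, TARGET_SR)
    return mono


def intensity_from_percent(percent: float) -> torch.Tensor:
    """Map 0-100 % to a conditioning value in [0.0, 1.0], clamping out-of-range input."""
    clamped = min(max(percent, 0.0), 100.0)
    return torch.tensor([clamped / 100.0])


# Example with synthetic stereo audio (one second at 44.1 kHz).
stereo = torch.randn(2, TARGET_SR)
batch = prepare_input(stereo, TARGET_SR)
intensity = intensity_from_percent(25.0)
print(batch.shape, intensity)
```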

Training Details

Training Data: IDMT-SMT-AUDIO-EFFECTS (23,352 effect pairs)
Training Duration: 200 epochs
Hardware: NVIDIA B200 GPU
Final Validation Loss: 1.81

Intended Use

SIREN-STEER is designed for:

Controllable audio effect application
Effect intensity interpolation
Style transfer between audio tracks
Real-time audio processing plugins
Research in steerable audio transformations
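
For effect intensity interpolation, one simple approach is to ramp the conditioning value linearly across successive audio blocks. The sketch below assumes the intensity is supplied once per block; that scheduling and the function name are assumptions for illustration, not documented behavior of the model.

```python
from typing import List


def intensity_ramp(start: float, end: float, num_blocks: int) -> List[float]:
    """Linearly interpolated intensity values, one per block, clamped to [0, 1]."""
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")

    def clamp(value: float) -> float:
        return min(max(value, 0.0), 1.0)

    if num_blocks == 1:
        return [clamp(start)]
    step = (end - start) / (num_blocks - 1)
    return [clamp(start + i * step) for i in range(num_blocks)]


# Fade the effect in from dry (0.0) to full strength (1.0) over five blocks.
ramp = intensity_ramp(0.0, 1.0, 5)
print(ramp)
```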

Lightweight Design

With only ~33K parameters, SIREN-STEER is well suited to:

Edge deployment - Runs on CPUs and mobile devices
Plugin integration - Minimal memory footprint
Real-time processing - Sub-5 ms latency
Batch processing - Handle many tracks simultaneously
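
To put these figures in perspective, the following back-of-the-envelope sketch estimates weight storage for roughly 33,000 parameters and the number of samples that fit in a 5 ms window at 44.1 kHz. The parameter count is the approximate value quoted above, and the dtype options are assumptions, since this card does not state how the weights are stored.

```python
BYTES_PER_DTYPE = {"float32": 4, "float16": 2, "int8": 1}


def weight_bytes(num_params: int, dtype: str = "float32") -> int:
    """Storage needed for the weights alone (no activations or buffers)."""
    return num_params * BYTES_PER_DTYPE[dtype]


def samples_in(duration_ms: float, sample_rate: int = 44_100) -> int:
    """Whole samples that fit in duration_ms at the given sample rate."""
    return int(sample_rate * duration_ms / 1000)


approx_params = 33_000  # "~33K" from this card, treated as an approximation
for dtype in BYTES_PER_DTYPE:
    print(dtype, round(weight_bytes(approx_params, dtype) / 1024, 1), "KiB")

print(samples_in(5), "samples in a 5 ms window")
```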

Limitations

Optimized for a 44.1 kHz sample rate
Best results with mono input
Effect quality depends on how well the effect and input material are covered by the training data

License

Apache 2.0

Citation

If you use SIREN-STEER in your research, please cite:

@software{siren_steer_2026,
  title={SIREN-STEER: Steerable Neural Audio Effects},
  author={SIREN Team},
  year={2026},
  url={https://huggingface.co/hilarl/siren-steer}
}

Contact

For questions and feedback, please open an issue on the model repository.
