SIREN-SEPARATE

Two-Stage Source Separation with Query-Based Extraction

SIREN-SEPARATE is part of the SIREN Audio Suite, a family of neural audio processing models designed for professional music production workflows.

Model Description

SIREN-SEPARATE provides two-stage source separation. Stage 1 performs standard 4-stem separation. Stage 2 uses a fine-tuned query-conditioned model for granular extraction from the "Other" stem.

Key capabilities:

  • 4-stem separation - Vocals, Drums, Bass, Other (Stage 1)
  • Query-based extraction - Extract specific instruments from Other (Stage 2)
  • Granular separation - Guitar, piano, synth, strings, or any other instrument
  • High quality - Stage 2 is fine-tuned from a pretrained separator checkpoint
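
The two-stage flow described above can be sketched as follows. This is a minimal illustration, not the model's actual API: the `stage1` and `stage2` callables, the stem key names, and the `other_residual` output are all assumptions, and plain lists of floats stand in for audio tensors so the sketch runs without extra dependencies.

```python
from typing import Callable, Dict, List

Audio = List[float]  # placeholder for a waveform; real code would use tensors


def separate_two_stage(
    mixture: Audio,
    stage1: Callable[[Audio], Dict[str, Audio]],
    stage2: Callable[[Audio, Audio], Audio],
    query: Audio,
) -> Dict[str, Audio]:
    """Run 4-stem separation, then extract a queried source from 'other'."""
    stems = stage1(mixture)  # assumed keys: vocals, drums, bass, other
    other = stems["other"]
    extracted = stage2(other, query)
    # Whatever Stage 2 did not extract is kept as a residual stem.
    residual = [o - e for o, e in zip(other, extracted)]
    result = {name: stem for name, stem in stems.items() if name != "other"}
    result["extracted"] = extracted
    result["other_residual"] = residual
    return result


# Toy stand-ins so the sketch runs; they are NOT the real models.
def toy_stage1(mix: Audio) -> Dict[str, Audio]:
    quarter = [x / 4 for x in mix]
    return {"vocals": quarter, "drums": quarter, "bass": quarter, "other": quarter}


def toy_stage2(other: Audio, query: Audio) -> Audio:
    return [x / 2 for x in other]


out = separate_two_stage([4.0, 8.0], toy_stage1, toy_stage2, query=[0.0, 0.0])
```

Passing the two stages in as callables keeps the orchestration independent of how each model is loaded, which the card does not specify.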

Architecture

Component        Details
Stage 1          Pretrained 4-stem separator (9.2 dB SDR)
Stage 2          Query-conditioned separator (fine-tuned)
Checkpoint Size  616 MB
Sample Rate      44.1 kHz
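
For readers unfamiliar with the SDR figure quoted for Stage 1, the sketch below computes the basic signal-to-distortion ratio, 10 * log10(reference energy / error energy). The card does not say which SDR variant or evaluation protocol produced the 9.2 dB number, so the helper name `sdr_db` and this plain definition are assumptions for illustration only. Under this definition, 9.2 dB corresponds to reference energy about 8.3 times the error energy.

```python
import math
from typing import Sequence


def sdr_db(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Basic signal-to-distortion ratio in decibels (higher is better)."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if signal_energy == 0:
        raise ValueError("reference is silent; SDR is undefined")
    if error_energy == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


# An estimate that is 10% too quiet on every sample.
example_sdr = sdr_db([1.0, 0.0, -1.0, 0.0], [0.9, 0.0, -0.9, 0.0])
```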

The SIREN Family

Model             Purpose
SIREN-FX          Neural audio effects
SIREN-FIX         Audio restoration and repair
SIREN-MASTER      Audio enhancement and mastering
SIREN-STEER       Steerable audio transformations
SIREN-SEPARATE    Source separation (this model)
SIREN-TRANSCRIBE  Music analysis (key, tempo, beats)

Usage

import torch

# Load the Stage 2 (query-based separation) checkpoint onto the CPU
checkpoint = torch.load('siren_separate.pt', map_location='cpu')

# Stage 1: run a pretrained 4-stem separator to get Vocals, Drums, Bass, and Other
# Stage 2: use this model for granular extraction from the Other stem
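
The snippet above only loads the checkpoint, and the card does not document what the loaded object contains. A small helper like the one below may help inspect it before wiring up a model. It is a sketch under assumptions: `summarize_checkpoint` is a hypothetical name, the key names in the demo dictionary are invented, and the helper only relies on the loaded object being a (possibly nested) mapping whose tensor-like leaves expose a `shape` attribute.

```python
from typing import Any, Dict, Mapping


def summarize_checkpoint(obj: Any, prefix: str = "") -> Dict[str, str]:
    """Map each leaf of a nested checkpoint to a short type/shape description."""
    summary: Dict[str, str] = {}
    if isinstance(obj, Mapping):
        for key, value in obj.items():
            name = f"{prefix}.{key}" if prefix else str(key)
            summary.update(summarize_checkpoint(value, name))
    else:
        shape = getattr(obj, "shape", None)
        if shape is not None:
            # Tensor-like leaf: record its type name and shape.
            summary[prefix] = f"{type(obj).__name__}{tuple(shape)}"
        else:
            summary[prefix] = type(obj).__name__
    return summary


class FakeTensor:
    """Stand-in for a tensor so the demo runs without loading the real file."""
    shape = (2, 3)


# Invented structure for demonstration; the real checkpoint layout is unknown.
demo = summarize_checkpoint({"state_dict": {"w": FakeTensor()}, "epoch": 5})

# With the real file, one would call:
#   summarize_checkpoint(torch.load('siren_separate.pt', map_location='cpu'))
```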

Training Details

  • Training Data: Multi-track audio dataset
  • Hardware: NVIDIA B200 GPU
  • Fine-tuning: From pretrained separator checkpoint

Intended Use

  • Music source separation
  • Stem extraction for remixing
  • Instrument isolation
  • Karaoke (vocal removal)
  • Research in audio source separation
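
For the karaoke use case, one common approach is to sum every stem except the vocals into an instrumental track. The sketch below assumes Stage 1 output is available as a dictionary keyed by stem name, which the card does not confirm; `instrumental_mix` is a hypothetical helper, and lists of floats stand in for audio tensors.

```python
from typing import Dict, List


def instrumental_mix(stems: Dict[str, List[float]]) -> List[float]:
    """Sum all stems except 'vocals', sample by sample."""
    backing = [stem for name, stem in stems.items() if name != "vocals"]
    if not backing:
        raise ValueError("no non-vocal stems to mix")
    length = len(backing[0])
    if any(len(stem) != length for stem in backing):
        raise ValueError("all stems must have the same length")
    return [sum(samples) for samples in zip(*backing)]


karaoke = instrumental_mix({
    "vocals": [1.0, 1.0],
    "drums": [0.5, 0.0],
    "bass": [0.25, 0.25],
    "other": [0.0, 0.5],
})
```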

Limitations

  • Stage 1 limited to 4 stems (vocals, drums, bass, other)
  • Stage 2 quality depends on how closely the query audio resembles the target source
  • Optimized for a 44.1 kHz sample rate
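
Because the model is optimized for 44.1 kHz, input at other sample rates would likely need resampling first. The sketch below only computes the integer up/down factors for a rational resampler; `resample_ratio` is a hypothetical helper, and the card does not prescribe a resampling method. The resulting factors could be passed to a polyphase resampler such as `scipy.signal.resample_poly(x, up, down)`.

```python
from fractions import Fraction
from typing import Tuple

TARGET_SR = 44_100  # sample rate the card says the model is optimized for


def resample_ratio(source_sr: int, target_sr: int = TARGET_SR) -> Tuple[int, int]:
    """Return (up, down) in lowest terms so that source_sr * up / down == target_sr."""
    if source_sr <= 0 or target_sr <= 0:
        raise ValueError("sample rates must be positive")
    ratio = Fraction(target_sr, source_sr)
    return ratio.numerator, ratio.denominator


up, down = resample_ratio(48_000)  # 48 kHz input needs a 147/160 conversion
needs_resampling = (up, down) != (1, 1)
```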

License

Apache 2.0

Citation

@software{siren_separate_2026,
  title={SIREN-SEPARATE: Two-Stage Source Separation},
  author={SIREN Team},
  year={2026},
  url={https://huggingface.co/hilarl/siren-separate}
}