datasets:
  - ShandaAI/Hive
language:
  - en
license: apache-2.0
pipeline_tag: audio-to-audio
tags:
  - audio
  - sound-separation
  - flowsep

FlowSep-hive

Model Description

FlowSep-hive is a data-efficient, query-based universal sound separation model trained on the Hive dataset. Because Hive is high-quality and semantically consistent, the model reaches separation accuracy and perceptual quality competitive with state-of-the-art models such as SAM-Audio while using only a small fraction (~0.2%) of the training data volume.

This model was introduced in the paper: A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation.

Model Details

  • Model Type: Query-Based Universal Sound Separation
  • Language(s): English (for text queries)
  • License: Apache 2.0
  • Trained on: ShandaAI/Hive (2,442 hours of raw audio, 19.6M mixtures)

Uses

The model is intended for universal sound separation tasks, allowing users to extract specific sounds from complex audio mixtures using multimodal prompts (e.g., text descriptions or audio queries).

Usage

Run inference with the scripts provided in the official GitHub repository.

1) Install dependencies

pip install torch torchaudio librosa pyyaml pytorch-lightning huggingface_hub

2) FlowSep inference

Clone the repository and use the infer_flowsep.py script, which automatically downloads the configuration and checkpoints:

python infer_flowsep.py \
  --audio_file /path/to/mixture.wav \
  --text "acoustic guitar" \
  --output_file /path/to/flowsep_output.wav
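
To extract several sources from the same mixture, the command above can be run once per text query. The following is a minimal sketch, not part of the official repository: it assumes infer_flowsep.py is in the current working directory and accepts only the three flags shown above. The helper names (build_flowsep_command, slugify, separate_queries), the output naming scheme, and the second query are hypothetical and chosen for illustration.

```python
import shlex
import subprocess
import sys
from pathlib import Path


def build_flowsep_command(mixture, query, output, script="infer_flowsep.py"):
    """Build the argument list for one infer_flowsep.py call.

    Only the flags shown in the usage example are used:
    --audio_file, --text and --output_file.
    """
    return [
        sys.executable, str(script),
        "--audio_file", str(mixture),
        "--text", query,
        "--output_file", str(output),
    ]


def slugify(query):
    """Turn a text query into a file-name-friendly token (hypothetical scheme)."""
    return "_".join(query.lower().split())


def separate_queries(mixture, queries, out_dir, script="infer_flowsep.py", dry_run=False):
    """Run the script once per query; return the commands that were built.

    With dry_run=True nothing is executed, which is useful for checking
    the generated commands before starting a long batch.
    """
    out_dir = Path(out_dir)
    commands = []
    for query in queries:
        output = out_dir / f"{Path(mixture).stem}_{slugify(query)}.wav"
        cmd = build_flowsep_command(mixture, query, output, script)
        commands.append(cmd)
        if not dry_run:
            out_dir.mkdir(parents=True, exist_ok=True)
            # check=True raises CalledProcessError if the script exits with an error
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    planned = separate_queries(
        "/path/to/mixture.wav",
        ["acoustic guitar", "dog barking"],  # second query is illustrative
        "/path/to/outputs",
        dry_run=True,
    )
    for cmd in planned:
        print(shlex.join(cmd))
```

Setting dry_run=False executes each command in turn. Every call launches a separate Python process, so the model is loaded again for each query; this keeps the sketch independent of the script's internals at the cost of speed.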

Citation

If you find this model or the Hive dataset useful, please cite:

@article{li2026semantically,
  title={A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation},
  author={Li, Kai and Cheng, Jintao and Zeng, Chang and Yan, Zijun and Wang, Helin and Su, Zixiong and Zheng, Bo and Hu, Xiaolin},
  journal={arXiv preprint arXiv:2601.22599},
  year={2026}
}