datasets:
- ShandaAI/Hive
language:
- en
license: apache-2.0
pipeline_tag: audio-to-audio
tags:
- audio
- sound-separation
- flowsep
FlowSep-hive
Model Description
FlowSep-hive is a data-efficient, query-based universal sound separation model trained on the Hive dataset. Because Hive is high-quality and semantically consistent, the model achieves separation accuracy and perceptual quality competitive with state-of-the-art models such as SAM-Audio while using only a fraction (~0.2%) of the training data volume.
This model was introduced in the paper: A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation.
- Developed by: Shanda AI Research Tokyo
- Paper: Hugging Face Papers
- Code Repository: GitHub - ShandaAI/Hive
- Project Page: https://shandaai.github.io/Hive
Model Details
- Model Type: Query-Based Universal Sound Separation
- Language(s): English (for text queries)
- License: Apache 2.0
- Trained on: ShandaAI/Hive (2,442 hours of raw audio, 19.6M mixtures)
Uses
The model is intended for universal sound separation tasks, allowing users to extract specific sounds from complex audio mixtures using multimodal prompts (e.g., text descriptions or audio queries).
Usage
You can perform inference using the scripts provided in the official GitHub repository.
1) Install dependencies
pip install torch torchaudio librosa pyyaml pytorch-lightning huggingface_hub
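Before running inference, it can help to confirm that the packages installed above are importable. The following is a minimal sketch and is not part of the official repository: the helper name `missing_modules` is hypothetical, and the import names are assumed to be the usual ones for these distributions (`yaml` for pyyaml, `pytorch_lightning` for pytorch-lightning).

```python
import importlib.util

# Import names for the packages installed in step 1.
# Assumption: pyyaml is imported as `yaml`, pytorch-lightning as `pytorch_lightning`.
REQUIRED = [
    "torch",
    "torchaudio",
    "librosa",
    "yaml",
    "pytorch_lightning",
    "huggingface_hub",
]


def missing_modules(names):
    """Return the entries of `names` that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

The check only locates each top-level module without importing it, so it does not verify versions or GPU support.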
2) FlowSep inference
Clone the repository and use the infer_flowsep.py script, which automatically downloads the configuration and checkpoints:
python infer_flowsep.py \
    --audio_file /path/to/mixture.wav \
    --text "acoustic guitar" \
    --output_file /path/to/flowsep_output.wav
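To extract several sources from the same mixture, the command above can be repeated once per text query. The sketch below wraps it with `subprocess` and uses only the three flags shown in the command (`--audio_file`, `--text`, `--output_file`). The helper names `build_command` and `separate_all`, the output naming scheme, and the example query "dog barking" are illustrative assumptions, and the script is assumed to be reachable at the path given by `script` (for example, when running from the directory of the cloned repository that contains it).

```python
import subprocess
import sys
from pathlib import Path


def build_command(audio_file, text, output_file, script="infer_flowsep.py"):
    """Build the argument list for one infer_flowsep.py call."""
    return [
        sys.executable, str(script),
        "--audio_file", str(audio_file),
        "--text", text,
        "--output_file", str(output_file),
    ]


def separate_all(mixture, queries, out_dir, script="infer_flowsep.py", dry_run=False):
    """Run one separation per text query and return the commands that were built.

    With dry_run=True the commands are only constructed, not executed.
    """
    out_dir = Path(out_dir)
    commands = []
    for query in queries:
        # Hypothetical naming scheme: one output file per query.
        out_file = out_dir / (query.replace(" ", "_") + ".wav")
        cmd = build_command(mixture, query, out_file, script)
        commands.append(cmd)
        if not dry_run:
            out_dir.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)  # raises if the script exits with an error
    return commands
```

For example, `separate_all("/path/to/mixture.wav", ["acoustic guitar", "dog barking"], "outputs")` would invoke the script twice and write `outputs/acoustic_guitar.wav` and `outputs/dog_barking.wav`. Each call starts a new process, so the model is loaded once per query; this favors simplicity over speed.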
Citation
If you find this model or the Hive dataset useful, please cite:
@article{li2026semantically,
title={A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation},
author={Li, Kai and Cheng, Jintao and Zeng, Chang and Yan, Zijun and Wang, Helin and Su, Zixiong and Zheng, Bo and Hu, Xiaolin},
journal={arXiv preprint arXiv:2601.22599},
year={2026}
}