---
datasets:
- ShandaAI/Hive
language:
- en
license: apache-2.0
pipeline_tag: audio-to-audio
tags:
- audio
- sound-separation
- audiosep
---
# AudioSep-hive
**AudioSep-hive** is a data-efficient, query-based universal sound separation model trained on the [Hive dataset](https://huggingface.co/datasets/ShandaAI/Hive). Because Hive is high-quality and semantically consistent, the model reaches separation accuracy and perceptual quality competitive with state-of-the-art models such as SAM-Audio while using only about 0.2% of their training data volume.
- **Paper:** [A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation](https://arxiv.org/abs/2601.22599)
- **Project Page:** https://shandaai.github.io/Hive
- **Code Repository:** https://github.com/ShandaAI/Hive
## Model Details
- **Model Type:** Query-Based Universal Sound Separation
- **Language(s):** English (for text queries)
- **License:** Apache 2.0
- **Trained on:** [ShandaAI/Hive](https://huggingface.co/datasets/ShandaAI/Hive) (2,442 hours of raw audio, 19.6M mixtures)
## Uses
The model is intended for universal sound separation tasks, allowing users to extract specific sounds from complex audio mixtures using multimodal prompts (e.g., text descriptions or audio queries).
## Usage
Run the model with the inference scripts provided in the [official GitHub repository](https://github.com/ShandaAI/Hive).
### 1. Install dependencies
```bash
git clone https://github.com/ShandaAI/Hive
cd Hive
pip install torch torchaudio librosa pyyaml pytorch-lightning huggingface_hub gradio
```
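Before running inference, it can help to confirm that every package from the install command is importable. The following is a minimal sketch, not part of the official repository: the helper name `missing_packages` is hypothetical, and the import names (for example `yaml` for `pyyaml`) are the usual ones for these packages rather than something this README states.

```python
import importlib.util

# Maps each pip package from the install command to its import name.
# Assumption: these are the conventional import names; the README does not list them.
REQUIRED = {
    "torch": "torch",
    "torchaudio": "torchaudio",
    "librosa": "librosa",
    "pyyaml": "yaml",
    "pytorch-lightning": "pytorch_lightning",
    "huggingface_hub": "huggingface_hub",
    "gradio": "gradio",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

The check only looks modules up without importing them, so it runs quickly even in an environment where `torch` is installed.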
### 2. Run inference
The following command automatically downloads the configuration and checkpoints from this repository:
```bash
python infer_audiosep.py \
--audio_file /path/to/mixture.wav \
--text "acoustic guitar" \
--output_file /path/to/audiosep_output.wav
```
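To extract several sounds from the same mixture, the command above can be wrapped in a small Python driver that calls the script once per text query. This is a sketch under stated assumptions: it uses only the three flags shown above (`--audio_file`, `--text`, `--output_file`), it assumes it is run from the cloned `Hive` directory where `infer_audiosep.py` lives, and the helper names `build_command`, `output_path`, and `separate_all` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_command(audio_file, text, output_file, script="infer_audiosep.py"):
    """Build the argument list for one run of the inference script."""
    return [
        sys.executable, script,
        "--audio_file", str(audio_file),
        "--text", text,
        "--output_file", str(output_file),
    ]


def output_path(out_dir, text):
    """Derive an output file name from the query by replacing non-alphanumerics with '_'."""
    safe = "".join(c if c.isalnum() else "_" for c in text.strip().lower())
    return Path(out_dir) / f"{safe}.wav"


def separate_all(audio_file, queries, out_dir):
    """Run the script once per text query and return the output paths.

    Assumption: the current working directory is the cloned Hive repository.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    outputs = []
    for text in queries:
        target = output_path(out_dir, text)
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run(build_command(audio_file, text, target), check=True)
        outputs.append(target)
    return outputs
```

For example, `separate_all("/path/to/mixture.wav", ["acoustic guitar", "dog barking"], "outputs")` would write one file per query into `outputs/`; the second query is purely illustrative. Each call starts a fresh Python process, so the model is reloaded for every query, which keeps the sketch simple at the cost of speed.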
## Citation
```bibtex
@article{li2026semantically,
title={A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation},
author={Li, Kai and Cheng, Jintao and Zeng, Chang and Yan, Zijun and Wang, Helin and Su, Zixiong and Zheng, Bo and Hu, Xiaolin},
journal={arXiv preprint arXiv:2601.22599},
year={2026}
}
```