datasets:
  - ShandaAI/Hive
language:
  - en
license: apache-2.0
pipeline_tag: audio-to-audio
tags:
  - audio
  - sound-separation
  - audiosep

AudioSep-hive

AudioSep-hive is a data-efficient, query-based universal sound separation model trained on Hive, a high-quality, semantically consistent dataset. Its separation accuracy and perceptual quality are competitive with state-of-the-art models such as SAM-Audio, while it uses only a fraction (~0.2%) of the training data volume.

Paper: A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation
Project Page: https://shandaai.github.io/Hive
Code Repository: https://github.com/ShandaAI/Hive

Model Details

Model Type: Query-Based Universal Sound Separation
Language(s): English (for text queries)
License: Apache 2.0
Trained on: ShandaAI/Hive (2,442 hours of raw audio, 19.6M mixtures)

Uses

The model is intended for universal sound separation tasks, allowing users to extract specific sounds from complex audio mixtures using multimodal prompts (e.g., text descriptions or audio queries).

Usage

To run this model, use the inference scripts provided in the official GitHub repository.

1. Install dependencies

git clone https://github.com/ShandaAI/Hive
cd Hive
pip install torch torchaudio librosa pyyaml pytorch-lightning huggingface_hub gradio
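Before running inference, it can help to confirm that the packages from the command above are importable in the active environment. The following is a minimal sketch, not part of the official repository; the helper name `missing_modules` is hypothetical. It uses import names rather than pip names, which differ for two packages (`pyyaml` imports as `yaml`, `pytorch-lightning` as `pytorch_lightning`).

```python
import importlib.util

# Import names for the packages in the pip command above. Two differ from
# their pip names: pyyaml imports as `yaml`, pytorch-lightning as
# `pytorch_lightning`.
REQUIRED_MODULES = [
    "torch", "torchaudio", "librosa", "yaml",
    "pytorch_lightning", "huggingface_hub", "gradio",
]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

The check only looks the modules up without importing them, so it runs quickly even when heavy packages such as `torch` are installed.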

2. Run inference

The following command automatically downloads the configuration and checkpoints from this model repository:

python infer_audiosep.py \
  --audio_file /path/to/mixture.wav \
  --text "acoustic guitar" \
  --output_file /path/to/audiosep_output.wav
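To extract several sounds from the same mixture, the command above can be invoked once per text query. The following is a minimal sketch that wraps the command with Python's `subprocess` module. It uses only the three flags shown above; the helper names `build_command` and `separate_many` and the output naming scheme are assumptions for illustration, not part of the repository. It assumes the script is run from inside the cloned `Hive` directory.

```python
import subprocess
import sys
from pathlib import Path


def build_command(audio_file, text, output_file, script="infer_audiosep.py"):
    """Build the argument list for one infer_audiosep.py call."""
    return [
        sys.executable, script,
        "--audio_file", str(audio_file),
        "--text", text,
        "--output_file", str(output_file),
    ]


def separate_many(audio_file, queries, output_dir):
    """Run one separation per text query and return the output paths."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for text in queries:
        # Hypothetical naming scheme: one WAV file per query.
        out = output_dir / (text.replace(" ", "_") + ".wav")
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(build_command(audio_file, text, out), check=True)
        outputs.append(out)
    return outputs
```

For example, `separate_many("/path/to/mixture.wav", ["acoustic guitar", "dog barking"], "outputs")` would write `outputs/acoustic_guitar.wav` and `outputs/dog_barking.wav`. Each call starts a new process and so presumably reloads the model; for many queries, calling the repository's inference code directly would likely be faster.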

Citation

@article{li2026semantically,
  title={A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation},
  author={Li, Kai and Cheng, Jintao and Zeng, Chang and Yan, Zijun and Wang, Helin and Su, Zixiong and Zheng, Bo and Hu, Xiaolin},
  journal={arXiv preprint arXiv:2601.22599},
  year={2026}
}