---
license: apache-2.0
language:
- en
tags:
- audio
- sound-separation
- audio-to-audio
- audiosep
datasets:
- ShandaAI/Hive
---
# AudioSep-hive
## Model Description
**AudioSep-hive** is a data-efficient, query-based universal sound separation model trained on the [Hive dataset](https://huggingface.co/datasets/ShandaAI/Hive). Because Hive is high-quality and semantically consistent, the model reaches separation accuracy and perceptual quality comparable to state-of-the-art models (such as SAM-Audio) while using only about 0.2% of the training data those models use.
This model was developed by **Shanda AI Research Tokyo** and is introduced in the paper [A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation](https://arxiv.org/abs/2601.22599).
## Model Details
- **Model Type:** Query-Based Universal Sound Separation
- **Language(s):** English (for text queries)
- **License:** Apache 2.0
- **Trained on:** [ShandaAI/Hive](https://huggingface.co/datasets/ShandaAI/Hive) (2,442 hours of raw audio, 19.6M mixtures)
- **Paper:** [arXiv:2601.22599](https://arxiv.org/abs/2601.22599)
- **Code Repository:** [GitHub - ShandaAI/Hive](https://github.com/ShandaAI/Hive)
## Uses
The model is intended for universal sound separation tasks, allowing users to extract specific sounds from complex audio mixtures using multimodal prompts (e.g., text descriptions or audio queries).
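
To make the task concrete, the sketch below builds a toy two-source mixture and scores a separated output against the target source. It does not call AudioSep-hive: the `estimate` array is a hypothetical stand-in for whatever the model returns, and the metric (scale-invariant SDR) is an assumption chosen because it is widely used for source separation, not necessarily the metric reported in the paper.

```python
import numpy as np


def si_sdr(estimate: np.ndarray, reference: np.ndarray, eps: float = 1e-8) -> float:
    """Scale-invariant signal-to-distortion ratio in dB (higher is better)."""
    # Remove any DC offset so the projection is not biased by the mean.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to isolate the target component.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    noise = estimate - target
    ratio = (np.dot(target, target) + eps) / (np.dot(noise, noise) + eps)
    return float(10.0 * np.log10(ratio))


# Toy data (all values are illustrative assumptions).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
target_src = np.sin(2 * np.pi * 440 * t)           # the sound we want to extract
rng = np.random.default_rng(0)
interferer = 0.5 * rng.standard_normal(sample_rate)  # the sound we want removed
mixture = target_src + interferer

# Hypothetical separator output: the target plus a small residual of the interferer.
estimate = target_src + 0.05 * interferer

baseline_db = si_sdr(mixture, target_src)     # score of the unprocessed mixture
separated_db = si_sdr(estimate, target_src)   # score of the stand-in output
print(f"mixture SI-SDR:   {baseline_db:.1f} dB")
print(f"separated SI-SDR: {separated_db:.1f} dB")
```

Comparing the separated score against the mixture score gives the improvement attributable to separation; in practice `estimate` would be replaced by the waveform produced for a given text or audio query.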