
	---
	license: apache-2.0
	language:
	- en
	tags:
	- audio
	- sound-separation
	- audio-to-audio
	- audiosep
	datasets:
	- ShandaAI/Hive
	---

	# AudioSep-hive

	## Model Description

	AudioSep-hive is a data-efficient, query-based universal sound separation model trained on the [Hive dataset](https://huggingface.co/datasets/ShandaAI/Hive). Because Hive is high-quality and semantically consistent, the model reaches separation accuracy and perceptual quality competitive with state-of-the-art models such as SAM-Audio while using only about 0.2% of the training data volume.

	The model was developed by Shanda AI Research Tokyo and is introduced in the paper [A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation](https://arxiv.org/abs/2601.22599).

	## Model Details

	- Model Type: Query-Based Universal Sound Separation
	- Language(s): English (for text queries)
	- License: Apache 2.0
	- Trained on: [ShandaAI/Hive](https://huggingface.co/datasets/ShandaAI/Hive) (2,442 hours of raw audio, 19.6M mixtures)
	- Paper: [arXiv:2601.22599](https://arxiv.org/abs/2601.22599)
	- Code Repository: [GitHub - ShandaAI/Hive](https://github.com/ShandaAI/Hive)

	## Uses

	The model is intended for universal sound separation tasks, allowing users to extract specific sounds from complex audio mixtures using multimodal prompts (e.g., text descriptions or audio queries).
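
As a starting point for the uses described above, the repository files can be fetched locally with the `huggingface_hub` client. The following is a minimal sketch, not an official loading recipe: it assumes `huggingface_hub` is installed, that the repository ID is `ShandaAI/AudioSep-hive`, and the helper name `fetch_model_files` is hypothetical. This card does not document the checkpoint layout or the inference API, so the sketch stops at downloading; see the code repository for the actual inference entry points.

```python
from typing import Optional

# Repository ID as shown on the Hugging Face page for this model card.
REPO_ID = "ShandaAI/AudioSep-hive"


def fetch_model_files(local_dir: Optional[str] = None) -> str:
    """Download a snapshot of the model repository and return its local path.

    Hypothetical helper for illustration. The import is done lazily so that
    this module can be imported without huggingface_hub installed.
    """
    from huggingface_hub import snapshot_download

    # snapshot_download fetches every file in the repo (using the local
    # cache when possible) and returns the directory that contains them.
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `fetch_model_files()` requires network access and returns the path of the downloaded snapshot; passing `local_dir="./audiosep-hive"` places the files in that folder instead of the default cache.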