AEmotionStudio/sam-audio-models

+---
+license: other
+license_name: sam-license
+license_link: LICENSE
+tags:
+  - audio
+  - audio-separation
+  - sound-separation
+  - sam-audio
+  - meta
+  - pytorch
+  - safetensors
+  - bf16
+pipeline_tag: audio-to-audio
+base_model: facebook/sam-audio-large-tv
+---
+# SAM-Audio Large-TV (BF16)
+This is an **ungated mirror** of Meta's [SAM-Audio Large-TV](https://huggingface.co/facebook/sam-audio-large-tv) model weights, converted to BF16 safetensors format and redistributed under the [SAM License](LICENSE) for easier access.
+## What is SAM-Audio?
+SAM-Audio (Segment Anything Model for Audio) is Meta AI's foundation model for **isolating any sound in audio**: it separates a specific target sound from a complex audio mixture. The target is selected with one of three prompt types:
+- **Text prompts** — isolate sounds by describing them (e.g. *"drums"*, *"vocals"*, *"piano"*)
+- **Visual prompts** — point at objects in video to extract their sound
+- **Span prompts** — specify time ranges where the target sound occurs
+The `-tv` variant is optimized for **target correctness** and **visual prompting**.
+## Files
+| File | Description |
+|---|---|
+| `sam-audio-large-tv-bf16.safetensors` | Model weights (BF16 safetensors format) |
+| `config.json` | Model configuration |
+| `LICENSE` | SAM License (required for redistribution) |
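The dtype and parameter count of the weights file can be checked without loading any tensors, because a safetensors file starts with an 8-byte little-endian header length followed by a JSON header that lists each tensor's `dtype`, `shape`, and `data_offsets`. The sketch below uses only the standard library; the helper names `read_safetensors_header` and `summarize` are hypothetical and not part of any library.

```python
import json
import struct
from math import prod


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize(header):
    """Return (set of declared dtypes, total number of elements across tensors)."""
    dtypes = {entry["dtype"] for entry in header.values()}
    n_elements = sum(prod(entry["shape"]) for entry in header.values())
    return dtypes, n_elements
```

Calling `summarize(read_safetensors_header("sam-audio-large-tv-bf16.safetensors"))` on a local copy of the file (path assumed) should report `BF16` as the only dtype. The element total may differ from a published parameter count if the checkpoint also stores non-parameter buffers.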
+## Model Info
+| Property | Value |
+|---|---|
+| Source | [`facebook/sam-audio-large-tv`](https://huggingface.co/facebook/sam-audio-large-tv) |
+| Dtype | `bf16` (`torch.bfloat16`) |
+| Parameters | 3,715,221,638 |
+| File size | 6.92 GiB (original: 13.84 GiB) |
+| Sample rate | 48,000 Hz |
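The file sizes in the table follow from the parameter count and the bytes stored per value. A quick check, assuming the original checkpoint stores 32-bit floats (an inference from its size being exactly double, not something stated by the source):

```python
GIB = 1024 ** 3  # bytes per gibibyte

n_params = 3_715_221_638  # parameter count from the table above


def weights_size_gib(n_params, bytes_per_param):
    """Size of the raw tensor data in GiB, ignoring the small file header."""
    return n_params * bytes_per_param / GIB


bf16_gib = weights_size_gib(n_params, 2)  # bfloat16: 2 bytes per value
fp32_gib = weights_size_gib(n_params, 4)  # float32: 4 bytes per value (assumed original dtype)

print(f"BF16: {bf16_gib:.2f} GiB, FP32: {fp32_gib:.2f} GiB")
```

Both values round to the figures listed above (6.92 GiB and 13.84 GiB), so the conversion halves storage without changing the number of parameters.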
+## Usage
+```python
+# Option 1: ComfyUI-FFMPEGA (downloads the weights automatically)
+# Set no_llm_mode = "audio_separate" and prompt = "vocals"
+# Option 2: standalone, loading from a local copy of this repo
+from sam_audio import SAMAudio
+model = SAMAudio.from_pretrained("path/to/this/repo")
+```
+## License
+This model is distributed under the **SAM License** — see the [LICENSE](LICENSE) file. Key points:
+- ✅ Commercial use permitted
+- ✅ Redistribution permitted (with license included)
+- ✅ Derivative works permitted
+- ❌ No military/warfare, nuclear, or espionage use
+- ❌ No reverse engineering
+## Credits
+- **Original model by**: [Meta AI (FAIR)](https://github.com/facebookresearch/sam-audio)
+- **Original HuggingFace repo**: [facebook/sam-audio-large-tv](https://huggingface.co/facebook/sam-audio-large-tv)
+- **Paper**: *SAM-Audio: Segment Anything in Audio*
+- **Redistributed by**: [Æmotion Studio](https://huggingface.co/AEmotionStudio) for use with [ComfyUI-FFMPEGA](https://github.com/AEmotionStudio/ComfyUI-FFMPEGA)