AEmotionStudio committed
Commit f3889cd (verified)
1 Parent(s): 580ddb1

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +12 -15
README.md CHANGED
@@ -15,9 +15,9 @@ pipeline_tag: audio-to-audio
  base_model: facebook/sam-audio-large-tv
  ---

- # SAM-Audio Large-TV (BF16)
+ # SAM-Audio Models (BF16 Safetensors)

- This is an **ungated mirror** of Meta's [SAM-Audio Large-TV](https://huggingface.co/facebook/sam-audio-large-tv) model weights, converted to BF16 safetensors format and redistributed under the [SAM License](LICENSE) for easier access.
+ **Ungated mirrors** of Meta's [SAM-Audio](https://github.com/facebookresearch/sam-audio) model weights, converted to BF16 safetensors format and redistributed under the [SAM License](LICENSE) for easier access.

  ## What is SAM-Audio?

@@ -27,26 +27,24 @@ SAM-Audio (Segment Anything Model for Audio) is Meta AI's foundation model for *
  - **Visual prompts** — point at objects in video to extract their sound
  - **Span prompts** — specify time ranges where the target sound occurs

- The `-tv` variant is optimized for **target correctness** and **visual prompting**.
+ The `-tv` variants are optimized for **target correctness** and **visual prompting**.
+
+ ## Available Models
+
+ | Model | Parameters | File Size | Original |
+ |---|---|---|---|
+ | `sam-audio-large-tv-bf16.safetensors` | 3,715,221,638 | 6.92 GiB | [facebook/sam-audio-large-tv](https://huggingface.co/facebook/sam-audio-large-tv) |
+ | `sam-audio-base-tv-bf16.safetensors` | 1,931,243,654 | 3.60 GiB | [facebook/sam-audio-base-tv](https://huggingface.co/facebook/sam-audio-base-tv) |

  ## Files

  | File | Description |
  |---|---|
- | `sam-audio-large-tv-bf16.safetensors` | Model weights (BF16 safetensors format) |
+ | `sam-audio-large-tv-bf16.safetensors` | Large-TV model weights (BF16) |
+ | `sam-audio-base-tv-bf16.safetensors` | Base-TV model weights (BF16) |
  | `config.json` | Model configuration |
  | `LICENSE` | SAM License (required for redistribution) |

- ## Model Info
-
- | Property | Value |
- |---|---|
- | Source | [`facebook/sam-audio-large-tv`](https://huggingface.co/facebook/sam-audio-large-tv) |
- | Dtype | `bf16` (`torch.bfloat16`) |
- | Parameters | 3,715,221,638 |
- | File size | 6.92 GiB (original: 13.84 GiB) |
- | Sample rate | 48,000 Hz |
-
  ## Usage

  ```python
@@ -71,6 +69,5 @@ This model is distributed under the **SAM License** — see the [LICENSE](LICENS
  ## Credits

  - **Original model by**: [Meta AI (FAIR)](https://github.com/facebookresearch/sam-audio)
- - **Original HuggingFace repo**: [facebook/sam-audio-large-tv](https://huggingface.co/facebook/sam-audio-large-tv)
  - **Paper**: *SAM-Audio: Segment Anything in Audio*
  - **Redistributed by**: [Æmotion Studio](https://huggingface.co/AEmotionStudio) for use with [ComfyUI-FFMPEGA](https://github.com/AEmotionStudio/ComfyUI-FFMPEGA)
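
Review note: the file sizes in the Available Models table appear consistent with the listed parameter counts at two bytes per BF16 parameter. The sketch below is only a sanity check under that assumption; the helper name `bf16_size_gib` is illustrative and not part of the repository, and the small safetensors header overhead is ignored.

```python
# Assumption: every parameter is stored as BF16 (2 bytes); the safetensors
# header and any non-parameter buffers are ignored in this estimate.
BYTES_PER_BF16 = 2
GIB = 1024 ** 3  # bytes per GiB


def bf16_size_gib(num_params: int) -> float:
    """Estimate the on-disk size in GiB of a BF16 checkpoint."""
    return num_params * BYTES_PER_BF16 / GIB


# Parameter counts copied from the Available Models table in this commit.
PARAMS = {
    "sam-audio-large-tv": 3_715_221_638,
    "sam-audio-base-tv": 1_931_243_654,
}

for name, num_params in PARAMS.items():
    print(f"{name}: {bf16_size_gib(num_params):.2f} GiB")
```

Under these assumptions the estimates round to 6.92 GiB and 3.60 GiB, matching the table.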