Peter Shi committed
Commit 1b3117a · 1 Parent: d4c742d

feat: Migrate the deployment to the Gradio SDK, integrate the `spaces.GPU` decorator, and remove the Dockerfile.
Files changed (4):

1. Dockerfile (+0 −27)
2. README.md (+7 −36)
3. app.py (+13 −1)
4. requirements.txt (+7 −4)
Dockerfile DELETED

```diff
@@ -1,27 +0,0 @@
-# Use Python 3.12 to satisfy the 'perception-models' requirement
-FROM python:3.12
-
-# Set the working directory
-WORKDIR /code
-
-# Install system dependencies (ffmpeg is required for audio)
-RUN apt-get update && apt-get install -y ffmpeg && rm -rf /var/lib/apt/lists/*
-
-# Copy requirements and install Python dependencies
-COPY requirements.txt .
-RUN pip install --no-cache-dir --upgrade pip
-RUN pip install --no-cache-dir -r requirements.txt
-
-# Set up a user (Required by HF Spaces security)
-RUN useradd -m -u 1000 user
-USER user
-ENV HOME=/home/user \
-    PATH=/home/user/.local/bin:$PATH
-
-WORKDIR $HOME/app
-
-# Copy application files
-COPY --chown=user . $HOME/app
-
-# Start the app
-CMD ["python", "app.py"]
```
README.md CHANGED

````diff
@@ -3,10 +3,13 @@ title: Sam Audio Webui
 emoji: 🎵
 colorFrom: indigo
 colorTo: pink
-sdk: docker
-app_port: 7860
+sdk: gradio
+sdk_version: 6.2.0
+app_file: app.py
 pinned: false
 license: apache-2.0
+fullWidth: true
+python_version: 3.11
 ---
 
 # SAM Audio WebUI
@@ -17,43 +20,11 @@ This Space hosts a WebUI for the **SAM Audio** model by Meta (Facebook), designe
 
 - **Model**: Uses `facebook/sam-audio-small` for a balance of performance and resource usage.
 - **ZeroGPU Support**: Optimized to run on Hugging Face ZeroGPU (A100/A10G) with automatic GPU handling.
-- **Dynamic Fallback**:
-  - Attempts to load the model in `float16` for best quality.
-  - Falls back to **8-bit quantization** (`bitsandbytes`) if VRAM is insufficient.
-- **Audio Reconstruction**: Converts model masks to audio using STFT/ISTFT processing.
-
-## Local Development
-
-To run this application locally on your machine:
-
-1. **Clone the repository:**
-   ```bash
-   git clone https://huggingface.co/spaces/lpeterl/sam-audio-webui
-   cd sam-audio-webui
-   ```
-
-2. **Create a virtual environment (Recommended):**
-   ```bash
-   python3 -m venv venv
-   source venv/bin/activate
-   ```
-
-3. **Install dependencies:**
-   ```bash
-   pip install -r requirements.txt
-   pip install gradio
-   ```
-
-4. **Run the app:**
-   ```bash
-   python3 app.py
-   ```
-   *Note: `spaces` GPU decorators are mocked locally, so you don't need a ZeroGPU environment.*
 
 ## System Requirements
 
-- **VRAM**: ~21.6 GB for standard loading. ~12 GB with 8-bit quantization.
-- **Platform**: CUDA (NVIDIA GPU) required for quantization. Mac (MPS) supported for standard loading (requires high unified memory).
+- **VRAM**: ~21.6 GB for standard loading.
+- **Python**: >= 3.11 required by `perception-models` dependency.
 
 ## Acknowledgements
````
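Assembled from the README diff above, the Space's resulting YAML front matter should read roughly as follows (the `title` line is taken from the hunk header; field order is otherwise a guess):

```yaml
---
title: Sam Audio Webui
emoji: 🎵
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: false
license: apache-2.0
fullWidth: true
python_version: 3.11
---
```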
 
app.py CHANGED

```diff
@@ -2,6 +2,17 @@ import gradio as gr
 import torch
 import torchaudio
 import tempfile
+
+try:
+    import spaces
+except ImportError:
+    class spaces:
+        @staticmethod
+        def GPU(duration=60):
+            def decorator(func):
+                return func
+            return decorator
+
 from sam_audio import SAMAudio, SAMAudioProcessor
 
 # Configuration
@@ -29,6 +40,7 @@ def save_audio(tensor, sample_rate):
     torchaudio.save(tmp.name, tensor, sample_rate)
     return tmp.name
 
+@spaces.GPU(duration=120)
 def separate_audio(audio_path, text_prompt):
     if not audio_path:
         return None, None
@@ -88,4 +100,4 @@ with gr.Blocks(title="SAM-Audio Demo") as demo:
 )
 
 # Launch
-demo.queue().launch(server_name="0.0.0.0", server_port=7860)
+demo.queue().launch()
```
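The try/except fallback added to app.py can be exercised in isolation. Below is a minimal sketch of the pattern: when the Hugging Face `spaces` package is unavailable, a stub class supplies a no-op `GPU` decorator so the same code runs both on ZeroGPU Spaces and locally (`separate` here is a hypothetical stand-in for the real separation function):

```python
# Fallback pattern from app.py: if `spaces` (the ZeroGPU helper package)
# cannot be imported, define a stub whose GPU decorator does nothing.
try:
    import spaces
except ImportError:
    class spaces:
        @staticmethod
        def GPU(duration=60):
            def decorator(func):
                return func  # no-op: return the function unchanged
            return decorator

@spaces.GPU(duration=120)
def separate(x):
    # hypothetical stand-in for the real audio-separation logic
    return x * 2

print(separate(3))
```

Because the stub's decorator simply returns the function, decorated code behaves identically on machines without a ZeroGPU environment.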
requirements.txt CHANGED

```diff
@@ -1,6 +1,9 @@
+gradio>=4.0.0
+torch>=2.0.0
+transformers>=4.38.0
+accelerate>=0.27.0
+scipy
+librosa
+spaces
 git+https://github.com/facebookresearch/sam-audio.git
-torch
 torchaudio
-gradio
-numpy
-scipy
```