Spaces: jcudit/voice-tools

jcudit HF Staff committed on Dec 29, 2025

Commit 7b704bc
1 Parent(s): 1a6b8d0

fix: ensure audio tensor is on same device as VAD model in voice denoising

The Silero VAD model was moved to GPU but the input audio tensor remained
on CPU, causing a device mismatch error:
'Expected all tensors to be on the same device, but got weight is on cuda:0,
different from other tensors on cpu'

Solution:
---------
Move the audio tensor to the same device as the VAD model before calling
get_speech_timestamps() by adding .to(device) after creating the tensor.

This ensures both the model and its inputs are on the same device (GPU or CPU),
preventing the RuntimeError raised during convolution operations in the VAD model.
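
Illustrative sketch (not part of this commit): the same device-matching idea,
written as a small helper that reads the device from the model itself rather
than from a separate `device` variable. The helper name `to_model_device` is
hypothetical, and a single `nn.Conv1d` layer stands in for the Silero VAD
model so the snippet runs without downloading anything. It assumes the model
has at least one parameter and lives on a single device.

```python
import numpy as np
import torch
from torch import nn


def to_model_device(audio: np.ndarray, model: nn.Module) -> torch.Tensor:
    """Convert a NumPy audio buffer to a float32 tensor on the model's device."""
    # Assumption: the model has at least one parameter and sits on one device.
    device = next(model.parameters()).device
    return torch.from_numpy(audio).float().to(device)


# Stand-in for the VAD model: one convolution layer, NOT Silero VAD itself.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3).to(device)

audio = np.zeros(16000, dtype=np.float32)  # one second of silence at 16 kHz
audio_tensor = to_model_device(audio, model)

# Conv1d expects (batch, channels, samples); weights and input now share a device.
with torch.no_grad():
    output = model(audio_tensor.view(1, 1, -1))
```

On a CPU-only machine the `.to(device)` call is a no-op, so the same code path
works with or without a GPU.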

Files changed (1)

src/services/voice_denoising.py +2 -2

@@ -106,8 +106,8 @@ def _denoise_audio_on_gpu(
         if len(audio) == 0:
             voice_segments = []
         else:
-            # Convert to torch tensor
-            audio_tensor = torch.from_numpy(audio).float()
+            # Convert to torch tensor and move to same device as model
+            audio_tensor = torch.from_numpy(audio).float().to(device)
             # Get speech timestamps
             speech_timestamps = get_speech_timestamps(