fix: ensure audio tensor is on same device as VAD model in voice denoising

The Silero VAD model was moved to GPU but the input audio tensor remained
on CPU, causing a device mismatch error:
'Expected all tensors to be on the same device, but got weight is on cuda:0,
different from other tensors on cpu'
Solution:
---------
Move the audio tensor to the same device as the VAD model before calling
get_speech_timestamps() by adding .to(device) after creating the tensor.
This ensures that the model and its inputs are on the same device (GPU or CPU),
preventing the RuntimeError raised during convolution operations in the VAD model.
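
The same pattern can be sketched in isolation. This is a minimal illustration, not the code from this commit: `to_model_device` is a hypothetical helper, and a single `Conv1d` layer stands in for the Silero VAD model. It assumes the model exposes its weights through `parameters()`, which may not hold for every VAD backend.

```python
import numpy as np
import torch


def to_model_device(audio: np.ndarray, model: torch.nn.Module) -> torch.Tensor:
    """Convert a NumPy audio array to a float32 tensor on the model's device."""
    # Infer the device from the first parameter; fall back to CPU for
    # parameter-free modules (an assumption of this sketch).
    first_param = next(model.parameters(), None)
    device = first_param.device if first_param is not None else torch.device("cpu")
    return torch.from_numpy(audio).float().to(device)


# Hypothetical stand-in for the VAD model: a single 1-D convolution.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.nn.Conv1d(1, 1, kernel_size=3).to(device)

audio = np.zeros(16000, dtype=np.float32)  # one second of silence at 16 kHz
audio_tensor = to_model_device(audio, model)

# Input and weights now share a device, so the forward pass does not raise
# the device mismatch RuntimeError.
output = model(audio_tensor.view(1, 1, -1))
```

Deriving the device from the model rather than from a separate variable keeps the two from drifting apart if the model is later moved; the commit itself uses the existing `device` variable, which is equally valid when that variable is the one the model was moved to.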
src/services/voice_denoising.py
@@ -106,8 +106,8 @@ def _denoise_audio_on_gpu(
     if len(audio) == 0:
         voice_segments = []
     else:
-        # Convert to torch tensor
-        audio_tensor = torch.from_numpy(audio).float()
+        # Convert to torch tensor and move to same device as model
+        audio_tensor = torch.from_numpy(audio).float().to(device)
 
         # Get speech timestamps
         speech_timestamps = get_speech_timestamps(