Reconstruction test does not work as expected

#11

by infilify - opened Mar 5, 2025

Mar 5, 2025

Hi, thank you for the great work.

Unfortunately, the reconstruction test I tried produced incorrect output.

The test code I used was taken from https://huggingface.co/HKUSTAudio/xcodec2. My steps are listed below; could you please shed some light on what went wrong?

First, I converted test.flac to a 16 kHz WAV file:

ffmpeg -i test.flac -ar 16000 -c:a pcm_s16le test.wav

ffprobe test.wav
Input #0, wav, from 'test.wav':
  Metadata:
    encoder         : Lavf59.27.100
  Duration: 00:00:04.91, bitrate: 256 kb/s
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, 1 channels, s16, 256 kb/s
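
For anyone who wants to make the same check from Python before running the model, here is a minimal sketch using only the standard-library wave module. The helper name check_wav_format and its default arguments are hypothetical and not part of xcodec2; the 16 kHz mono expectation comes from the "Only 16khz speech" note in the test code.

```python
import wave


def check_wav_format(path, expected_rate=16000, expected_channels=1):
    """Return basic WAV properties; raise ValueError if they do not match.

    Hypothetical helper for illustration only (not part of xcodec2).
    """
    with wave.open(path, "rb") as wf:
        info = {
            "sample_rate": wf.getframerate(),
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "duration_s": wf.getnframes() / wf.getframerate(),
        }
    if info["sample_rate"] != expected_rate:
        raise ValueError(
            f"expected {expected_rate} Hz, got {info['sample_rate']} Hz"
        )
    if info["channels"] != expected_channels:
        raise ValueError(
            f"expected {expected_channels} channel(s), got {info['channels']}"
        )
    return info
```

Calling check_wav_format("test.wav") on the file above should agree with the ffprobe output: 16000 Hz, 1 channel, 2 bytes per sample.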

Then I ran the following test code:

import torch
import soundfile as sf
from transformers import AutoConfig
import sys
import os

# Add the model path to system path to make the xcodec2 module importable
model_path = "models/HKUSTAudio/xcodec2"
sys.path.append(os.path.abspath(model_path))

# Now import from the module
# from modeling_xcodec2 import XCodec2Model
from models.HKUSTAudio.xcodec2.modeling_xcodec2 import XCodec2Model

model = XCodec2Model.from_pretrained(model_path)
model.eval().cuda()

# wav, sr = sf.read("sample.wav")
# wav, sr = sf.read("sample-short3.wav")
wav, sr = sf.read("test.wav")
wav_tensor = torch.from_numpy(wav).float().unsqueeze(0)  # Shape: (1, T)

with torch.no_grad():
    # Only 16 kHz speech is supported
    # Only supports single input. For batch inference, please refer to the link below.
    vq_code = model.encode_code(input_waveform=wav_tensor)
    print("Code:", vq_code )

    recon_wav = model.decode_code(vq_code).cpu()       # Shape: (1, 1, T')

# Make sure the output directory exists before writing
os.makedirs("output", exist_ok=True)
sf.write("output/reconstructed.wav", recon_wav[0, 0, :].numpy(), sr)
print("Done! Check output/reconstructed.wav")

Attached is the output/reconstructed.wav file. It's noisy and incorrect.

Many thanks

infilify

Mar 5, 2025

Solved by reinstalling xcodec in the conda env.
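
In case it helps someone who hits the same symptom: a quick way to see which install the active environment actually picks up is to print the package version and import location. This is only a sketch using the standard library; describe_package is a hypothetical helper, and the distribution name "xcodec2" in the usage note is an assumption that may differ from what is installed on your machine.

```python
import importlib.metadata
import importlib.util


def describe_package(dist_name, module_name=None):
    """Report the installed version and import location of a package.

    Hypothetical helper for illustration. dist_name is the pip/conda
    distribution name; module_name is the importable top-level module
    (defaults to dist_name).
    """
    module_name = module_name or dist_name
    try:
        version = importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        version = None  # no such distribution in this environment
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        spec = None
    # origin is the file the module would be loaded from (None if not found)
    origin = spec.origin if spec is not None else None
    return {"distribution": dist_name, "version": version, "origin": origin}
```

For example, print(describe_package("xcodec2")) run inside the conda env shows whether the package is found there at all and which file would be imported. A missing version or an origin outside the env suggests a stale or mixed install.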

infilify changed discussion status to closed Mar 5, 2025
