balacoon/vq4_50fps_24khz_vocoder


clementruhm committed on Mar 2, 2025

Commit 8a26df1 · verified · 1 Parent(s): e8a6043

Update README.md

Files changed (1)

README.md +2 -2


@@ -29,11 +29,11 @@ decoder = torch.jit.load(decoder_path)
 # read the audio
 orig_audio_npy, sr = sf.read(path, dtype="int16")
 assert sr == 24000
-orig_audio = torch.tensor(orig_audio_npy).to(device).unsqueeze(0)  # 1 x samples
+orig_audio = torch.tensor(orig_audio_npy).to(device).unsqueeze(0)  # batch x samples
 # extract audio tokens from the audio
 tokens = encoder(orig_audio)  # batch x frames x 4
 # synthesize audio from audio tokens
-resynthesized_audio = decoder(tokens)  # 1 x samples
+resynthesized_audio = decoder(tokens)  # batch x samples
 ```
 See performance of the codec on `vocoder` leaderboard: [TTSLeaderboard](https://huggingface.co/spaces/balacoon/TTSLeaderboard)
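
The comment change from `1 x samples` to `batch x samples` suggests the first dimension is a general batch axis rather than a fixed size of one. The following is a minimal sketch of preparing such an input, not part of the README: `to_batch` is a hypothetical helper, the silent clip is a stand-in for audio read with `sf.read`, and whether the TorchScript encoder actually accepts more than one clip per call is an assumption that should be checked against the model.

```python
import torch

SAMPLE_RATE = 24000  # the README asserts sr == 24000


def to_batch(clips):
    """Stack equal-length mono clips into a (batch x samples) tensor.

    Hypothetical helper for illustration; it does no padding, so clips
    of different lengths are rejected.
    """
    lengths = {clip.shape[-1] for clip in clips}
    if len(lengths) != 1:
        raise ValueError("clips must have equal length to be stacked without padding")
    return torch.stack(clips, dim=0)


# One second of silence stands in for int16 audio loaded from disk.
clip = torch.zeros(SAMPLE_RATE, dtype=torch.int16)

single = clip.unsqueeze(0)            # what the README does: a batch of one
batch = to_batch([clip, clip, clip])  # assumed multi-clip input

print(single.shape)  # torch.Size([1, 24000])
print(batch.shape)   # torch.Size([3, 24000])
```

If the encoder does accept batches, `tokens = encoder(batch)` would presumably have shape `3 x frames x 4`, following the `batch x frames x 4` comment in the README.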