OmniAICreator committed on
Commit 58a5080 · verified · 1 Parent(s): 1429e97

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -99,7 +99,7 @@ Compared to the first version, this v2 model includes the following key updates
 
  1. **RoPE Bug Fix**: Corrected a RoPE (Rotary Position Embedding) bug present in the original XCodec2 implementation (See [Issue #36](https://github.com/zhenye234/X-Codec-2.0/issues/36)).
  2. **Upsampler Parameters**: The upsampler settings were changed to `hop_length=98`, `upsample_factors=[3, 3]`, and `kernel_sizes=[9, 9]`.
- 3. **Perceptual Loss Model**: The model used for calculating perceptual loss was switched from `[facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53)` to `[imprt/kushinada-hubert-large](https://huggingface.co/imprt/kushinada-hubert-large)`.
+ 3. **Perceptual Loss Model**: The model used for calculating perceptual loss was switched from [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) to [imprt/kushinada-hubert-large](https://huggingface.co/imprt/kushinada-hubert-large).
  4. **Spectral Discriminator Tuning**: The STFT (Short-Time Fourier Transform) settings for the spectral discriminator were adjusted to be more suitable for 44.1kHz high-sampling-rate audio.
 
  ---
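
Reviewer note on item 2: the upsampler values can be sanity-checked with a little arithmetic. The sketch below assumes that the total number of output samples per codec frame is `hop_length` multiplied by the product of `upsample_factors`, and that the output sample rate is the 44.1 kHz mentioned in item 4. Both are assumptions about how the parameters combine, not something the README states, and the helper names are hypothetical.

```python
from math import prod

def samples_per_frame(hop_length, upsample_factors):
    # Assumption: each upsampling stage multiplies the temporal resolution,
    # and the final inverse-STFT stage contributes hop_length samples per step.
    return hop_length * prod(upsample_factors)

def frame_rate_hz(sample_rate, hop_length, upsample_factors):
    # Codec frames per second implied by the assumed total stride.
    return sample_rate / samples_per_frame(hop_length, upsample_factors)

# Values taken from item 2 of the README; sample rate from item 4.
hop_length = 98
upsample_factors = [3, 3]
sample_rate = 44_100

print(samples_per_frame(hop_length, upsample_factors))           # 882
print(frame_rate_hz(sample_rate, hop_length, upsample_factors))  # 50.0
```

Under these assumptions the settings give 882 samples per frame, which divides 44,100 evenly and corresponds to 50 frames per second.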