NandemoGHS/Anime-XCodec2-44.1kHz-v2

OmniAICreator committed on Oct 28, 2025

Commit 58a5080 · verified · 1 Parent(s): 1429e97

Update README.md

Files changed (1)

README.md +1 -1

@@ -99,7 +99,7 @@ Compared to the first version, this v2 model includes the following key updates
 1.  **RoPE Bug Fix**: Corrected a RoPE (Rotary Position Embedding) bug present in the original XCodec2 implementation (See [Issue #36](https://github.com/zhenye234/X-Codec-2.0/issues/36)).
 2.  **Upsampler Parameters**: The upsampler settings were changed to `hop_length=98`, `upsample_factors=[3, 3]`, and `kernel_sizes=[9, 9]`.
-3.  **Perceptual Loss Model**: The model used for calculating perceptual loss was switched from `[facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53)` to `[imprt/kushinada-hubert-large](https://huggingface.co/imprt/kushinada-hubert-large)`.
+3.  **Perceptual Loss Model**: The model used for calculating perceptual loss was switched from [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) to [imprt/kushinada-hubert-large](https://huggingface.co/imprt/kushinada-hubert-large).
 4.  **Spectral Discriminator Tuning**: The STFT (Short-Time Fourier Transform) settings for the spectral discriminator were adjusted to be more suitable for 44.1kHz high-sampling-rate audio.
 ---
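
The change in item 3 removes the backticks that wrapped each Markdown link. Inside a code span the link syntax is shown literally instead of rendering as a link. The same cleanup could be scripted roughly as below. This is a minimal sketch, not how the commit was made: the `WRAPPED_LINK` pattern and the `unwrap_links` helper are hypothetical names, and the pattern assumes simple inline links with no nested brackets or spaces in the URL.

```python
import re

# Hypothetical pattern: a Markdown inline link enclosed in one pair of
# backticks, e.g. `[text](https://example.com)`. Group 1 is the bare link.
WRAPPED_LINK = re.compile(r"`(\[[^\]`]+\]\([^)`\s]+\))`")


def unwrap_links(markdown: str) -> str:
    """Drop backticks that enclose nothing but a Markdown inline link."""
    return WRAPPED_LINK.sub(r"\1", markdown)


# Text taken from the removed line of the diff above.
old = (
    "switched from `[facebook/wav2vec2-large-xlsr-53]"
    "(https://huggingface.co/facebook/wav2vec2-large-xlsr-53)` to "
    "`[imprt/kushinada-hubert-large]"
    "(https://huggingface.co/imprt/kushinada-hubert-large)`."
)
new = unwrap_links(old)
print(new)
```

Ordinary code spans such as `hop_length=98` do not match the pattern and are left untouched.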