Missing VAE / Text Encoder files for inference (only model.pt provided)
#1
by rikunarita - opened
I'm trying to use SongGeneration-v2-medium, but I noticed that the repository only provides a model.pt file.
For running inference, it seems that additional components are required, such as:
VAE (for audio latent decoding)
Text encoder (for prompt processing)
Possibly a tokenizer or config files
However, I could not find these files in the repository.
Could you clarify:
Are these components included inside model.pt, or should they be provided separately?
If they are separate, where can I download the correct versions?
Is there an official inference pipeline or example (e.g., with an ACE-Step-like workflow)?
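For context on the first question, this is a minimal sketch of how I would check what model.pt actually contains, by grouping the checkpoint's keys by their top-level prefix. It assumes the file is a plain PyTorch checkpoint (a dict of tensors, possibly nested under a "state_dict" key), which I have not confirmed for this repository. The helper names and any prefixes such as "vae" or "text_encoder" are hypothetical, not names taken from the model.

```python
from collections import Counter
from typing import Any, Mapping


def summarize_prefixes(state: Mapping[str, Any], depth: int = 1) -> Counter:
    """Count entries per dotted-key prefix, e.g. 'a.b.weight' -> 'a' at depth 1."""
    counts: Counter = Counter()
    for key in state:
        prefix = ".".join(str(key).split(".")[:depth])
        counts[prefix] += 1
    return counts


def load_checkpoint(path: str) -> Mapping[str, Any]:
    """Load a checkpoint on CPU (requires PyTorch)."""
    import torch  # imported lazily so summarize_prefixes works without PyTorch

    # weights_only=True avoids unpickling arbitrary objects; it may raise if
    # the file stores more than tensors and basic containers.
    ckpt = torch.load(path, map_location="cpu", weights_only=True)
    # Assumption: some checkpoints nest the weights under a "state_dict" key.
    if isinstance(ckpt, dict) and isinstance(ckpt.get("state_dict"), dict):
        return ckpt["state_dict"]
    return ckpt


# Usage (path is a placeholder for the downloaded file):
# for prefix, n in summarize_prefixes(load_checkpoint("model.pt")).most_common():
#     print(f"{prefix}: {n} entries")
```

If the output showed only one group of weights, that would suggest the VAE and text encoder are distributed separately, but I would still prefer an official confirmation.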
Any guidance would be greatly appreciated.
Thank you!