Compatibility with LoRA adapter trained from original model
Hi and thank you for porting this model to be HF compatible.
I noticed that the layer names and param count (as listed in model.safetensors.index.json) differ from the original repo (e.g. a param count of 8.33B vs 8.67B in the original). Just wondering: is a LoRA adapter trained from the original model (using the script provided by the repo authors) likely to be compatible with this release?
Thanks again
Hi @deathknight0, thanks for your message! Yes, this model is slightly smaller because I've removed the decoder of the acoustic tokenizer, as it isn't needed by the ASR model.
If you are speaking about this script, I don't think it will work, as the state dict has changed to be consistent with Transformers conventions. Someone on our team is looking into adapting that script for the Transformers-compatible checkpoint, but if you already have something, that would be great! By peeking into the modeling code, you should see how the state dict has changed (ASR and tokenizer) from the original (ASR and tokenizer).
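In case it helps, here is a minimal sketch of how one might rename an existing adapter's keys to follow the new layout. The prefix mapping below is purely hypothetical: the real old/new names have to be read off from the modeling code and the original repo, so treat `KEY_MAP` as a placeholder to fill in.

```python
# Hypothetical prefix mapping from original-repo key names to
# Transformers-style names; the actual renames must be taken from
# the modeling code. These entries are illustrative placeholders.
KEY_MAP = {
    "model.speech_encoder.": "model.audio_tower.",
    "model.llm.": "model.language_model.",
}

def remap_lora_keys(state_dict, key_map):
    """Rename LoRA adapter keys to match a new checkpoint layout.

    Keys matching an old prefix are rewritten with the new prefix;
    all other keys are passed through unchanged.
    """
    remapped = {}
    for key, tensor in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in key_map.items():
            if new_key.startswith(old_prefix):
                new_key = new_prefix + new_key[len(old_prefix):]
                break
        remapped[new_key] = tensor
    return remapped
```

You'd load the adapter weights (e.g. with `safetensors`), run them through `remap_lora_keys`, and save them back out; whether the remapped adapter actually works also depends on the layer shapes matching, which isn't guaranteed given the removed decoder.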
FYI: this checkpoint was a draft. The final one is available under the Microsoft account here, and VibeVoice ASR has been released in Transformers v5.3.0.