Model-At-checkPoint-20000

This repository is a standalone, merged deployment of the VibeVoice acoustic model fine-tuned for Egyptian Arabic, taken at checkpoint 20000. It bundles the Qwen tokenizer configuration files and the base weights alongside the merged model.

📦 Repository Structure

  • model.safetensors: Standalone merged model weights.
  • voices/: Reference speech audio samples.
  • Tokenizer and configuration JSON files.
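
As a quick sanity check after downloading the repository, the layout listed above can be verified locally. This is a minimal sketch using only the Python standard library; the function name check_layout and the result keys are hypothetical, and it assumes the JSON files sit at the top level of the repository.

```python
from pathlib import Path
from typing import Dict


def check_layout(repo_dir: str) -> Dict[str, bool]:
    """Report which of the expected repository parts exist under repo_dir.

    Hypothetical helper: it only checks the layout described in this README
    and does not validate the contents of any file.
    """
    root = Path(repo_dir)
    return {
        # Merged model weights
        "weights": (root / "model.safetensors").is_file(),
        # Directory of reference speech audio samples
        "voices": (root / "voices").is_dir(),
        # At least one tokenizer/configuration JSON file at the top level
        "json_configs": any(root.glob("*.json")),
    }
```

Calling check_layout on the path of a local copy returns one boolean per part, so any entry that is False points to an incomplete download.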

🚀 How to Use (Inference API / Colab)

import torch

# The weights in this repository are already merged, so no PEFT adapter
# (peft.PeftModel) needs to be loaded on top of a base model.
# Load this repository directly using standard Hugging Face or VibeVoice modules:
# model = VibeVoiceForConditionalGeneration.from_pretrained("MohammedEhab20/Model-At-checkPoint-20000", trust_remote_code=True)

πŸŽ›οΈ Runtime Parameters (CFG Scale)

The Classifier-Free Guidance (CFG) scale is a runtime parameter and is not baked into these weights, so you can adjust it on each inference call:

  • Higher CFG (e.g., 5.0): Stricter alignment with the prompt.
  • Lower CFG (e.g., 3.5): More natural flow and more variation.
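
To illustrate why the scale can be changed freely at inference time, here is a minimal sketch of the standard classifier-free guidance blend, guided = uncond + cfg_scale * (cond - uncond). The function name apply_cfg and the toy values are hypothetical; whether VibeVoice applies exactly this form internally, and what it calls the parameter, is an assumption.

```python
from typing import List, Sequence


def apply_cfg(cond: Sequence[float], uncond: Sequence[float], cfg_scale: float) -> List[float]:
    """Blend conditional and unconditional predictions (standard CFG form).

    guided = uncond + cfg_scale * (cond - uncond)
    A scale of 1.0 returns the conditional prediction unchanged; larger
    values push the result further in the direction the prompt suggests.
    """
    if len(cond) != len(uncond):
        raise ValueError("cond and uncond must have the same length")
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]


# Toy numbers standing in for one model prediction (not real VibeVoice outputs).
cond = [1.0, 2.0]
uncond = [0.0, 1.0]

low = apply_cfg(cond, uncond, 3.5)   # weaker pull toward the prompt
high = apply_cfg(cond, uncond, 5.0)  # stronger pull toward the prompt
```

Because the blend happens on the model's outputs at each call, the same weights can serve any scale value without retraining or re-exporting.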

Model size: 3B params (Safetensors, tensor type F16)