voidful/SRFD-VoxCPM2

Text-to-Speech
VoxCPM
English
VoxCPM2
voice-cloning
flow-matching
lora
srfd
speech

Instructions to use voidful/SRFD-VoxCPM2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

  • Libraries
  • VoxCPM

    How to use voidful/SRFD-VoxCPM2 with VoxCPM:

    import soundfile as sf
    from voxcpm import VoxCPM
    
    model = VoxCPM.from_pretrained("voidful/SRFD-VoxCPM2")
    
    wav = model.generate(
        text="VoxCPM is an innovative end-to-end TTS model from ModelBest, designed to generate highly expressive speech.",
        prompt_wav_path=None,               # optional: path to a prompt speech file for voice cloning
        prompt_text=None,                   # optional: reference text for the prompt speech
        cfg_value=2.0,                      # LM guidance on LocDiT; higher values adhere more closely to the prompt but may give worse results
        inference_timesteps=10,             # LocDiT inference timesteps; higher for better results, lower for faster generation
        normalize=True,                     # enable the external text normalization (TN) tool
        denoise=True,                       # enable the external denoising tool
        retry_badcase=True,                 # retry generation for some bad cases (unstoppable generation)
        retry_badcase_max_times=3,          # maximum number of retries
        retry_badcase_ratio_threshold=6.0,  # length-ratio threshold for bad-case detection (simple but effective); may need adjusting for slow-paced speech
    )
    
    # The third argument of sf.write is the sample rate in Hz.
    sf.write("output.wav", wav, 16000)
    print("saved: output.wav")

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.


Gated model
You can list files but not access them
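
Because the repository is gated, file downloads need an access token from an account that has accepted the conditions. The following is a minimal sketch of one way to fetch the adapter files with `hf_hub_download` from the `huggingface_hub` library. The helper name `download_adapter`, the `ADAPTER_FILES` list, and the choice of the `HF_TOKEN` environment variable are illustrative assumptions, not something this page prescribes; the file names are taken from the repository preview.

```python
import os
from pathlib import Path

REPO_ID = "voidful/SRFD-VoxCPM2"
# File names as they appear in the repository preview (assumed to be the adapter files needed).
ADAPTER_FILES = ["lora_config.json", "lora_weights.safetensors"]


def download_adapter(token=None, files=ADAPTER_FILES):
    """Hypothetical helper: download files from the gated repo, return {name: local path}."""
    token = token or os.environ.get("HF_TOKEN")
    if not token:
        # Fail early with a clear message instead of an HTTP error from the Hub.
        raise RuntimeError(
            "No token found: set HF_TOKEN to a token from an account "
            "that has accepted the access conditions."
        )
    # Imported lazily so the check above works even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return {
        name: Path(hf_hub_download(repo_id=REPO_ID, filename=name, token=token))
        for name in files
    }


# Usage (requires network access and a valid token):
#   paths = download_adapter()
#   print(paths["lora_config.json"])
```

Checking for the token before calling the Hub keeps the failure mode readable; without accepted conditions, the download itself would still be rejected by the server.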

Preview of files found in this repository
  • ablations: Upload SRFD-VoxCPM2 LoRA adapters and model card (18 days ago)
  • adapters: Upload SRFD-VoxCPM2 LoRA adapters and model card (18 days ago)
  • configs: Upload SRFD-VoxCPM2 LoRA adapters and model card (18 days ago)
  • metadata: Upload SRFD-VoxCPM2 LoRA adapters and model card (18 days ago)
  • reports: Upload SRFD-VoxCPM2 LoRA adapters and model card (18 days ago)
  • .gitattributes (134 Bytes): Upload SRFD-VoxCPM2 LoRA adapters and model card (18 days ago)
  • LICENSE (11.3 kB): Upload SRFD-VoxCPM2 LoRA adapters and model card (22 days ago)
  • README.md (4.88 kB): Upload SRFD-VoxCPM2 LoRA adapters and model card (18 days ago)
  • lora_config.json (852 Bytes): Upload compact 3-target SR-FD VoxCPM2 LoRA adapter (22 days ago)
  • lora_weights.safetensors (72.4 MB): Upload compact 3-target SR-FD VoxCPM2 LoRA adapter (22 days ago)
  • training_state.json (14 Bytes): Upload compact 3-target SR-FD VoxCPM2 LoRA adapter (22 days ago)