jspaulsen/unmute-encoder

Tags: Transformers, Safetensors, Moshi, English, French, audio, speaker-embedding, voice-cloning, tts

Instructions to use jspaulsen/unmute-encoder with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

  • Libraries
  • Transformers

    How to use jspaulsen/unmute-encoder with Transformers:

    # Load model directly
    from transformers import AutoModel
    model = AutoModel.from_pretrained("jspaulsen/unmute-encoder", dtype="auto")
  • Moshi

    How to use jspaulsen/unmute-encoder with Moshi:

    # pip install moshi
    # Run the interactive web server
    python -m moshi.server --hf-repo "jspaulsen/unmute-encoder"
    # Then open http://localhost:8998 in your browser (https only if the server is started with SSL)

    # pip install moshi
    import torch
    from moshi.models import loaders
    
    # Load checkpoint info from HuggingFace
    checkpoint = loaders.CheckpointInfo.from_hf_repo("jspaulsen/unmute-encoder")
    
    # Load the Mimi audio codec
    mimi = checkpoint.get_mimi(device="cuda")
    mimi.set_num_codebooks(8)
    
    # Encode audio (24kHz, mono)
    wav = torch.randn(1, 1, 24000 * 10)  # [batch, channels, samples]
    with torch.no_grad():
        codes = mimi.encode(wav.cuda())
        decoded = mimi.decode(codes)
  • Notebooks
  • Google Colab
  • Kaggle
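
The Moshi snippet above encodes 10 seconds of 24 kHz mono audio with 8 codebooks. As a rough sanity check on the result, the sketch below estimates the shape one would expect for `codes`. It assumes Mimi's usual frame rate of 12.5 Hz and that a trailing partial frame is rounded up; neither is stated on this page, and `expected_code_shape` is a hypothetical helper, not part of the `moshi` package.

```python
import math

SAMPLE_RATE = 24_000  # from the snippet above: 24 kHz mono input
FRAME_RATE = 12.5     # assumption: Mimi's usual frame rate in Hz
NUM_CODEBOOKS = 8     # from mimi.set_num_codebooks(8)


def expected_code_shape(batch: int, num_samples: int) -> tuple[int, int, int]:
    """Estimate the (batch, codebooks, frames) shape of the encoded codes."""
    samples_per_frame = int(SAMPLE_RATE / FRAME_RATE)  # 1920 samples per frame
    # Assumption: a trailing partial frame is rounded up to a whole frame.
    frames = math.ceil(num_samples / samples_per_frame)
    return (batch, NUM_CODEBOOKS, frames)


# 10 seconds of audio, as in torch.randn(1, 1, 24000 * 10)
print(expected_code_shape(1, SAMPLE_RATE * 10))  # (1, 8, 125)
```

If the shape returned by `mimi.encode` differs, the checkpoint may use a different frame rate or codebook count than assumed here.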
unmute-encoder
460 MB
  • 1 contributor
History: 5 commits
jspaulsen
Upload unmute encoder checkpoint: README.md
657c4b1 verified 3 months ago
  • .gitattributes
    1.52 kB
    initial commit 4 months ago
  • README.md
    1.6 kB
    Upload unmute encoder checkpoint: README.md 3 months ago
  • model.safetensors
    460 MB
    xet
    Upload unmute encoder checkpoint: model.safetensors 3 months ago
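
The file list above shows a single `model.safetensors` and no separate config file, so a reasonable first step may be to look at the tensor names and shapes before choosing a loader. The sketch below reads only the safetensors header (an 8-byte little-endian length followed by a JSON table) using the standard library. The local path and the helper names `read_safetensors_header` and `summarize` are assumptions for illustration; the tensor names in this particular checkpoint are not known from this page.

```python
import json
import struct


def read_safetensors_header(path: str) -> dict:
    """Return the JSON header of a .safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header size as a little-endian unsigned 64-bit int.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize(path: str) -> list[tuple[str, str, list[int]]]:
    """List (name, dtype, shape) for every tensor, sorted by name."""
    header = read_safetensors_header(path)
    return sorted((name, info["dtype"], info["shape"]) for name, info in header.items())


if __name__ == "__main__":
    import os

    path = "model.safetensors"  # hypothetical local path to the downloaded file
    if os.path.exists(path):
        for name, dtype, shape in summarize(path):
            print(name, dtype, shape)
```

Because only the header is read, this runs quickly even on the 460 MB file and needs neither `torch` nor `safetensors` installed.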