Orphues_Q4_K_M

This repository contains a quantized GGUF version (Q4_K_M) of an Orpheus-based Text-to-Speech model, fine-tuned using Unsloth for efficient training and inference.

⚠️ Note:
This model is NOT trained or fine-tuned on Persian (Farsi) data.
It is intended for non-Persian (primarily English) text-to-speech use cases.


🔊 Model Overview

  • Base architecture: Orpheus TTS (LLM-based audio generation)
  • Backbone: LLaMA-style Transformer (~3B parameters)
  • Fine-tuning framework: Unsloth
  • Quantization: Q4_K_M (GGUF)
  • Inference target: llama.cpp / GGUF-compatible runtimes
  • Language support: Non-Persian (English-focused)

🧠 About Orpheus TTS

Orpheus is a modern LLM-based Text-to-Speech architecture that generates audio by predicting audio tokens instead of mel-spectrograms.
It enables:

  • Natural prosody
  • Expressive speech
  • Low-latency generation
  • Zero-shot or few-shot voice adaptation (depending on setup)

This model follows that paradigm and has been adapted via fine-tuning.
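To make the token-based paradigm concrete, here is a purely illustrative sketch of the post-generation step: the model emits a mix of text/control tokens and discrete audio-codec tokens, and the audio tokens must be filtered out and passed to a separate codec decoder that reconstructs the waveform. The token-ID boundary and the sequence below are made up for illustration; they do not reflect this model's actual vocabulary layout.

```python
# Illustrative sketch only: the vocabulary split below is hypothetical.
# An LLM-based TTS model interleaves ordinary text/control tokens with
# discrete audio-codec tokens; downstream, the audio tokens are handed
# to a neural codec decoder that turns them into a waveform.

AUDIO_TOKEN_START = 128_266  # hypothetical first audio-token ID

def split_audio_tokens(generated_ids):
    """Separate codec (audio) tokens from ordinary text tokens."""
    audio = [t for t in generated_ids if t >= AUDIO_TOKEN_START]
    text = [t for t in generated_ids if t < AUDIO_TOKEN_START]
    return text, audio

# Toy generated sequence mixing both kinds of IDs:
ids = [15, 42, 128_300, 128_301, 7, 128_999]
text_ids, audio_ids = split_audio_tokens(ids)
print(text_ids)   # [15, 42, 7]
print(audio_ids)  # [128300, 128301, 128999]
```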


πŸ› οΈ Fine-Tuning Details

  • Fine-tuning method: LoRA-based fine-tuning via Unsloth
  • Training focus: Improving speech naturalness and stability
  • Tokenizer: Original LLaMA-compatible tokenizer
  • No Persian data used
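LoRA keeps the base weights frozen and trains only a pair of low-rank factors per adapted matrix, which is what makes Unsloth-style fine-tuning cheap. The quick calculation below shows the trainable-parameter savings; the hidden size and rank are hypothetical example values, not this model's actual configuration.

```python
# LoRA sketch: instead of updating a full d_out x d_in weight matrix W,
# train two low-rank factors B (d_out x r) and A (r x d_in), so the
# effective weight is W + B @ A. Dimensions here are hypothetical.

d_in, d_out, r = 3072, 3072, 16  # example hidden size and LoRA rank

full_update_params = d_out * d_in       # params in a full fine-tune of W
lora_params = d_out * r + r * d_in      # params in the LoRA factors

print(full_update_params)               # 9437184
print(lora_params)                      # 98304
print(full_update_params / lora_params) # 96.0
```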

βš™οΈ Quantization Details

  • Format: GGUF
  • Quantization type: Q4_K_M
  • Optimized for:
    • Reduced VRAM usage
    • Fast CPU/GPU inference
  • Recommended runtime: llama.cpp (latest version)
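A back-of-envelope estimate shows why Q4_K_M helps for a ~3B-parameter model. The ~4.8 bits-per-weight figure is an approximation (Q4_K_M mixes 4- and 6-bit blocks, and real GGUF files also carry metadata), so treat the numbers as rough.

```python
# Rough size estimate for a ~3B-parameter model. The ~4.8 bits/weight
# figure for Q4_K_M is an approximation; actual file sizes vary with
# tensor layout and included metadata.

PARAMS = 3e9

def model_size_gib(bits_per_weight):
    return PARAMS * bits_per_weight / 8 / 2**30

fp16 = model_size_gib(16)     # ~5.59 GiB
q4_k_m = model_size_gib(4.8)  # ~1.68 GiB
print(round(fp16, 2), round(q4_k_m, 2))
```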

🚀 Usage Example (llama.cpp)

```bash
./main \
  -m Orphues_Q4_K_M.gguf \
  --temp 0.7 \
  --ctx-size 4096
```

Note: in recent llama.cpp builds the `main` binary has been renamed `llama-cli`. Also keep in mind that llama.cpp emits the model's audio tokens as text; turning them into a waveform requires a separate audio-codec decoder.