CredResolve Voice (Swar)

CredResolve Voice is a production-style multilingual text-to-speech (TTS) application with reusable voice cloning, voice versioning, and a polished web interface.

This repository includes:

  • a CredResolve-branded web app
  • live HTTP audio streaming
  • live WebSocket audio streaming
  • Prometheus metrics for HTTP, WebSocket, and TTS lifecycle monitoring
  • a ready-to-run Grafana and Prometheus monitoring stack
  • reusable public voice IDs such as voice_cr_a91f0c2d
  • voice version management under the same voice ID
  • a developer-facing Integrate page
  • a FastAPI backend with typed request handling

Product Overview

CredResolve Voice is built as a full application layer for real-world voice operations. The app is designed for product teams, internal tools, and API-based integrations that need:

  • text-to-speech generation
  • saved voice cloning
  • versioned voice updates
  • playback and download
  • developer-friendly streaming endpoints

UI Included

The current UI includes these sections:

  • Studio

    • text input
    • language selection
    • auto, cloned, and design voice modes
    • male and female voice selection
    • sample rate selection for 24000 Hz and 8000 Hz
    • diffusion controls
    • expression tag shortcuts such as [laughter]
    • streamed audio output with metrics
  • Voice Clone

    • reference audio upload
    • optional reference text
    • display name
    • preprocessing toggle
    • reusable saved voice creation
  • Voices

    • saved voice library
    • stable public voice IDs
    • active version tracking
    • create new versions under the same voice ID
    • commit a version live for TTS
    • delete old versions safely
  • Integrate

    • HTTP stream endpoint
    • WebSocket stream endpoint
    • request field guide
    • cURL example
    • WebSocket URL and first-message example

Voice Registry

Each cloned voice gets a stable public voice ID.

Example:

voice_cr_a91f0c2d

Each voice can have multiple versions. When a new version is committed:

  • the voice_id stays the same
  • the active version changes
  • future TTS requests use the committed version

This makes it possible to improve or replace a saved voice without breaking integrations that already use the public voice ID.
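
The relationship between a stable voice ID and its versions can be sketched as a small data structure. This is an illustrative model only, not the code in omnivoice/app/registry.py: the class name `VoiceRecord`, its fields, and the rule that the first version becomes active automatically are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class VoiceRecord:
    """Hypothetical in-memory model of one saved voice and its versions."""

    voice_id: str  # stable public ID, e.g. "voice_cr_a91f0c2d"
    versions: Dict[str, str] = field(default_factory=dict)  # version_id -> asset path
    active_version: Optional[str] = None

    def add_version(self, version_id: str, asset_path: str) -> None:
        # Adding a version does not change which version serves TTS requests,
        # except for the very first one (an assumption of this sketch).
        self.versions[version_id] = asset_path
        if self.active_version is None:
            self.active_version = version_id

    def commit(self, version_id: str) -> None:
        # Committing switches the active version; voice_id is never modified.
        if version_id not in self.versions:
            raise KeyError(f"unknown version: {version_id}")
        self.active_version = version_id

    def resolve(self) -> str:
        # The asset a TTS request for this voice_id would load.
        if self.active_version is None:
            raise LookupError("voice has no versions")
        return self.versions[self.active_version]
```

Callers only ever hold `voice_id`, so switching the active version with `commit` changes what `resolve` returns without any change on the caller's side.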

Streaming Behavior

The application supports two streaming paths:

  • HTTP

    • POST /api/v1/tts/stream
    • returns the audio as a WAV byte stream
    • starts with a WAV header
    • then streams audio chunk by chunk as chunks are generated
  • WebSocket

    • WS /api/v1/tts/stream/ws
    • sends metadata first
    • then sends binary audio frames progressively
    • ends with a completion message
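
Because the HTTP path sends the WAV header before any audio exists, the header cannot carry the final data length. One common way to handle this is to write placeholder sizes, as in the sketch below. It assumes 16-bit mono PCM and a `0xFFFFFFFF` placeholder; the source does not state the sample format or how the server fills these fields, and `streaming_wav_header` is a hypothetical helper name.

```python
import struct


def streaming_wav_header(sample_rate: int, channels: int = 1, bits: int = 16) -> bytes:
    """Build a 44-byte PCM WAV header with placeholder sizes for streaming."""
    byte_rate = sample_rate * channels * bits // 8
    block_align = channels * bits // 8
    unknown = 0xFFFFFFFF  # total length is not known until the stream ends
    return (
        b"RIFF" + struct.pack("<I", unknown) + b"WAVE"
        # fmt chunk: size 16, format 1 (PCM), then the audio parameters
        + b"fmt " + struct.pack("<IHHIIHH", 16, 1, channels, sample_rate,
                                byte_rate, block_align, bits)
        + b"data" + struct.pack("<I", unknown)
    )
```

A client that saves the stream to disk may want to rewrite the two size fields once the download finishes, since some players reject placeholder lengths.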

API Endpoints

Main app endpoints:

  • GET /api/v1/capabilities
  • GET /api/v1/languages
  • GET /api/v1/voices
  • POST /api/v1/voices/clone
  • POST /api/v1/voices/{voice_id}/versions
  • POST /api/v1/voices/{voice_id}/versions/{version_id}/commit
  • DELETE /api/v1/voices/{voice_id}
  • DELETE /api/v1/voices/{voice_id}/versions/{version_id}
  • POST /api/v1/tts/synthesize
  • POST /api/v1/tts/stream
  • WS /api/v1/tts/stream/ws

Run Locally

Install dependencies with uv (which also creates the virtual environment if it does not exist), activate the environment, and launch the app:

uv sync
source .venv/bin/activate
python3 -m omnivoice.cli.demo --ip 0.0.0.0 --port 8001

You can also use the console script:

credresolve-voice --ip 0.0.0.0 --port 8001

Common Launch Options

python3 -m omnivoice.cli.demo \
  --ip 0.0.0.0 \
  --port 8001 \
  --device cuda \
  --data-dir /home/ubuntu/credresolve_Multi/.credresolve_voice

Useful flags:

  • --model
  • --device
  • --ip
  • --port
  • --root-path
  • --no-asr
  • --data-dir

Data Storage

By default, application data is stored under .credresolve_voice/.

This includes:

  • saved voice assets
  • voice registry metadata
  • uploaded reference audio
  • generated output WAV files

These runtime artifacts are ignored by Git through .gitignore.

Example HTTP Stream Request

curl -X POST "http://127.0.0.1:8001/api/v1/tts/stream" \
  -H "Content-Type: application/json" \
  --output credresolve-stream.wav \
  -d '{
    "text": "This is a CredResolve Voice streaming request.",
    "language": "auto",
    "voice_mode": "auto",
    "gender": "female",
    "sample_rate": 8000,
    "speed": 1.0,
    "diffusion_steps": 32,
    "guidance_scale": 2.0,
    "denoise": true,
    "preprocess_prompt": true,
    "postprocess_output": true
  }'
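
The same request can be made from Python using only the standard library. This is a minimal sketch that mirrors the cURL call above; `build_stream_request` and `stream_to_file` are hypothetical helper names, and the 4096-byte read size is an arbitrary choice.

```python
import json
import urllib.request


def build_stream_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Prepare a POST to the HTTP stream endpoint with a JSON body."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/tts/stream",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stream_to_file(request: urllib.request.Request, path: str, chunk_size: int = 4096) -> int:
    """Write the response to disk as it arrives and return the byte count."""
    total = 0
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:  # empty read means the server closed the stream
                break
            out.write(chunk)
            total += len(chunk)
    return total


# Example usage against a locally running server:
# req = build_stream_request("http://127.0.0.1:8001", {
#     "text": "This is a CredResolve Voice streaming request.",
#     "language": "auto",
#     "voice_mode": "auto",
#     "sample_rate": 8000,
# })
# stream_to_file(req, "credresolve-stream.wav")
```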

Example WebSocket Request

Connect to:

ws://127.0.0.1:8001/api/v1/tts/stream/ws

Then send:

{
  "text": "Streaming over WebSocket",
  "language": "auto",
  "voice_mode": "auto",
  "gender": "auto",
  "sample_rate": 24000
}
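
On the receiving side, a client has to separate the text frames (metadata and completion) from the binary audio frames. The sketch below shows one way to do that over any iterable of received messages. It assumes control messages arrive as JSON text frames, with the first treated as metadata and the last as completion; the source does not document the fields inside those messages, and `collect_stream` is a hypothetical helper name.

```python
import json
from typing import Iterable, Union


def collect_stream(messages: Iterable[Union[str, bytes]]) -> dict:
    """Split a WebSocket session into control messages and audio bytes."""
    control = []         # decoded JSON text frames, in arrival order
    audio = bytearray()  # concatenated binary frames
    for message in messages:
        if isinstance(message, (bytes, bytearray)):
            audio.extend(message)
        else:
            control.append(json.loads(message))
    return {
        "metadata": control[0] if control else None,
        "completion": control[-1] if len(control) > 1 else None,
        "audio": bytes(audio),
    }
```

Many WebSocket client libraries deliver text frames as `str` and binary frames as `bytes`, so the received messages can usually be passed to this function unchanged.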

Notes

  • 8000 Hz output is supported for telephony-style use cases.
  • Expression tags are supported directly in text, for example [laughter].
  • Saved voices can be selected in Studio or through the API using voice_id.
  • Prometheus metrics are exposed at GET /metrics.
  • Monitoring setup is documented in docs/monitoring.md.
  • The web UI is under omnivoice/app/templates/ and omnivoice/app/static/.
  • The FastAPI server is defined in omnivoice/app/server.py.
  • The main application service layer is in omnivoice/app/service.py.

Repository Structure

Key files and folders:

  • omnivoice/app/server.py
  • omnivoice/app/service.py
  • omnivoice/app/registry.py
  • omnivoice/app/generation.py
  • omnivoice/app/schemas.py
  • omnivoice/app/templates/index.html
  • omnivoice/app/static/app.js
  • omnivoice/app/static/styles.css
  • omnivoice/cli/demo.py

Status

This repository is currently set up as the CredResolve Voice application codebase, with branding, UI, streaming APIs, saved voice registry support, and voice version management in place.
