How to use MoYoYoTech/VoiceDialogue with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-to-speech", model="MoYoYoTech/VoiceDialogue")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("MoYoYoTech/VoiceDialogue", dtype="auto")
How to use MoYoYoTech/VoiceDialogue with llama.cpp:
Install with Homebrew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf MoYoYoTech/VoiceDialogue:Q6_K
# Run inference directly in the terminal:
llama-cli -hf MoYoYoTech/VoiceDialogue:Q6_K
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf MoYoYoTech/VoiceDialogue:Q6_K
# Run inference directly in the terminal:
llama-cli -hf MoYoYoTech/VoiceDialogue:Q6_K
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf MoYoYoTech/VoiceDialogue:Q6_K
# Run inference directly in the terminal:
./llama-cli -hf MoYoYoTech/VoiceDialogue:Q6_K
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf MoYoYoTech/VoiceDialogue:Q6_K
# Run inference directly in the terminal:
./build/bin/llama-cli -hf MoYoYoTech/VoiceDialogue:Q6_K
Use Docker
docker model run hf.co/MoYoYoTech/VoiceDialogue:Q6_K
How to use MoYoYoTech/VoiceDialogue with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for MoYoYoTech/VoiceDialogue to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for MoYoYoTech/VoiceDialogue to start chatting
Use Hugging Face Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for MoYoYoTech/VoiceDialogue to start chatting
How to use MoYoYoTech/VoiceDialogue with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf MoYoYoTech/VoiceDialogue:Q6_K
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default MoYoYoTech/VoiceDialogue:Q6_K
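Because `llama-server` exposes an OpenAI-compatible API, the endpoint Hermes is pointed at can also be exercised from Python. The following is a minimal standard-library sketch, assuming the server started above is listening on its default address `127.0.0.1:8080`. The helper names `build_chat_request` and `ask` are illustrative and belong to none of the tools above.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080/v1"  # same value passed to `hermes config set model.base_url`
MODEL = "MoYoYoTech/VoiceDialogue:Q6_K"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request and return the reply text (requires a running server)."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `ask("Who are you?")` only succeeds while the server is up; `build_chat_request` can be inspected offline to check the URL and payload.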
Integrate `SileroVAD` into `SpeechMonitor` for optional voice activity detection. Add `_detect_speech()` method and update queue handling logic. Implement `SileroVAD` as a singleton for efficient model management.
4e2e3d8
liumaolin committed on
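The singleton idea in this commit can be illustrated with a short sketch. This is not the project's `SileroVAD` class: `VadSingleton`, its stub detector, and the `0.01` threshold are hypothetical stand-ins, chosen so the example runs without model weights.

```python
import threading


class VadSingleton:
    """Hypothetical stand-in for a VAD wrapper that loads its model only once."""

    _instance = None
    _lock = threading.Lock()
    load_count = 0  # how many times the (stub) model was loaded

    def __new__(cls):
        # Double-checked locking: only the first caller pays for model loading.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    instance = super().__new__(cls)
                    instance._model = cls._load_model()
                    cls._instance = instance
        return cls._instance

    @classmethod
    def _load_model(cls):
        # Placeholder for an expensive load; a real wrapper would load VAD weights here.
        cls.load_count += 1
        return lambda chunk: any(abs(sample) > 0.01 for sample in chunk)

    def is_speech(self, chunk):
        """Return True when the stub detector sees a sample above its threshold."""
        return self._model(chunk)
```

Every `VadSingleton()` call returns the same object, so several components can share one loaded model instead of each holding a copy.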
Increase queue timeout in audio and text processing services for smoother task handling
b446464
Refactor audio processing pipeline to normalize data in `SpeechMonitor` and streamline queuing in `AudioCapture`
57b0084
Update `AudioCapture` to support both PyAudio and macOS native AEC+VAD libraries
99e8988
Refactor to replace `EchoCancellingAudioCapture` with `AudioCapture` across the codebase for improved clarity and flexibility
7437d6d
Clean input text in MoYoYo TTS by removing punctuation for better processing
8587958
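One way to strip punctuation before synthesis is sketched below. The commit does not say which characters it removes, so this version makes an assumption: it drops every character in a Unicode punctuation category, which covers both ASCII and full-width Chinese marks. The function name `strip_punctuation` is illustrative.

```python
import unicodedata


def strip_punctuation(text):
    """Drop every Unicode punctuation character, then collapse runs of whitespace."""
    kept = (ch for ch in text if not unicodedata.category(ch).startswith("P"))
    return " ".join("".join(kept).split())
```

Apostrophes count as punctuation under this rule, so `don't` becomes `dont`; a production cleaner may want to keep them.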
Refactor LlamaCpp initialization to simplify parameter handling and remove unused callback manager
941bf07
Enable debug mode with global configuration and detailed task logging when active
e0f42b2
Reset task ID in speech recognizer for empty transcriptions to prevent errors
2291ed2
Add new voice model "Doubao" to MoYoYo configuration
7b003c4
Update performance logging format in TTS player for improved structure and readability
d0c1c61
Enhance launcher startup log formatting for improved readability and visual appeal
fa296dd
Remove commented-out performance logging code from TTS player
c3e85a2
Add new voice model "Ellen" to MoYoYo configuration
b5b48f0
Update MoYoYo TTS prompt text for improved relevance and clarity
8228973
Improve speech recognizer to handle empty transcriptions
0cbda14
Simplify system prompts for text generation in Chinese and English
c545fd9
Add new voice model "Juniper" to MoYoYo configuration
469433f
Standardize punctuation for system prompts in both Chinese and English text generation modules.
bedd7b8
Enhance WebSocket handling for connection management and reliability
b115e26
Add session validation checks to `player.py` and `generator.py`
29766c6
Refactor WebSocket handling with connection manager
300d567
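A connection manager of the kind this commit mentions usually keeps a set of open sockets and prunes the ones that fail. The sketch below shows that general pattern rather than the project's implementation. It relies only on each connection offering an async `send_text` method, which FastAPI's `WebSocket` provides; `FakeSocket` is a hypothetical test double.

```python
import asyncio


class ConnectionManager:
    """Tracks open connections and broadcasts text to all of them."""

    def __init__(self):
        self._connections = set()

    def connect(self, ws):
        self._connections.add(ws)

    def disconnect(self, ws):
        self._connections.discard(ws)

    @property
    def count(self):
        return len(self._connections)

    async def broadcast(self, message):
        """Send to every connection and drop any whose send raises."""
        for ws in list(self._connections):  # copy: the set may shrink while iterating
            try:
                await ws.send_text(message)
            except Exception:
                self.disconnect(ws)


class FakeSocket:
    """Test double exposing the same `send_text` coroutine as a real WebSocket."""

    def __init__(self, broken=False):
        self.broken = broken
        self.sent = []

    async def send_text(self, message):
        if self.broken:
            raise ConnectionError("peer went away")
        self.sent.append(message)
```

Centralizing the bookkeeping this way means a dead client is removed on the first failed send instead of raising inside whichever service tried to notify it.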
Replace `logging` with centralized `loguru`-based logger across all modules.
851495c
Refactor response generation logic in `generator.py`
ce3d9e5
Remove unused conditional logic for second answer handling in `player.py` and `generator.py`
c1b24fd
Adjust context window allocation logic based on memory tiers in `apple_silicon.py`
fd3c30a
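Tier-based allocation can be expressed as a small lookup. The thresholds and context sizes below are invented for illustration and are not the values in `apple_silicon.py`, which this log does not show.

```python
# Hypothetical tiers: (minimum unified memory in GB, context window in tokens),
# ordered from largest to smallest so the first match wins.
MEMORY_TIERS = [(64, 32768), (32, 16384), (16, 8192)]
DEFAULT_CONTEXT = 4096  # fallback for machines below the smallest tier


def pick_context_window(memory_gb):
    """Return the context window for the highest tier the machine qualifies for."""
    for min_gb, n_ctx in MEMORY_TIERS:
        if memory_gb >= min_gb:
            return n_ctx
    return DEFAULT_CONTEXT
```

Keeping the tiers in one table makes retuning a data change, not a logic change.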
Comment out unused Kokoro TTS voice configurations
23c146f
Add Maple and Cove voice models to MoYoYo TTS configuration
7e92ad3
Introduce Apple Silicon hardware optimization and dynamic LLM configuration
bdc3b7b
Update LLM response generator and system prompts
6f77a29
Update static file routing and root endpoint for frontend integration
f7b034a
Add robust lifecycle management for `audio_player` service in system routes
627c3e7
Standardize service lifecycle management by replacing `stop` with `exit` and introducing `is_exited` check
f5226c0
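An `exit` / `is_exited` pair like the one named in this commit is commonly built on `threading.Event`. The sketch below is one such arrangement and assumes the services are threads; `BaseService` and its `ticks` counter are hypothetical and not taken from the repository.

```python
import threading


class BaseService(threading.Thread):
    """Hypothetical worker thread with a cooperative shutdown flag."""

    def __init__(self):
        super().__init__(daemon=True)  # daemon threads do not block interpreter shutdown
        self._exit_event = threading.Event()
        self.ticks = 0

    @property
    def is_exited(self):
        return self._exit_event.is_set()

    def exit(self):
        """Ask the run loop to finish; safe to call more than once."""
        self._exit_event.set()

    def run(self):
        while not self.is_exited:
            self.ticks += 1  # stand-in for one unit of real work
            self._exit_event.wait(0.01)  # sleeps, but wakes immediately on exit()
```

Waiting on the event instead of calling `time.sleep` lets `exit()` interrupt the pause, so shutdown does not have to wait out the polling interval.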
Remove `voice_schemas.py` and refactor schema imports for TTS and ASR modules in `__init__.py`
4895dc2
Refactor speech recognizer, audio capture, and system routes for improved clarity and functionality
037e5ae
Add pause and resume functionality to voice dialogue system
d701b8a
Refactor project: split `main.py` functionality into modular components under `cli`, `core`, and `config`.
d08a15b
Increase service startup timeouts and set daemon mode for services.
61524a8
Refactor imports in `whisper.py` and `funasr.py` to use absolute paths for `ensure_minimum_audio_duration`.
d673573
Update `moyoyo.py`: add fallback for `utils` to ensure `HParams` availability in runtime.
bd3673b
Refactor imports for consistency in `kokoro.py` and `processor.py`. Use absolute paths for better readability and maintainability.
8630353
Update `paths.py`: improve PROJECT_ROOT resolution with `_MEIPASS` support and enhance third-party path handling.
664d767
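`_MEIPASS` is the attribute PyInstaller's bootloader sets on `sys` to point at the directory where a frozen app's bundled files are unpacked. A root-resolution helper that honours it might look like the sketch below. The name `resolve_project_root` and its `fallback` parameter are assumptions; the actual logic in `paths.py` may differ.

```python
import sys
from pathlib import Path


def resolve_project_root(fallback):
    """Return the PyInstaller bundle dir when frozen, else the resolved fallback path."""
    bundle_dir = getattr(sys, "_MEIPASS", None)  # only present inside a frozen app
    if bundle_dir:
        return Path(bundle_dir)
    return Path(fallback).resolve()
```

In a normal checkout the fallback would typically be derived from `__file__`, while a packaged build resolves resources relative to the bundle directory.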
Rename 'src/VoiceDialogue' to 'src/voice_dialogue'.
511ff0c
Revamp API core description: expand feature details for ASR, LLMs, TTS, system control, and real-time communication; improve clarity and structure of documentation.
c57de2a
Update project requirements.
ccdd95f
Integrate WebSocket support: add `/api/v1/ws` endpoint, enable real-time message handling via `websocket_message_queue`, and refactor services and models to support WebSocket-based question and answer updates.
2534744
Refactor `SessionIdManager` module: rename `session_id_manager.py` to `session_manager.py` and update imports accordingly.
83ef092
Refactor session ID handling: replace `current_session_id` with `SessionIdManager` for thread-safe management, update related imports and references.
92bb56d
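Replacing a bare module-level variable with a manager object lets every read and write go through one lock. The sketch below shows that idea only: `SessionStore` and its methods are hypothetical and do not reproduce the interface of the project's `SessionIdManager`.

```python
import threading
import uuid


class SessionStore:
    """Hypothetical lock-guarded holder for the current session id."""

    def __init__(self):
        self._lock = threading.Lock()
        self._session_id = None

    def new_session(self):
        """Start a new session and return its id, invalidating the previous one."""
        with self._lock:
            self._session_id = uuid.uuid4().hex
            return self._session_id

    def get(self):
        with self._lock:
            return self._session_id

    def is_current(self, session_id):
        """True only if `session_id` matches the active session."""
        with self._lock:
            return session_id is not None and session_id == self._session_id
```

A worker that captured an id before a long-running step can call `is_current` afterwards and discard its result if the session has since changed.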
Refactor `__init__.py` in TTS runtime: streamline `__all__` handling, improve logging for import failures, and enhance maintainability of module exports.
6f036c6
Serve static frontend assets through FastAPI: mount static files and replace root endpoint response with `index.html`.