liumaolin committed on Jun 5, 2025

Commit

cf355e6

1 Parent(s): 1ae18a4

Update TTS speaker configuration: replace static mapping with dynamic retrieval, add available speaker listing, and update CLI argument parsing for improved flexibility and maintainability.
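
The lookup described in this commit message can be sketched in isolation as follows. This is a minimal sketch, not the project's code: `SpeakerConfig`, `REGISTERED_CONFIGS`, and the `Narrator` entry are hypothetical stand-ins for whatever `tts_config_registry.get_all_configs()` returns, and only the `character_name` attribute is assumed from the diff.

```python
import argparse
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SpeakerConfig:
    """Hypothetical stand-in for the project's TTS config objects."""
    character_name: str


# Hypothetical registry contents; the real list comes from the TTS registry.
REGISTERED_CONFIGS = [
    SpeakerConfig("Shen Yi"),
    SpeakerConfig("Luo Xiang"),
    SpeakerConfig("Narrator"),  # a speaker with no Chinese display name
]

CHINESE_TO_ENGLISH = {"沈逸": "Shen Yi", "罗翔": "Luo Xiang"}
ENGLISH_TO_CHINESE = {en: zh for zh, en in CHINESE_TO_ENGLISH.items()}


def find_config(speaker_name: str,
                configs: List[SpeakerConfig] = REGISTERED_CONFIGS) -> Optional[SpeakerConfig]:
    """Resolve a Chinese or English speaker name to a config, or None."""
    english_name = CHINESE_TO_ENGLISH.get(speaker_name, speaker_name)
    # Try the mapped name first, then the raw input as a fallback.
    for candidate in (english_name, speaker_name):
        for config in configs:
            if config.character_name == candidate:
                return config
    return None


def available_speaker_names(configs: List[SpeakerConfig] = REGISTERED_CONFIGS) -> List[str]:
    """Return display names, preferring the Chinese name when one is mapped."""
    return sorted(ENGLISH_TO_CHINESE.get(c.character_name, c.character_name)
                  for c in configs)


def build_parser(configs: List[SpeakerConfig] = REGISTERED_CONFIGS) -> argparse.ArgumentParser:
    """Build a parser whose --speaker choices follow the registered configs."""
    speakers = available_speaker_names(configs)
    default = "沈逸" if "沈逸" in speakers else (speakers[0] if speakers else None)
    parser = argparse.ArgumentParser()
    parser.add_argument("--speaker", "-s", choices=speakers, default=default)
    return parser
```

One consequence worth noting: because `choices` holds display names, the parser in this sketch rejects `--speaker "Shen Yi"` even though `find_config("Shen Yi")` would succeed. The diff below builds `choices` the same way, so the English-name fallback appears to be reachable only by callers that bypass the CLI.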

Files changed (1)

src/VoiceDialogue/main.py +116 -34

@@ -87,19 +87,13 @@ def launch_system(
     answer_generator_worker.start()
     threads.append(answer_generator_worker)
-    speaker_mapping = {
-        '罗翔': 'Luo Xiang',
-        '马保国': 'Ma Baoguo',
-        '沈逸': 'Shen Yi',
-        '杨幂': 'Yang Mi',
-        '周杰伦': 'Jay Zhou',
-        '马云': 'Ma Yun',
-    }
-    role = speaker_mapping.get(speaker)
-    if role is None:
-        raise ValueError(f"Unsupported TTS config: {speaker}")
-    tts_speaker_config = tts_config_registry.get_config(TTSConfigType.MOYOYO, role)
     audio_generator_worker = TTSAudioGenerator(
         text_input_queue=text_input_queue,
         audio_output_queue=audio_output_queue,
@@ -121,37 +115,100 @@ def launch_system(
         thread.join()
-def launch_api_server(host: str = "0.0.0.0", port: int = 8000, reload: bool = False):
     """
-    Start the API server
     Args:
-        host (str): Server host address, defaults to "0.0.0.0"
-        port (int): Server port, defaults to 8000
-        reload (bool): Whether to enable hot reload, defaults to False
     """
-    print(f'{"=" * 80}\nStarting API server...\n{"=" * 80}')
-    print(f"Server address: http://{host}:{port}")
-    print(f"API docs: http://{host}:{port}/docs")
-    print(f"Hot reload: {'enabled' if reload else 'disabled'}")
-    print(f'{"=" * 80}')
-    # Import and start the FastAPI app
-    uvicorn.run(
-        "api.app:app",
-        host=host,
-        port=port,
-        reload=reload,
-        log_level="info"
-    )
 def create_argument_parser():
     """Create the command-line argument parser"""
     parser = argparse.ArgumentParser(
         description="VoiceDialogue - Voice dialogue system",
         formatter_class=argparse.RawDescriptionHelpFormatter,
-        epilog="""
 Example usage:
   # Start in command-line mode (default)
   python main.py
@@ -169,7 +226,7 @@ def create_argument_parser():
   python main.py --mode api --port 8000 --reload
 Supported speakers:
-  罗翔, 马保国, 沈逸, 杨幂, 周杰伦, 马云
         """
     )
@@ -191,8 +248,8 @@ def create_argument_parser():
     )
     cli_group.add_argument(
         '--speaker', '-s',
-        choices=['罗翔', '马保国', '沈逸', '杨幂', '周杰伦', '马云'],
-        default='沈逸',
         help='TTS speaker (default: 沈逸)'
     )
@@ -218,6 +275,31 @@ def create_argument_parser():
     return parser
 def main():
     """
     Main program entry point

     answer_generator_worker.start()
     threads.append(answer_generator_worker)
+    # Fetch the TTS config dynamically instead of using a fixed mapping
+    tts_speaker_config = _get_tts_config_by_speaker_name(speaker)
+    if tts_speaker_config is None:
+        # If the requested speaker is not found, list all available speakers and raise
+        available_speakers = _get_available_speaker_names()
+        raise ValueError(f"Unsupported TTS speaker: {speaker}. Available speakers: {', '.join(available_speakers)}")
     audio_generator_worker = TTSAudioGenerator(
         text_input_queue=text_input_queue,
         audio_output_queue=audio_output_queue,
         thread.join()
+def _get_tts_config_by_speaker_name(speaker_name: str):
     """
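+    Look up the TTS config by speaker name
+    Supports both Chinese and English names; the Chinese-name mapping is tried first,
+    and if nothing matches, the input is searched for directly as an English name
     Args:
+        speaker_name (str): Speaker name
+    Returns:
+        BaseTTSConfig: The TTS config, or None if not found
     """
+    # Chinese-to-English name mapping (kept for backward compatibility)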
+    chinese_to_english_mapping = {
+        '罗翔': 'Luo Xiang',
+        '马保国': 'Ma Baoguo',
+        '沈逸': 'Shen Yi',
+        '杨幂': 'Yang Mi',
+        '周杰伦': 'Zhou Jielun',
+        '马云': 'Ma Yun',
+    }
+    # First try the Chinese-name mapping
+    english_name = chinese_to_english_mapping.get(speaker_name, speaker_name)
+    # Fetch all available configs
+    all_configs = tts_config_registry.get_all_configs()
+    # Search for a matching config
+    for config in all_configs:
+        if config.character_name == english_name:
+            return config
+    # If the mapped name was not found, try matching the input name directly
+    if speaker_name != english_name:
+        for config in all_configs:
+            if config.character_name == speaker_name:
+                return config
+    return None
+def _get_available_speaker_names():
+    """
+    Get the list of all available speaker names
+    Returns:
+        list[str]: List containing Chinese display names and original English names
+    """
+    # Chinese display-name mapping
+    english_to_chinese_mapping = {
+        'Luo Xiang': '罗翔',
+        'Ma Baoguo': '马保国',
+        'Shen Yi': '沈逸',
+        'Yang Mi': '杨幂',
+        'Zhou Jielun': '周杰伦',
+        'Ma Yun': '马云',
+    }
+    all_configs = tts_config_registry.get_all_configs()
+    speaker_names = []
+    for config in all_configs:
+        # Prefer the Chinese display name
+        chinese_name = english_to_chinese_mapping.get(config.character_name)
+        if chinese_name:
+            speaker_names.append(chinese_name)
+        else:
+            # If there is no Chinese mapping, use the original English name
+            speaker_names.append(config.character_name)
+    return sorted(speaker_names)
+def _update_argument_parser_speaker_choices():
+    """
+    Dynamically update the speaker choices in the command-line argument parser
+    Returns:
+        list[str]: List of available speaker choices
+    """
+    return _get_available_speaker_names()
 def create_argument_parser():
     """Create the command-line argument parser"""
+    # Fetch the list of available speakers dynamically
+    available_speakers = _update_argument_parser_speaker_choices()
     parser = argparse.ArgumentParser(
         description="VoiceDialogue - Voice dialogue system",
         formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog=f"""
 Example usage:
   # Start in command-line mode (default)
   python main.py
   python main.py --mode api --port 8000 --reload
 Supported speakers:
+  {', '.join(available_speakers)}
         """
     )
     )
     cli_group.add_argument(
         '--speaker', '-s',
+        choices=available_speakers,
+        default='沈逸' if '沈逸' in available_speakers else (available_speakers[0] if available_speakers else '沈逸'),
         help='TTS speaker (default: 沈逸)'
     )
     return parser
+def launch_api_server(host: str = "0.0.0.0", port: int = 8000, reload: bool = False):
+    """
+    Start the API server
+    Args:
+        host (str): Server host address, defaults to "0.0.0.0"
+        port (int): Server port, defaults to 8000
+        reload (bool): Whether to enable hot reload, defaults to False
+    """
+    print(f'{"=" * 80}\nStarting API server...\n{"=" * 80}')
+    print(f"Server address: http://{host}:{port}")
+    print(f"API docs: http://{host}:{port}/docs")
+    print(f"Hot reload: {'enabled' if reload else 'disabled'}")
+    print(f'{"=" * 80}')
+    # Import and start the FastAPI app
+    uvicorn.run(
+        "api.app:app",
+        host=host,
+        port=port,
+        reload=reload,
+        log_level="info"
+    )
 def main():
     """
     Main program entry point