Text-to-Speech
Transformers
ONNX
GGUF
Chinese
English
voice-dialogue
speech-recognition
large-language-model
asr
tts
llm
chinese
english
real-time
conversational
Instructions to use MoYoYoTech/VoiceDialogue with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use MoYoYoTech/VoiceDialogue with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-to-speech", model="MoYoYoTech/VoiceDialogue")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("MoYoYoTech/VoiceDialogue", dtype="auto")
- llama-cpp-python
How to use MoYoYoTech/VoiceDialogue with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="MoYoYoTech/VoiceDialogue",
    filename="assets/models/llm/qwen/Qwen3-8B-Q6_K.gguf",
)
# create_chat_completion expects a list of role/content messages, not a bare string
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "The answer to the universe is 42"},
    ]
)
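`create_chat_completion` returns an OpenAI-style response dictionary. The following is a minimal sketch of pulling the reply text out of such a response, assuming the usual `choices[0].message.content` layout. `extract_reply` is a hypothetical helper name, and the sample response is hand-written for illustration rather than real model output.

```python
from typing import Any


def extract_reply(response: dict[str, Any]) -> str:
    """Return the assistant text from an OpenAI-style chat completion dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    message = choices[0].get("message") or {}
    content = message.get("content")
    if content is None:
        raise ValueError("first choice has no message content")
    return content.strip()


# Hand-written stand-in for what llm.create_chat_completion(...) returns.
sample_response = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": " Hello! "}}
    ]
}
reply = extract_reply(sample_response)
```

Raising on a missing field, rather than returning an empty string, keeps a malformed response from being passed silently to the TTS stage.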
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- llama.cpp
How to use MoYoYoTech/VoiceDialogue with llama.cpp:
Install (macOS, Linux)
curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf MoYoYoTech/VoiceDialogue:Q6_K
# Run inference directly in the terminal:
llama cli -hf MoYoYoTech/VoiceDialogue:Q6_K
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf MoYoYoTech/VoiceDialogue:Q6_K
# Run inference directly in the terminal:
llama cli -hf MoYoYoTech/VoiceDialogue:Q6_K
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf MoYoYoTech/VoiceDialogue:Q6_K
# Run inference directly in the terminal:
./llama-cli -hf MoYoYoTech/VoiceDialogue:Q6_K
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf MoYoYoTech/VoiceDialogue:Q6_K
# Run inference directly in the terminal:
./build/bin/llama-cli -hf MoYoYoTech/VoiceDialogue:Q6_K
Use Docker
docker model run hf.co/MoYoYoTech/VoiceDialogue:Q6_K
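Once one of the servers above is running, it can be queried over HTTP. The following is a minimal standard-library sketch, assuming the server listens on `http://localhost:8080/v1` (the base URL that appears in the Pi configuration on this page) and exposes the OpenAI-style `/chat/completions` route. `build_chat_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: the local server is reachable at this base URL.
BASE_URL = "http://localhost:8080/v1"


def build_chat_request(
    prompt: str, model: str = "MoYoYoTech/VoiceDialogue:Q6_K"
) -> urllib.request.Request:
    """Build a POST request carrying a single-turn chat completion payload."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt to the local server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Building the request does not touch the network; calling ask() does.
request = build_chat_request("Who are you?")
```

Keeping request construction separate from the network call makes the payload easy to inspect before a server is available.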
- LM Studio
- Jan
- Ollama
How to use MoYoYoTech/VoiceDialogue with Ollama:
ollama run hf.co/MoYoYoTech/VoiceDialogue:Q6_K
- Unsloth Studio
How to use MoYoYoTech/VoiceDialogue with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for MoYoYoTech/VoiceDialogue to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for MoYoYoTech/VoiceDialogue to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for MoYoYoTech/VoiceDialogue to start chatting
- Pi
How to use MoYoYoTech/VoiceDialogue with Pi:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf MoYoYoTech/VoiceDialogue:Q6_K
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "MoYoYoTech/VoiceDialogue:Q6_K" }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory:
pi
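The `models.json` provider entry can also be generated instead of typed by hand. The following is a small sketch that reproduces the structure shown in the Pi configuration; `pi_models_config` is a hypothetical helper, and writing the result to `~/.pi/agent/models.json` is left to the caller so that an existing file is not overwritten by accident.

```python
import json


def pi_models_config(model_id: str, base_url: str = "http://localhost:8080/v1") -> dict:
    """Return a provider entry pointing Pi at a local OpenAI-compatible server."""
    return {
        "providers": {
            "llama-cpp": {
                "baseUrl": base_url,
                "api": "openai-completions",
                "apiKey": "none",
                "models": [{"id": model_id}],
            }
        }
    }


# Serialize with indentation so the file stays readable and easy to merge by hand.
config_text = json.dumps(pi_models_config("MoYoYoTech/VoiceDialogue:Q6_K"), indent=2)
```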
- Hermes Agent
How to use MoYoYoTech/VoiceDialogue with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf MoYoYoTech/VoiceDialogue:Q6_K
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default MoYoYoTech/VoiceDialogue:Q6_K
Run Hermes
hermes
- Atomic Chat
- Docker Model Runner
How to use MoYoYoTech/VoiceDialogue with Docker Model Runner:
docker model run hf.co/MoYoYoTech/VoiceDialogue:Q6_K
- Lemonade
How to use MoYoYoTech/VoiceDialogue with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull MoYoYoTech/VoiceDialogue:Q6_K
Run and chat with the model
lemonade run user.VoiceDialogue-Q6_K
List all available models
lemonade list
feat: Implement MCP integration for tool discovery and execution
#1
by heyong4725 - opened
- README.md +20 -52
- assets/www/assets/{index-CGlMbARk.js → index-ByqsFGbw.js} +2 -2
- assets/www/assets/{index-BPAUWo8W.css → index-CCuJ1lip.css} +1 -1
- assets/www/index.html +2 -2
- build/pyinstaller/hooks/hook-voice_dialogue.py +1 -24
- electron-app/main.js +2 -2
- frontend/src/App.vue +2 -13
- frontend/src/assets/ball.json +2 -2
- frontend/src/config/client_config.ts +1 -1
- frontend/src/i18n/index.ts +0 -35
- frontend/src/i18n/locales/en.ts +0 -74
- frontend/src/i18n/locales/zh.ts +0 -74
- frontend/src/main.ts +0 -2
- frontend/src/stores/config.ts +0 -3
- frontend/src/style.scss +0 -65
- frontend/src/views/Home/Components/ChatText.vue +7 -15
- frontend/src/views/Home/index.vue +1 -12
- frontend/src/views/Welcome/Components/SettingsModal.vue +0 -581
- frontend/src/views/Welcome/index.vue +418 -72
- main.py +1 -16
- pyproject.toml +4 -5
- scripts/convert_tts_weights_to_safetensors.py +0 -47
- src/voice_dialogue/api/app.py +1 -2
- src/voice_dialogue/api/core/lifespan.py +2 -2
- src/voice_dialogue/api/core/service_factories.py +4 -11
- src/voice_dialogue/api/routes/system_routes.py +5 -107
- src/voice_dialogue/api/schemas/system_schemas.py +2 -43
- src/voice_dialogue/asr/manager.py +5 -24
- src/voice_dialogue/asr/models/__init__.py +0 -9
- src/voice_dialogue/asr/models/qwen.py +0 -76
- src/voice_dialogue/audio/capture/__init__.py +7 -53
- src/voice_dialogue/audio/capture/pyaudio_capture.py +10 -104
- src/voice_dialogue/audio/devices.py +0 -167
- src/voice_dialogue/audio/player.py +1 -69
- src/voice_dialogue/cli/args.py +0 -14
- src/voice_dialogue/config/audio_config.py +0 -77
- src/voice_dialogue/config/paths.py +0 -1
- src/voice_dialogue/core/launcher.py +4 -8
- src/voice_dialogue/services/asr_service.py +1 -1
- src/voice_dialogue/services/audio_player_service.py +1 -3
- src/voice_dialogue/tts/runtime/moyoyo.py +0 -3
- src/voice_dialogue/tts/weights_migration.py +0 -45
- uv.lock +0 -0
README.md
CHANGED
@@ -26,7 +26,7 @@ library_name: transformers
26
27
28
29 - , a large language model (LLM), and text-to-speech (TTS) real-time voice dialogue system
32
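@@ -38,9 +38,8 @@ library_name: transformers
38
39   VoiceDialogue is a complete Python-based voice dialogue system that delivers an end-to-end voice interaction experience. The system is modular in design and is real-time, highly accurate, and multi-role.
40
41 - -
42 - -
43 - - 🤖 **Intelligent dialogue generation**: integrates large language models such as Qwen3
44   - 🔊 **High-quality speech synthesis**: supports multi-role, multi-style voice output
45   - 🌐 **Web API service**: provides an HTTP interface for easy integration
46   - ⚡ **Low-latency processing**: optimized audio stream processing pipeline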
@@ -49,78 +48,47 @@ VoiceDialogue is a complete Python-based voice dialogue system that delivers end
49
50   ## 🚀 Quick Start
51
52 -
53 - > Currently only **macOS (Apple Silicon)** is supported.
54 -
55 - ### 1. Clone and install
56 -
57 - > **The models come in two parts**:
58 - > - **Downloaded with the repository (about 12 GB, Git LFS)**: the large language model, speech synthesis, reference voices, and so on.
59 - > - **Downloaded automatically on first launch (about 4.4 GB)**: the speech recognition engine **Qwen3-ASR**, which the program pulls from HuggingFace on first run and caches in `~/.cache/huggingface`; it is not downloaded again afterwards.
60 - >
61 - > ⚠️ **You must install [Git LFS](https://git-lfs.com) first**; otherwise the cloned models are only placeholder pointers a few hundred bytes in size, and the application cannot start.
62
63   ```bash
64 - #
65 - brew install git-lfs # if Homebrew is not installed, see https://git-lfs.com
66 - git lfs install
67 -
68 - # 2) Clone the project (includes about 12 GB of models; the download is large, so please be patient)
69   git clone https://huggingface.co/MoYoYoTech/VoiceDialogue
70   cd VoiceDialogue
71
72 - #
73 - # If the size shown is very small, Git LFS did not take effect; run: git lfs pull
74 - ls -lh assets/models/llm/qwen/Qwen3-8B-Q6_K.gguf
75 -
76 - # 4) Install dependencies (uv recommended)
77   pip install uv
78   uv venv
79   source .venv/bin/activate
80
81   WHISPER_COREML=1 CMAKE_ARGS="-DGGML_METAL=on" uv sync
82
83 - #
84 -
85 - uv pip install
86   ```
87
88   > 📖 Need more detailed steps? See the [Installation Guide](docs/installation.md), which covers system requirements and common issues.
89
90 - ### 2.
91 -
92 - ```bash
93 - python main.py --mode api
94 - ```
95 -
96 - After startup, open in your browser: **http://localhost:8000/app/**
97
98 -
99 -
100 - - Click **⚙️ Settings** in the lower-right corner to choose the **microphone, echo cancellation, recognition language, and voice**; you can also switch the **interface language between Chinese and English**;
101 - - Click **"Start Conversation"** to talk with the AI by voice in real time; **subtitles are displayed live**.
102 -
103 - > **A slow first launch is normal**: the program automatically downloads the Qwen3-ASR model (about 4.4 GB; requires a network connection; download progress is printed in the terminal) and converts the TTS weight format once. It is ready only after all of this completes, which takes a few minutes in total (depending on network speed); each later launch takes only tens of seconds.
104 - > If the terminal stays on the download step for a long time, check whether your network can reach `huggingface.co`.
105 -
106 - ### 3. Command-line mode (CLI)
107 -
108 - If you do not need a graphical interface, you can also run the voice dialogue directly in the terminal:
109
110   ```bash
111 - # Start the voice dialogue
112   python main.py
113
114 - # Specify the language
115   python main.py --language en --speaker Heart
116
117 - #
118 - python main.py --list-audio-devices
119
120 -
121 -
122   ```
123 -
124   > For detailed usage, see the [Configuration Guide](docs/configuration.md) and the [API Service Guide](docs/api-guide.md).
125
126   ## 📚 Documentation
26
27
28
29 +
30
31   A real-time voice dialogue system that integrates speech recognition (ASR), a large language model (LLM), and text-to-speech (TTS)
32

38
39   VoiceDialogue is a complete Python-based voice dialogue system that delivers an end-to-end voice interaction experience. The system is modular in design and is real-time, highly accurate, and multi-role.
40
41 + - 🎤 **Real-time speech recognition**: high-accuracy Chinese and English transcription
42 + - 🤖 **Intelligent dialogue generation**: integrates large language models such as Qwen2.5
43   - 🔊 **High-quality speech synthesis**: supports multi-role, multi-style voice output
44   - 🌐 **Web API service**: provides an HTTP interface for easy integration
45   - ⚡ **Low-latency processing**: optimized audio stream processing pipeline

48
49   ## 🚀 Quick Start
50
51 + ### 1. Install
52
53   ```bash
54 + # Clone the project
55   git clone https://huggingface.co/MoYoYoTech/VoiceDialogue
56   cd VoiceDialogue
57
58 + # Install dependencies (uv recommended)
59   pip install uv
60   uv venv
61   source .venv/bin/activate
62
63   WHISPER_COREML=1 CMAKE_ARGS="-DGGML_METAL=on" uv sync
64
65 + # Install extra dependencies
66 + ## 1. Install kokoro-onnx
67 + uv pip install kokoro-onnx
68 + ## 2. Reinstall a pinned version of numpy
69 + uv pip install numpy==1.26.4
70   ```
71
72   > 📖 Need more detailed steps? See the [Installation Guide](docs/installation.md), which covers system requirements and common issues.
73
74 + ### 2. Run
75
76 + #### Command-line mode (CLI)
77
78   ```bash
79 + # Start the voice dialogue (Chinese by default)
80   python main.py
81
82 + # Start with a specific language and speaker
83   python main.py --language en --speaker Heart
84 + ```
85
86 + #### API service mode
87
88 + ```bash
89 + # Start the API server
90 + python main.py --mode api
91   ```
92   > For detailed usage, see the [Configuration Guide](docs/configuration.md) and the [API Service Guide](docs/api-guide.md).
93
94   ## 📚 Documentation
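The README commands use three flags: `--mode`, `--language`, and `--speaker`. The following is a hedged `argparse` sketch of how such flags could be declared; it is not the project's actual `src/voice_dialogue/cli/args.py`. The `cli` mode value and the `zh` language code are assumptions (the README only shows `--mode api` and `--language en`, and states that Chinese is the default).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the three flags that appear in the README examples."""
    parser = argparse.ArgumentParser(prog="main.py")
    # Assumption: "cli" names the default terminal mode; only "api" is shown in the README.
    parser.add_argument("--mode", choices=["cli", "api"], default="cli")
    # Assumption: "zh" is the code for the default Chinese setting.
    parser.add_argument("--language", choices=["zh", "en"], default="zh")
    # Speaker name, e.g. "Heart" in the README example; None defers to a default voice.
    parser.add_argument("--speaker", default=None)
    return parser


# Equivalent to: python main.py --language en --speaker Heart
args = build_parser().parse_args(["--language", "en", "--speaker", "Heart"])
```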
assets/www/assets/{index-CGlMbARk.js → index-ByqsFGbw.js}
RENAMED
@@ -1,3 +1,3 @@
1   version https://git-lfs.github.com/spec/v1
2 - oid sha256:
3 - size
1   version https://git-lfs.github.com/spec/v1
2 + oid sha256:215e0b4a6eee243715941860012a0d3bbee778f8880df45b0ddc8b090993405b
3 + size 2215701
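The three lines in this diff are a Git LFS pointer: a small text stub that stands in for the real file until `git lfs pull` fetches it. The README warns that without Git LFS the cloned model files are only such stubs. The following is a minimal sketch for detecting them; `parse_lfs_pointer` and `is_lfs_pointer` are hypothetical helper names, and the check relies only on the `version` header line shown above.

```python
from pathlib import Path
from typing import Optional

LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(data: bytes) -> Optional[dict]:
    """Return the pointer fields if data is a Git LFS pointer, else None."""
    if not data.startswith(LFS_HEADER):
        return None
    fields = {}
    for line in data.decode("utf-8").splitlines():
        # Each pointer line is "<key> <value>", e.g. "size 2215701".
        key, _, value = line.partition(" ")
        if key:
            fields[key] = value
    return fields


def is_lfs_pointer(path: Path) -> bool:
    """True if the file on disk is still a pointer stub rather than real content."""
    with path.open("rb") as fh:
        head = fh.read(1024)  # pointer files are far smaller than this
    return parse_lfs_pointer(head) is not None


pointer = parse_lfs_pointer(
    b"version https://git-lfs.github.com/spec/v1\n"
    b"oid sha256:215e0b4a6eee243715941860012a0d3bbee778f8880df45b0ddc8b090993405b\n"
    b"size 2215701\n"
)
```

Running `is_lfs_pointer(Path("assets/models/llm/qwen/Qwen3-8B-Q6_K.gguf"))` after cloning would indicate whether `git lfs pull` is still needed.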
assets/www/assets/{index-BPAUWo8W.css → index-CCuJ1lip.css}
RENAMED
@@ -1 +1 @@
1 -
@charset "UTF-8";html,body{width:100%;height:100%}input::-ms-clear,input::-ms-reveal{display:none}*,*:before,*:after{box-sizing:border-box}html{font-family:sans-serif;line-height:1.15;-webkit-text-size-adjust:100%;-ms-text-size-adjust:100%;-ms-overflow-style:scrollbar;-webkit-tap-highlight-color:rgba(0,0,0,0)}body{margin:0}[tabindex="-1"]:focus{outline:none}hr{box-sizing:content-box;height:0;overflow:visible}h1,h2,h3,h4,h5,h6{margin-top:0;margin-bottom:.5em;font-weight:500}p{margin-top:0;margin-bottom:1em}abbr[title],abbr[data-original-title]{-webkit-text-decoration:underline dotted;text-decoration:underline;text-decoration:underline dotted;border-bottom:0;cursor:help}address{margin-bottom:1em;font-style:normal;line-height:inherit}input[type=text],input[type=password],input[type=number],textarea{-webkit-appearance:none}ol,ul,dl{margin-top:0;margin-bottom:1em}ol ol,ul ul,ol ul,ul ol{margin-bottom:0}dt{font-weight:500}dd{margin-bottom:.5em;margin-left:0}blockquote{margin:0 0 1em}dfn{font-style:italic}b,strong{font-weight:bolder}small{font-size:80%}sub,sup{position:relative;font-size:75%;line-height:0;vertical-align:baseline}sub{bottom:-.25em}sup{top:-.5em}pre,code,kbd,samp{font-size:1em;font-family:SFMono-Regular,Consolas,Liberation Mono,Menlo,Courier,monospace}pre{margin-top:0;margin-bottom:1em;overflow:auto}figure{margin:0 0 1em}img{vertical-align:middle;border-style:none}a,area,button,[role=button],input:not([type=range]),label,select,summary,textarea{touch-action:manipulation}table{border-collapse:collapse}caption{padding-top:.75em;padding-bottom:.3em;text-align:left;caption-side:bottom}input,button,select,optgroup,textarea{margin:0;color:inherit;font-size:inherit;font-family:inherit;line-height:inherit}button,input{overflow:visible}button,select{text-transform:none}button,html 
[type=button],[type=reset],[type=submit]{-webkit-appearance:button}button::-moz-focus-inner,[type=button]::-moz-focus-inner,[type=reset]::-moz-focus-inner,[type=submit]::-moz-focus-inner{padding:0;border-style:none}input[type=radio],input[type=checkbox]{box-sizing:border-box;padding:0}input[type=date],input[type=time],input[type=datetime-local],input[type=month]{-webkit-appearance:listbox}textarea{overflow:auto;resize:vertical}fieldset{min-width:0;margin:0;padding:0;border:0}legend{display:block;width:100%;max-width:100%;margin-bottom:.5em;padding:0;color:inherit;font-size:1.5em;line-height:inherit;white-space:normal}progress{vertical-align:baseline}[type=number]::-webkit-inner-spin-button,[type=number]::-webkit-outer-spin-button{height:auto}[type=search]{outline-offset:-2px;-webkit-appearance:none}[type=search]::-webkit-search-cancel-button,[type=search]::-webkit-search-decoration{-webkit-appearance:none}::-webkit-file-upload-button{font:inherit;-webkit-appearance:button}output{display:inline-block}summary{display:list-item}template{display:none}[hidden]{display:none!important}mark{padding:.2em;background-color:#feffe6}:root{font-family:Inter,system-ui,Avenir,Helvetica,Arial,sans-serif;line-height:1.5;font-weight:400;color-scheme:light dark;color:#ffffffde;background-color:#242424;font-synthesis:none;text-rendering:optimizeLegibility;-webkit-font-smoothing:antialiased;-moz-osx-font-smoothing:grayscale;-webkit-text-size-adjust:100%}a{font-weight:500;color:#646cff;text-decoration:inherit}a:hover{color:#535bf2}body{margin:0;display:flex;place-items:center;min-width:320px;height:100%;min-height:auto;color:#333;background:#fff}h1{font-size:3.2em;line-height:1.1}button{border-radius:8px;border:1px solid transparent;padding:.6em 1.2em;font-size:1em;font-weight:500;font-family:inherit;background-color:#1a1a1a;cursor:pointer;transition:border-color .25s}.card{border-bottom:solid 2px 
lightgray;align-items:center;justify-content:center;margin-top:40px;display:flex;max-width:1024px;width:100%}.seg-title{margin:24px 0;font-size:20px;font-weight:500}.seg-co{width:1022px;text-align:left;border-left:solid 6px midnightblue;padding-left:8px;margin-left:2px;margin-top:36px;line-height:24px}#app{margin:0 auto;padding:0;text-align:center;width:100%;height:100%}.ant-btn{padding:4px 12px}@media (prefers-color-scheme: light){:root{color:#213547;background-color:#fff}a:hover{color:#747bff}button{background-color:#f9f9f9}}.ant-card{background:#f5f6fa;height:100%}.ant-card-body{padding:24px 36px 12px!important;border-radius:0 0 8px 8px}.ant-card .ant-card-actions{background-color:#e8e8f8cc!important}.ant-popover{max-width:800px!important}.ant-form-item{background:transparent;margin-bottom:40px!important}.ant-form-item .ant-form-item-explain-error{color:#ff4d4f;text-align:left!important}.ant-form-item-label label{font-size:18px!important;color:#1a1a1a!important;font-weight:500!important}.ant-tooltip{max-width:1022px!important}.ant-page-header-heading{width:1022px!important}.highlight{background:#f8f8ff}.ant-layout-sider-collapsed{width:0!important;min-width:0!important;overflow:hidden}.ant-layout-sider-collapsed .ant-menu-item,.ant-layout-sider-collapsed .ant-menu-submenu-title{display:none}.ant-modal .ant-modal-content{background:#ffffff9e!important;backdrop-filter:blur(28px) saturate(140%);-webkit-backdrop-filter:blur(28px) saturate(140%);border:1px solid rgba(255,255,255,.6);border-radius:22px!important;box-shadow:0 16px 48px #1f26872e}.ant-modal .ant-modal-header{background:transparent!important}.ant-modal-mask{background:#14161e1f!important;backdrop-filter:blur(14px) saturate(120%);-webkit-backdrop-filter:blur(14px) saturate(120%)}.ant-select .ant-select-selector,.ant-input,textarea.ant-input,.ant-input-affix-wrapper{background:#ffffff73!important;backdrop-filter:blur(8px);-webkit-backdrop-filter:blur(8px);border:1px solid 
rgba(255,255,255,.7)!important}.ant-btn:not(.ant-btn-text):not(.ant-btn-link){box-shadow:0 2px 10px #1f26871a}.ant-btn-default{background:#ffffff80!important;border:1px solid rgba(255,255,255,.75)!important;backdrop-filter:blur(8px);-webkit-backdrop-filter:blur(8px)}.ant-btn-text{box-shadow:none!important;background:transparent!important}.ant-radio-group-solid .ant-radio-button-wrapper:first-child{border-top-left-radius:12px;border-bottom-left-radius:12px}.ant-radio-group-solid .ant-radio-button-wrapper:last-child{border-top-right-radius:12px;border-bottom-right-radius:12px}.header-nav[data-v-07594418]{display:flex;align-items:flex-start;justify-content:space-between;width:100vw;height:40px;align-items:center;position:absolute;top:0;left:0;z-index:99;-webkit-app-region:drag;cursor:move}.header-nav .window-controls[data-v-07594418],.header-nav button[data-v-07594418],.header-nav .ant-input-search[data-v-07594418],.header-nav img[data-v-07594418],.header-nav .anticon[data-v-07594418]{-webkit-app-region:no-drag;cursor:pointer}.header-nav .window-controls[data-v-07594418]{top:0;right:0;display:flex;z-index:1000;margin-left:12px}.header-nav .window-controls .window-control-btn[data-v-07594418]{width:46px;height:32px;border:none;background:transparent;color:#666;font-size:16px;cursor:pointer;display:flex;align-items:center;justify-content:center;transition:background-color .2s}.header-nav .window-controls .window-control-btn[data-v-07594418]:hover{background-color:#0000001a}.header-nav .window-controls .window-control-btn.close[data-v-07594418]:hover{background-color:#e81123;color:#fff}.header-nav .window-controls .close-icon.focus[data-v-07594418]{display:none}.header-nav .window-controls:hover .close-icon.default[data-v-07594418],.header-nav .window-controls:focus-within .close-icon.default[data-v-07594418]{display:none}.header-nav .window-controls:hover .close-icon.focus[data-v-07594418],.header-nav .window-controls:focus-within 
.close-icon.focus[data-v-07594418]{display:inline}.content[data-v-b8a456cb]{background-color:#fff;margin:0 auto;display:flex;flex-direction:column;align-items:center;justify-content:space-between}.not-found-wrapper[data-v-aef52a59]{height:calc(100vh - 104px)}.tab-body[data-v-a48e843b]{height:360px;overflow-y:auto;padding:4px 8px 4px 2px}.setting-row[data-v-a48e843b]{margin-bottom:20px}.setting-row>label[data-v-a48e843b]{display:block;font-size:15px;font-weight:500;margin-bottom:8px}.setting-row>label .label-icon[data-v-a48e843b]{margin-right:6px;color:#1890ff}.setting-row .hint[data-v-a48e843b]{font-size:12px;color:#999;margin:8px 0 0}.setting-row .row-inline[data-v-a48e843b]{display:flex;align-items:center;justify-content:space-between}.voice-group[data-v-a48e843b]{display:flex;flex-direction:column;margin-top:8px}.about .about-head[data-v-a48e843b]{text-align:center;margin-bottom:24px}.about .about-head .about-name[data-v-a48e843b]{font-size:20px;font-weight:600}.about .about-head .about-ver[data-v-a48e843b]{font-size:13px;color:#888;margin-top:2px}.about .about-head .about-tagline[data-v-a48e843b]{font-size:12px;color:#999;margin-top:4px}.about .about-section[data-v-a48e843b]{margin-bottom:20px}.about .about-section .about-section-title[data-v-a48e843b]{font-size:13px;font-weight:600;color:#666;margin-bottom:10px}.about .about-item[data-v-a48e843b]{margin-bottom:12px}.about .about-item .about-item-label[data-v-a48e843b]{font-size:14px;font-weight:500}.about .about-item .about-item-desc[data-v-a48e843b]{font-size:12px;color:#777;margin-top:2px;line-height:1.6}.about .about-item .about-item-desc a[data-v-a48e843b]{margin-left:6px}.about a[data-v-a48e843b]{color:#1677ff;text-decoration:none}.about a[data-v-a48e843b]:hover{text-decoration:underline}.about .about-link[data-v-a48e843b]{font-size:13px;word-break:break-all}.about 
.about-copyright[data-v-a48e843b]{margin-top:16px;font-size:11px;color:#aaa;text-align:center}.voice-radio[data-v-a48e843b]{display:flex;align-items:center;height:40px;line-height:40px}.voice-radio .voice-name[data-v-a48e843b]{margin-right:8px}.audio-play-btn[data-v-a48e843b]{padding:0 6px;border-radius:4px}.audio-play-btn.playing[data-v-a48e843b]{background-color:#f6ffed}.asr-chip[data-v-ca2e1f17]{display:flex;align-items:center;gap:8px;height:38px;padding:0 18px;margin-right:16px;border-radius:19px;color:#000000a6;font-size:13px;background:#ffffff80;border:1px solid rgba(255,255,255,.7);backdrop-filter:blur(10px);-webkit-backdrop-filter:blur(10px);box-shadow:0 4px 16px #1f26871f}.settings-btn[data-v-ca2e1f17]{width:60px;height:60px;margin-right:24px;border-radius:50%!important;background:#ffffff80!important;border:1px solid rgba(255,255,255,.7)!important;backdrop-filter:blur(10px);-webkit-backdrop-filter:blur(10px);box-shadow:0 4px 16px #1f26871f;display:flex;align-items:center;justify-content:center}.welcome-wrapper[data-v-ca2e1f17]{width:100%;height:100%;background-image:url(./bg-BmnA8p_e.png);background-repeat:no-repeat;background-attachment:fixed;background-size:cover;background-position:center;display:flex;flex-direction:column;align-items:center;justify-content:space-between;color:#fff}.welcome-wrapper .content[data-v-ca2e1f17]{width:100%;height:80vh;display:flex;flex-direction:column;justify-content:space-around;margin-top:64px}.welcome-wrapper .content .inner-content[data-v-ca2e1f17]{display:flex;flex-direction:column;align-items:center;justify-content:center;text-align:center;padding:20px}.welcome-wrapper .content .inner-content .text-box[data-v-ca2e1f17]{color:#000;margin-bottom:36px}.welcome-wrapper .content .inner-content .text-box .title[data-v-ca2e1f17]{font-size:24px;font-weight:600;margin-bottom:24px}.welcome-wrapper .content .inner-content .text-box .sub-title[data-v-ca2e1f17]{font-size:15px;margin-top:10px}.welcome-wrapper .content 
.inner-content .btn-box[data-v-ca2e1f17]{width:224px;height:80px}.welcome-wrapper .actions[data-v-ca2e1f17]{width:100%;height:100px;margin-bottom:32px;display:flex;align-items:center;justify-content:flex-end}.ball-wrapper[data-v-34c8e583]{width:100%;height:calc(100vh - 100px);display:flex;flex-direction:column;align-items:center;justify-content:space-around}.talk-wrapper[data-v-05da84ae]{width:auto;width:100%;max-width:1000px;margin:0 auto;box-sizing:border-box;height:calc(100vh - 150px);overflow-y:auto;padding:20px 32px 0;display:flex;flex-direction:column;align-items:flex-start;justify-content:flex-start}.talk-wrapper .cont-left[data-v-05da84ae]{width:100%;margin:24px 0;display:flex;justify-content:flex-start;align-items:flex-start}.talk-wrapper .cont-left .text-left[data-v-05da84ae]{max-width:88%;color:#222;font-size:16px;font-weight:400;text-align:left;line-height:1.8;margin-left:12px;margin-top:6px;word-break:break-word}.talk-wrapper .cont-right[data-v-05da84ae]{width:100%;margin:24px 0;display:flex;justify-content:flex-end;align-items:flex-start}.talk-wrapper .cont-right .text-right[data-v-05da84ae]{max-width:80%;color:#444;font-size:16px;font-weight:400;text-align:start;line-height:1.8;margin-right:12px;background:#ccc;border-radius:8px 0 8px 8px;padding:8px 12px;word-break:break-word}.chat-wrapper[data-v-8b035bf4]{width:100%;height:100%;background-image:url(./bg-BmnA8p_e.png);background-repeat:no-repeat;background-attachment:fixed;background-size:cover;background-position:center;display:flex;flex-direction:column;align-items:center;justify-content:space-between;color:#fff}.chat-wrapper .content[data-v-8b035bf4]{width:100%;height:auto;display:flex;flex-direction:column;justify-content:space-around}.chat-wrapper .content .inner-content[data-v-8b035bf4]{display:flex;flex-direction:column;align-items:center;justify-content:center;text-align:center;padding:20px}.chat-wrapper .content .inner-content 
.text-box[data-v-8b035bf4]{color:#000;margin-bottom:36px}.chat-wrapper .content .inner-content .text-box .title[data-v-8b035bf4]{font-size:24px;font-weight:600;margin-bottom:24px}.chat-wrapper .content .inner-content .text-box .sub-title[data-v-8b035bf4]{font-size:15px;margin-top:10px}.chat-wrapper .content .inner-content .btn-box[data-v-8b035bf4]{width:224px;height:80px}.chat-wrapper .actions[data-v-8b035bf4]{width:100%;height:100px;margin-bottom:32px;display:flex;justify-content:space-between;align-items:center}.chat-wrapper .actions .holder[data-v-8b035bf4]{width:64px;height:48px}.chat-wrapper .actions .btns[data-v-8b035bf4]{width:450px;height:96px;display:flex;justify-content:space-around;align-items:center}.chat-wrapper .actions .btns[data-v-8b035bf4] .ant-btn{border-radius:50%!important;background:#ffffff80!important;border:1px solid rgba(255,255,255,.7)!important;backdrop-filter:blur(10px);-webkit-backdrop-filter:blur(10px);box-shadow:0 4px 16px #1f26871f}.chat-wrapper .actions .download-wrapper[data-v-8b035bf4]{width:64px;height:64px;display:flex;justify-content:flex-start;align-items:center;margin-right:0}.chat-wrapper .actions .download-wrapper img[data-v-8b035bf4]{width:24px;height:24px}.content-wrapper[data-v-d41c9ce7]{text-align:left;max-width:800px;min-width:320px;margin-bottom:64px;min-height:calc(100vh - 438px)}.content-wrapper .content-box[data-v-d41c9ce7]{padding:24px;height:240px;background-color:#e8e8e8;border-radius:16px;width:50%;margin:48px auto;min-width:300px}.content-wrapper .video-box[data-v-d41c9ce7]{max-width:800px;min-width:320px;width:90vw;height:auto}
1 +
@charset "UTF-8";html,body{width:100%;height:100%}input::-ms-clear,input::-ms-reveal{display:none}*,*:before,*:after{box-sizing:border-box}html{font-family:sans-serif;line-height:1.15;-webkit-text-size-adjust:100%;-ms-text-size-adjust:100%;-ms-overflow-style:scrollbar;-webkit-tap-highlight-color:rgba(0,0,0,0)}body{margin:0}[tabindex="-1"]:focus{outline:none}hr{box-sizing:content-box;height:0;overflow:visible}h1,h2,h3,h4,h5,h6{margin-top:0;margin-bottom:.5em;font-weight:500}p{margin-top:0;margin-bottom:1em}abbr[title],abbr[data-original-title]{-webkit-text-decoration:underline dotted;text-decoration:underline;text-decoration:underline dotted;border-bottom:0;cursor:help}address{margin-bottom:1em;font-style:normal;line-height:inherit}input[type=text],input[type=password],input[type=number],textarea{-webkit-appearance:none}ol,ul,dl{margin-top:0;margin-bottom:1em}ol ol,ul ul,ol ul,ul ol{margin-bottom:0}dt{font-weight:500}dd{margin-bottom:.5em;margin-left:0}blockquote{margin:0 0 1em}dfn{font-style:italic}b,strong{font-weight:bolder}small{font-size:80%}sub,sup{position:relative;font-size:75%;line-height:0;vertical-align:baseline}sub{bottom:-.25em}sup{top:-.5em}pre,code,kbd,samp{font-size:1em;font-family:SFMono-Regular,Consolas,Liberation Mono,Menlo,Courier,monospace}pre{margin-top:0;margin-bottom:1em;overflow:auto}figure{margin:0 0 1em}img{vertical-align:middle;border-style:none}a,area,button,[role=button],input:not([type=range]),label,select,summary,textarea{touch-action:manipulation}table{border-collapse:collapse}caption{padding-top:.75em;padding-bottom:.3em;text-align:left;caption-side:bottom}input,button,select,optgroup,textarea{margin:0;color:inherit;font-size:inherit;font-family:inherit;line-height:inherit}button,input{overflow:visible}button,select{text-transform:none}button,html 
[type=button],[type=reset],[type=submit]{-webkit-appearance:button}button::-moz-focus-inner,[type=button]::-moz-focus-inner,[type=reset]::-moz-focus-inner,[type=submit]::-moz-focus-inner{padding:0;border-style:none}input[type=radio],input[type=checkbox]{box-sizing:border-box;padding:0}input[type=date],input[type=time],input[type=datetime-local],input[type=month]{-webkit-appearance:listbox}textarea{overflow:auto;resize:vertical}fieldset{min-width:0;margin:0;padding:0;border:0}legend{display:block;width:100%;max-width:100%;margin-bottom:.5em;padding:0;color:inherit;font-size:1.5em;line-height:inherit;white-space:normal}progress{vertical-align:baseline}[type=number]::-webkit-inner-spin-button,[type=number]::-webkit-outer-spin-button{height:auto}[type=search]{outline-offset:-2px;-webkit-appearance:none}[type=search]::-webkit-search-cancel-button,[type=search]::-webkit-search-decoration{-webkit-appearance:none}::-webkit-file-upload-button{font:inherit;-webkit-appearance:button}output{display:inline-block}summary{display:list-item}template{display:none}[hidden]{display:none!important}mark{padding:.2em;background-color:#feffe6}:root{font-family:Inter,system-ui,Avenir,Helvetica,Arial,sans-serif;line-height:1.5;font-weight:400;color-scheme:light dark;color:#ffffffde;background-color:#242424;font-synthesis:none;text-rendering:optimizeLegibility;-webkit-font-smoothing:antialiased;-moz-osx-font-smoothing:grayscale;-webkit-text-size-adjust:100%}a{font-weight:500;color:#646cff;text-decoration:inherit}a:hover{color:#535bf2}body{margin:0;display:flex;place-items:center;min-width:320px;height:100%;min-height:auto;color:#333;background:#fff}h1{font-size:3.2em;line-height:1.1}button{border-radius:8px;border:1px solid transparent;padding:.6em 1.2em;font-size:1em;font-weight:500;font-family:inherit;background-color:#1a1a1a;cursor:pointer;transition:border-color .25s}.card{border-bottom:solid 2px 
lightgray;align-items:center;justify-content:center;margin-top:40px;display:flex;max-width:1024px;width:100%}.seg-title{margin:24px 0;font-size:20px;font-weight:500}.seg-co{width:1022px;text-align:left;border-left:solid 6px midnightblue;padding-left:8px;margin-left:2px;margin-top:36px;line-height:24px}#app{margin:0 auto;padding:0;text-align:center;width:100%;height:100%}.ant-btn{padding:4px 12px}@media (prefers-color-scheme: light){:root{color:#213547;background-color:#fff}a:hover{color:#747bff}button{background-color:#f9f9f9}}.ant-card{background:#f5f6fa;height:100%}.ant-card-body{padding:24px 36px 12px!important;border-radius:0 0 8px 8px}.ant-card .ant-card-actions{background-color:#e8e8f8cc!important}.ant-popover{max-width:800px!important}.ant-form-item{background:transparent;margin-bottom:40px!important}.ant-form-item .ant-form-item-explain-error{color:#ff4d4f;text-align:left!important}.ant-form-item-label label{font-size:18px!important;color:#1a1a1a!important;font-weight:500!important}.ant-tooltip{max-width:1022px!important}.ant-page-header-heading{width:1022px!important}.highlight{background:#f8f8ff}.ant-layout-sider-collapsed{width:0!important;min-width:0!important;overflow:hidden}.ant-layout-sider-collapsed .ant-menu-item,.ant-layout-sider-collapsed .ant-menu-submenu-title{display:none}.header-nav[data-v-07594418]{display:flex;align-items:flex-start;justify-content:space-between;width:100vw;height:40px;align-items:center;position:absolute;top:0;left:0;z-index:99;-webkit-app-region:drag;cursor:move}.header-nav .window-controls[data-v-07594418],.header-nav button[data-v-07594418],.header-nav .ant-input-search[data-v-07594418],.header-nav img[data-v-07594418],.header-nav .anticon[data-v-07594418]{-webkit-app-region:no-drag;cursor:pointer}.header-nav .window-controls[data-v-07594418]{top:0;right:0;display:flex;z-index:1000;margin-left:12px}.header-nav .window-controls 
.window-control-btn[data-v-07594418]{width:46px;height:32px;border:none;background:transparent;color:#666;font-size:16px;cursor:pointer;display:flex;align-items:center;justify-content:center;transition:background-color .2s}.header-nav .window-controls .window-control-btn[data-v-07594418]:hover{background-color:#0000001a}.header-nav .window-controls .window-control-btn.close[data-v-07594418]:hover{background-color:#e81123;color:#fff}.header-nav .window-controls .close-icon.focus[data-v-07594418]{display:none}.header-nav .window-controls:hover .close-icon.default[data-v-07594418],.header-nav .window-controls:focus-within .close-icon.default[data-v-07594418]{display:none}.header-nav .window-controls:hover .close-icon.focus[data-v-07594418],.header-nav .window-controls:focus-within .close-icon.focus[data-v-07594418]{display:inline}.content[data-v-874ca48f]{background-color:#fff;margin:0 auto;display:flex;flex-direction:column;align-items:center;justify-content:space-between}.not-found-wrapper[data-v-aef52a59]{height:calc(100vh - 104px)}.btn-groups[data-v-839398ff]{margin-top:36px;display:flex;justify-content:flex-end;align-items:center}.prompt-title p[data-v-839398ff]{margin:0;font-size:16px;font-weight:500}.prompt-content[data-v-839398ff]{margin-top:16px}.prompt-content .prompt-title[data-v-839398ff]{margin-bottom:24px;font-size:22px;font-weight:500;text-align:center}.prompt-content .language-segment[data-v-839398ff]{display:flex;justify-content:center;margin-bottom:16px}.prompt-content .prompt-item[data-v-839398ff]{margin-top:16px}.languages[data-v-cd713caa]{margin-top:24px;margin-bottom:24px}.languages p[data-v-cd713caa]{font-size:16px;font-weight:500;margin-bottom:8px}.audio-play-btn[data-v-cd713caa]{padding:2px 8px 0;border-radius:4px;transition:all .2s;height:40px}.audio-play-btn[data-v-cd713caa]:hover{background-color:#f0f0f0}.audio-play-btn.playing[data-v-cd713caa]{background-color:#f6ffed;border-color:#1890ff}.audio-play-btn.playing 
.playing-icon[data-v-cd713caa]{animation:pulse-cd713caa 1.5s infinite}@keyframes pulse-cd713caa{0%{opacity:1;transform:scale(1)}50%{opacity:.7;transform:scale(1.1)}to{opacity:1;transform:scale(1)}}.btn-groups[data-v-cd713caa]{margin-top:36px;display:flex;justify-content:space-between;align-items:center}.custom-popover-list[data-v-cd713caa]{width:92px;margin:0}.custom-popover-list .custom-popover-item[data-v-cd713caa]{font-size:14px;line-height:36px;font-weight:500;color:#1e1e1e;cursor:pointer;border-radius:4px;padding:0 8px;margin:0 -8px;transition:background .2s}.custom-popover-list .custom-popover-item[data-v-cd713caa]:hover,.custom-popover-list .custom-popover-item[data-v-cd713caa]:focus{background:#e5e7eb}.welcome-wrapper[data-v-cd713caa]{width:100%;height:100%;background-image:url(./bg-BmnA8p_e.png);background-repeat:no-repeat;background-attachment:fixed;background-size:cover;background-position:center;display:flex;flex-direction:column;align-items:center;justify-content:space-between;color:#fff}.welcome-wrapper .content[data-v-cd713caa]{width:100%;height:80vh;display:flex;flex-direction:column;justify-content:space-around;margin-top:64px}.welcome-wrapper .content .inner-content[data-v-cd713caa]{display:flex;flex-direction:column;align-items:center;justify-content:center;text-align:center;padding:20px}.welcome-wrapper .content .inner-content .text-box[data-v-cd713caa]{color:#000;margin-bottom:36px}.welcome-wrapper .content .inner-content .text-box .title[data-v-cd713caa]{font-size:24px;font-weight:600;margin-bottom:24px}.welcome-wrapper .content .inner-content .text-box .sub-title[data-v-cd713caa]{font-size:15px;margin-top:10px}.welcome-wrapper .content .inner-content .btn-box[data-v-cd713caa]{width:224px;height:80px}.welcome-wrapper .actions[data-v-cd713caa]{width:100%;height:64px;display:flex;justify-content:flex-end}.ball-wrapper[data-v-34c8e583]{width:100%;height:calc(100vh - 
100px);display:flex;flex-direction:column;align-items:center;justify-content:space-around}.talk-wrapper[data-v-1f502814]{width:auto;height:calc(100vh - 100px);overflow-y:scroll;padding:20px 240px 0;display:flex;flex-direction:column;align-items:flex-start;justify-content:flex-start}.talk-wrapper .cont-left[data-v-1f502814]{width:100%;margin:24px 0;display:flex;justify-content:flex-start;align-items:flex-start}.talk-wrapper .cont-left .text-left[data-v-1f502814]{color:#222;font-size:16px;font-weight:400;text-align:left;line-height:2;margin-left:12px;margin-top:6px}.talk-wrapper .cont-right[data-v-1f502814]{width:100%;margin:24px 0;display:flex;justify-content:flex-end;align-items:flex-start}.talk-wrapper .cont-right .text-right[data-v-1f502814]{color:#444;font-size:16px;font-weight:400;text-align:end;line-height:2;margin-right:12px;background:#ccc;border-radius:8px 0 8px 8px;padding:8px}.chat-wrapper[data-v-803600aa]{width:100%;height:100%;background-image:url(./bg-BmnA8p_e.png);background-repeat:no-repeat;background-attachment:fixed;background-size:cover;background-position:center;display:flex;flex-direction:column;align-items:center;justify-content:space-between;color:#fff}.chat-wrapper .content[data-v-803600aa]{width:100%;height:auto;display:flex;flex-direction:column;justify-content:space-around}.chat-wrapper .content .inner-content[data-v-803600aa]{display:flex;flex-direction:column;align-items:center;justify-content:center;text-align:center;padding:20px}.chat-wrapper .content .inner-content .text-box[data-v-803600aa]{color:#000;margin-bottom:36px}.chat-wrapper .content .inner-content .text-box .title[data-v-803600aa]{font-size:24px;font-weight:600;margin-bottom:24px}.chat-wrapper .content .inner-content .text-box .sub-title[data-v-803600aa]{font-size:15px;margin-top:10px}.chat-wrapper .content .inner-content .btn-box[data-v-803600aa]{width:224px;height:80px}.chat-wrapper 
.actions[data-v-803600aa]{width:100%;height:100px;display:flex;justify-content:space-between;align-items:center}.chat-wrapper .actions .holder[data-v-803600aa]{width:64px;height:48px}.chat-wrapper .actions .btns[data-v-803600aa]{width:450px;height:96px;display:flex;justify-content:space-around;align-items:flex-start}.chat-wrapper .actions .download-wrapper[data-v-803600aa]{width:64px;height:64px;display:flex;justify-content:flex-start;align-items:center;margin-right:0}.chat-wrapper .actions .download-wrapper img[data-v-803600aa]{width:24px;height:24px}.content-wrapper[data-v-d41c9ce7]{text-align:left;max-width:800px;min-width:320px;margin-bottom:64px;min-height:calc(100vh - 438px)}.content-wrapper .content-box[data-v-d41c9ce7]{padding:24px;height:240px;background-color:#e8e8e8;border-radius:16px;width:50%;margin:48px auto;min-width:300px}.content-wrapper .video-box[data-v-d41c9ce7]{max-width:800px;min-width:320px;width:90vw;height:auto}
|
assets/www/index.html
CHANGED
|
@@ -5,8 +5,8 @@
|
|
| 5 |
<link rel="icon" type="image/svg+xml" href="./favicon.ico" />
|
| 6 |
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
| 7 |
<title>VoiceDialogue</title>
|
| 8 |
-
<script type="module" crossorigin src="./assets/index-
|
| 9 |
-
<link rel="stylesheet" crossorigin href="./assets/index-
|
| 10 |
</head>
|
| 11 |
<body>
|
| 12 |
<div id="app"></div>
|
|
|
|
| 5 |
<link rel="icon" type="image/svg+xml" href="./favicon.ico" />
|
| 6 |
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
| 7 |
<title>VoiceDialogue</title>
|
| 8 |
+
<script type="module" crossorigin src="./assets/index-ByqsFGbw.js"></script>
|
| 9 |
+
<link rel="stylesheet" crossorigin href="./assets/index-CCuJ1lip.css">
|
| 10 |
</head>
|
| 11 |
<body>
|
| 12 |
<div id="app"></div>
|
build/pyinstaller/hooks/hook-voice_dialogue.py
CHANGED
|
@@ -24,29 +24,8 @@ ASSETS_ROOT = PROJECT_ROOT / "assets"
|
|
| 24 |
# Collect all submodules of the main module
|
| 25 |
hiddenimports = collect_submodules('voice_dialogue')
|
| 26 |
datas = collect_data_files('moyoyo_tts', include_py_files=True)
|
| 27 |
-
|
| 28 |
-
# Assets excluded from the bundle:
|
| 29 |
-
# - Legacy FunASR/Whisper models (the default engine is the built-in Qwen3-ASR)
|
| 30 |
-
# - .bin files of the TTS pretrained weights (an equivalent model.safetensors is already bundled)
|
| 31 |
-
EXCLUDED_ASSET_PATTERNS = [
|
| 32 |
-
"assets/models/asr/funasr/",
|
| 33 |
-
"assets/models/asr/whisper/",
|
| 34 |
-
"chinese-roberta-wwm-ext-large/pytorch_model.bin",
|
| 35 |
-
"chinese-hubert-base/pytorch_model.bin",
|
| 36 |
-
]
|
| 37 |
-
|
| 38 |
-
|
| 39 |
-
def _is_excluded(source_path: str) -> bool:
|
| 40 |
-
normalized = source_path.replace("\\", "/")
|
| 41 |
-
return any(pattern in normalized for pattern in EXCLUDED_ASSET_PATTERNS)
|
| 42 |
-
|
| 43 |
-
|
| 44 |
# Collect system asset files
|
| 45 |
-
datas +=
|
| 46 |
-
(source, dest)
|
| 47 |
-
for source, dest in collect_system_data_files(ASSETS_ROOT.as_posix(), "assets")
|
| 48 |
-
if not _is_excluded(source)
|
| 49 |
-
]
|
| 50 |
|
| 51 |
# ============================================================================
|
| 52 |
# Third-party dependency configuration
|
|
@@ -60,7 +39,6 @@ ML_DEPENDENCIES = [
|
|
| 60 |
"pytorch_lightning",
|
| 61 |
"huggingface_hub",
|
| 62 |
"einops",
|
| 63 |
-
"qwen_asr",
|
| 64 |
]
|
| 65 |
|
| 66 |
# Speech-processing dependencies
|
|
@@ -139,7 +117,6 @@ DATA_PACKAGES = [
|
|
| 139 |
("spacy", {"include_py_files": True}),
|
| 140 |
("misaki", {}),
|
| 141 |
("silero_vad", {}),
|
| 142 |
-
("qwen_asr", {}),
|
| 143 |
]
|
| 144 |
|
| 145 |
# Collect data files
|
|
|
|
| 24 |
# Collect all submodules of the main module
|
| 25 |
hiddenimports = collect_submodules('voice_dialogue')
|
| 26 |
datas = collect_data_files('moyoyo_tts', include_py_files=True)
|
| 27 |
# Collect system asset files
|
| 28 |
+
datas += collect_system_data_files(ASSETS_ROOT.as_posix(), "assets")
|
| 29 |
|
| 30 |
# ============================================================================
|
| 31 |
# Third-party dependency configuration
|
|
|
|
| 39 |
"pytorch_lightning",
|
| 40 |
"huggingface_hub",
|
| 41 |
"einops",
|
|
|
|
| 42 |
]
|
| 43 |
|
| 44 |
# Speech-processing dependencies
|
|
|
|
| 117 |
("spacy", {"include_py_files": True}),
|
| 118 |
("misaki", {}),
|
| 119 |
("silero_vad", {}),
|
|
|
|
| 120 |
]
|
| 121 |
|
| 122 |
# Collect data files
|
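The lines removed from the hook above filtered the collected assets by substring match before adding them to `datas`. A minimal standalone sketch of that filter follows, assuming the collector returns `(source, dest)` tuples; the `filter_datas` helper and the sample paths `model.pt` and `medium.pt` are hypothetical and only stand in for the output of `collect_system_data_files`.

```python
from typing import Iterable, List, Tuple

# Patterns copied from the removed lines of the hook.
EXCLUDED_ASSET_PATTERNS = [
    "assets/models/asr/funasr/",
    "assets/models/asr/whisper/",
    "chinese-roberta-wwm-ext-large/pytorch_model.bin",
    "chinese-hubert-base/pytorch_model.bin",
]


def is_excluded(source_path: str) -> bool:
    """Return True if the path contains any excluded pattern."""
    # Normalize Windows separators so the patterns match on every platform.
    normalized = source_path.replace("\\", "/")
    return any(pattern in normalized for pattern in EXCLUDED_ASSET_PATTERNS)


def filter_datas(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep only the (source, dest) pairs whose source is not excluded."""
    return [(source, dest) for source, dest in pairs if not is_excluded(source)]


# Hypothetical sample standing in for the collector's output.
sample = [
    ("assets/models/asr/funasr/model.pt", "assets/models/asr/funasr"),
    ("assets\\models\\asr\\whisper\\medium.pt", "assets/models/asr/whisper"),
    ("assets/models/llm/qwen/Qwen3-8B-Q6_K.gguf", "assets/models/llm/qwen"),
]
kept = filter_datas(sample)
```

With the filter removed, as in the added line of this hunk, every file under the assets root is bundled, including the legacy ASR models and the duplicate `.bin` weights.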
electron-app/main.js
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
-
size
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:4b10113a08513b026f9207f21db221f368a1f242a821dd54d2d002354a6a2ec2
|
| 3 |
+
size 7039
|
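The file above is stored as a Git LFS pointer, so the diff shows only the pointer's `version`, `oid` and `size` fields rather than the JavaScript source. As an illustration only, the sketch below parses such a pointer in Python; the `parse_lfs_pointer` helper is hypothetical, and it assumes each line has the form `key value`, as in the pointer shown here.

```python
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Parse the `key value` lines of a Git LFS pointer file into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first space only; the value may contain ':' or '/'.
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Pointer contents copied from the added side of the diff above.
pointer_text = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:4b10113a08513b026f9207f21db221f368a1f242a821dd54d2d002354a6a2ec2\n"
    "size 7039\n"
)
pointer = parse_lfs_pointer(pointer_text)
algorithm, _, digest = pointer["oid"].partition(":")
size_bytes = int(pointer["size"])
```

Here `size_bytes` is the size of the real file held in LFS storage, not of the pointer itself.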
frontend/src/App.vue
CHANGED
|
@@ -1,8 +1,6 @@
|
|
| 1 |
<template>
|
| 2 |
-
<
|
| 3 |
-
|
| 4 |
-
<router-view class="content" />
|
| 5 |
-
</a-config-provider>
|
| 6 |
<!-- <Footer/> -->
|
| 7 |
|
| 8 |
<!-- <a-layout>
|
|
@@ -21,15 +19,6 @@
|
|
| 21 |
import Header from "@/views/Header.vue";
|
| 22 |
import Footer from "@/views/Footer.vue";
|
| 23 |
|
| 24 |
-
// Global theme: unified border radius and control height, matching the glassmorphism (Liquid Glass) style
|
| 25 |
-
const appTheme = {
|
| 26 |
-
token: {
|
| 27 |
-
colorPrimary: '#1677ff',
|
| 28 |
-
borderRadius: 14,
|
| 29 |
-
controlHeight: 38,
|
| 30 |
-
},
|
| 31 |
-
};
|
| 32 |
-
|
| 33 |
// import * as api from "@/client";
|
| 34 |
import { onBeforeMount, onMounted, watch, CSSProperties, ref} from "vue";
|
| 35 |
import {useSettingsStore} from "@/stores/config.ts";
|
|
|
|
| 1 |
<template>
|
| 2 |
+
<Header/>
|
| 3 |
+
<router-view class="content" />
|
| 4 |
<!-- <Footer/> -->
|
| 5 |
|
| 6 |
<!-- <a-layout>
|
|
|
|
| 19 |
import Header from "@/views/Header.vue";
|
| 20 |
import Footer from "@/views/Footer.vue";
|
| 21 |
|
| 22 |
// import * as api from "@/client";
|
| 23 |
import { onBeforeMount, onMounted, watch, CSSProperties, ref} from "vue";
|
| 24 |
import {useSettingsStore} from "@/stores/config.ts";
|
frontend/src/assets/ball.json
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
-
size
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:edd650ec984e26b5fde217f273e6758d0862fc856b5333e678fa0b578374e8b9
|
| 3 |
+
size 23084
|
frontend/src/config/client_config.ts
CHANGED
|
@@ -5,7 +5,7 @@ import router from "@/router";
|
|
| 5 |
|
| 6 |
const { wsCache } = useCache();
|
| 7 |
|
| 8 |
-
export const test_server = '127.0.0.1:
|
| 9 |
// export const test_server = '59.110.18.232:19001'
|
| 10 |
|
| 11 |
axios.defaults.baseURL = import.meta.env.PROD ? '/api/v1' : `http://${test_server}/api/v1`;
|
|
|
|
| 5 |
|
| 6 |
const { wsCache } = useCache();
|
| 7 |
|
| 8 |
+
export const test_server = '127.0.0.1:8848'
|
| 9 |
// export const test_server = '59.110.18.232:19001'
|
| 10 |
|
| 11 |
axios.defaults.baseURL = import.meta.env.PROD ? '/api/v1' : `http://${test_server}/api/v1`;
|
frontend/src/i18n/index.ts
DELETED
|
@@ -1,35 +0,0 @@
|
|
| 1 |
-
import { createI18n } from 'vue-i18n'
|
| 2 |
-
|
| 3 |
-
import en from './locales/en'
|
| 4 |
-
import zh from './locales/zh'
|
| 5 |
-
|
| 6 |
-
export type UiLocale = 'en' | 'zh'
|
| 7 |
-
|
| 8 |
-
// Read the UI language from the persisted pinia settings; defaults to English
|
| 9 |
-
function getInitialLocale(): UiLocale {
|
| 10 |
-
try {
|
| 11 |
-
const raw = localStorage.getItem('settings')
|
| 12 |
-
if (raw) {
|
| 13 |
-
const parsed = JSON.parse(raw)
|
| 14 |
-
const ui = parsed?.uiLanguage
|
| 15 |
-
if (ui === 'en' || ui === 'zh') return ui
|
| 16 |
-
}
|
| 17 |
-
} catch (e) {
|
| 18 |
-
// ignore
|
| 19 |
-
}
|
| 20 |
-
return 'zh'
|
| 21 |
-
}
|
| 22 |
-
|
| 23 |
-
const i18n = createI18n({
|
| 24 |
-
legacy: false,
|
| 25 |
-
globalInjection: true,
|
| 26 |
-
locale: getInitialLocale(),
|
| 27 |
-
fallbackLocale: 'en',
|
| 28 |
-
messages: { en, zh },
|
| 29 |
-
})
|
| 30 |
-
|
| 31 |
-
export function setUiLocale(locale: UiLocale) {
|
| 32 |
-
i18n.global.locale.value = locale
|
| 33 |
-
}
|
| 34 |
-
|
| 35 |
-
export default i18n
|
frontend/src/i18n/locales/en.ts
DELETED
|
@@ -1,74 +0,0 @@
|
|
| 1 |
-
export default {
|
| 2 |
-
common: {
|
| 3 |
-
cancel: 'Cancel',
|
| 4 |
-
confirm: 'Confirm',
|
| 5 |
-
reset: 'Reset',
|
| 6 |
-
save: 'Save',
|
| 7 |
-
error: 'Error',
|
| 8 |
-
},
|
| 9 |
-
lang: {
|
| 10 |
-
zh: 'Chinese',
|
| 11 |
-
en: 'English',
|
| 12 |
-
auto: 'Auto',
|
| 13 |
-
},
|
| 14 |
-
welcome: {
|
| 15 |
-
title: 'Welcome',
|
| 16 |
-
subtitle: 'Click the button below to start a conversation',
|
| 17 |
-
start: 'Start Conversation',
|
| 18 |
-
startFailed: 'Failed to start the voice dialogue system',
|
| 19 |
-
},
|
| 20 |
-
settings: {
|
| 21 |
-
title: 'Settings',
|
| 22 |
-
entry: 'Settings',
|
| 23 |
-
tabs: {
|
| 24 |
-
main: 'Main',
|
| 25 |
-
language: 'Language',
|
| 26 |
-
advanced: 'Prompt',
|
| 27 |
-
about: 'About',
|
| 28 |
-
},
|
| 29 |
-
about: {
|
| 30 |
-
tagline: 'A real-time AI voice dialogue system',
|
| 31 |
-
version: 'Version',
|
| 32 |
-
modelsTitle: 'Models',
|
| 33 |
-
llm: 'Language Model (LLM)',
|
| 34 |
-
llmDesc: 'Qwen3-8B (Q6_K, GGUF) · via llama.cpp',
|
| 35 |
-
asr: 'Speech Recognition (ASR)',
|
| 36 |
-
asrDesc: 'Whisper medium (English) · FunASR SeACo-Paraformer + CT-Transformer (Chinese)',
|
| 37 |
-
tts: 'Speech Synthesis (TTS)',
|
| 38 |
-
ttsDesc: 'MoYoYo TTS (GPT-SoVITS) · Kokoro (English)',
|
| 39 |
-
linksTitle: 'Repositories',
|
| 40 |
-
repoApp: 'App & source code',
|
| 41 |
-
repoVoices: 'Voice (tone) models',
|
| 42 |
-
copyright: '© 2025 MoYoYo · Models belong to their respective owners',
|
| 43 |
-
},
|
| 44 |
-
general: {
|
| 45 |
-
interfaceLanguage: 'Interface Language',
|
| 46 |
-
interfaceLanguageHint: 'Language of the application interface.',
|
| 47 |
-
},
|
| 48 |
-
audio: {
|
| 49 |
-
microphone: 'Microphone (Input Device)',
|
| 50 |
-
microphoneHint: 'Choose the input device, e.g. an external microphone array.',
|
| 51 |
-
systemDefault: 'System Default',
|
| 52 |
-
channelsSuffix: 'ch',
|
| 53 |
-
defaultSuffix: 'default',
|
| 54 |
-
speaker: 'Speaker (Output Device)',
|
| 55 |
-
speakerHint: 'Choose the output device for voice playback, e.g. an external speaker.',
|
| 56 |
-
echoCancellation: 'Echo Cancellation',
|
| 57 |
-
echoCancellationHint: 'Uses the system AEC on the default device. For an external array, echo is handled by the array hardware.',
|
| 58 |
-
},
|
| 59 |
-
recognition: {
|
| 60 |
-
language: 'Recognition Language',
|
| 61 |
-
languageHint: 'Language used for speech recognition (ASR).',
|
| 62 |
-
},
|
| 63 |
-
voice: {
|
| 64 |
-
role: 'Voice',
|
| 65 |
-
roleHint: 'The voice used for speech synthesis (TTS).',
|
| 66 |
-
playSample: 'Play sample',
|
| 67 |
-
},
|
| 68 |
-
prompt: {
|
| 69 |
-
title: 'System Prompt',
|
| 70 |
-
hint: 'Customize the system prompt for each language.',
|
| 71 |
-
},
|
| 72 |
-
applyFailed: 'Failed to apply settings',
|
| 73 |
-
},
|
| 74 |
-
}
|
frontend/src/i18n/locales/zh.ts
DELETED
|
@@ -1,74 +0,0 @@
|
|
| 1 |
-
export default {
|
| 2 |
-
common: {
|
| 3 |
-
cancel: '取消',
|
| 4 |
-
confirm: '确认',
|
| 5 |
-
reset: '重置',
|
| 6 |
-
save: '保存',
|
| 7 |
-
error: '错误',
|
| 8 |
-
},
|
| 9 |
-
lang: {
|
| 10 |
-
zh: '中文',
|
| 11 |
-
en: '英文',
|
| 12 |
-
auto: '自动',
|
| 13 |
-
},
|
| 14 |
-
welcome: {
|
| 15 |
-
title: '欢迎使用',
|
| 16 |
-
subtitle: '点击下方按钮开始对话',
|
| 17 |
-
start: '开始对话',
|
| 18 |
-
startFailed: '启动语音对话系统失败',
|
| 19 |
-
},
|
| 20 |
-
settings: {
|
| 21 |
-
title: '设置',
|
| 22 |
-
entry: '设置',
|
| 23 |
-
tabs: {
|
| 24 |
-
main: '常用',
|
| 25 |
-
language: '语言',
|
| 26 |
-
advanced: 'Prompt',
|
| 27 |
-
about: '关于',
|
| 28 |
-
},
|
| 29 |
-
about: {
|
| 30 |
-
tagline: '实时 AI 语音对话系统',
|
| 31 |
-
version: '版本',
|
| 32 |
-
modelsTitle: '使用的模型',
|
| 33 |
-
llm: '大语言模型 (LLM)',
|
| 34 |
-
llmDesc: 'Qwen3-8B(Q6_K,GGUF)· 基于 llama.cpp',
|
| 35 |
-
asr: '语音识别 (ASR)',
|
| 36 |
-
asrDesc: 'Whisper medium(英文)· FunASR SeACo-Paraformer + CT-Transformer(中文)',
|
| 37 |
-
tts: '语音合成 (TTS)',
|
| 38 |
-
ttsDesc: 'MoYoYo TTS(GPT-SoVITS)· Kokoro(英文)',
|
| 39 |
-
linksTitle: '开源仓库',
|
| 40 |
-
repoApp: '应用与源码',
|
| 41 |
-
repoVoices: '音色模型',
|
| 42 |
-
copyright: '© 2025 MoYoYo · 各模型版权归原作者所有',
|
| 43 |
-
},
|
| 44 |
-
general: {
|
| 45 |
-
interfaceLanguage: '界面语言',
|
| 46 |
-
interfaceLanguageHint: '应用界面所使用的语言。',
|
| 47 |
-
},
|
| 48 |
-
audio: {
|
| 49 |
-
microphone: '麦克风(输入设备)',
|
| 50 |
-
microphoneHint: '选择输入设备,例如外置麦克风阵列。',
|
| 51 |
-
systemDefault: '系统默认',
|
| 52 |
-
channelsSuffix: '声道',
|
| 53 |
-
defaultSuffix: '默认',
|
| 54 |
-
speaker: '扬声器(输出设备)',
|
| 55 |
-
speakerHint: '选择语音播放的输出设备,例如外置扬声器。',
|
| 56 |
-
echoCancellation: '回音消除',
|
| 57 |
-
echoCancellationHint: '默认设备使用系统 AEC;选择外置阵列时,回音由阵列硬件处理。',
|
| 58 |
-
},
|
| 59 |
-
recognition: {
|
| 60 |
-
language: '识别语言',
|
| 61 |
-
languageHint: '语音识别(ASR)所使用的语言。',
|
| 62 |
-
},
|
| 63 |
-
voice: {
|
| 64 |
-
role: '音色',
|
| 65 |
-
roleHint: '语音合成(TTS)所使用的音色。',
|
| 66 |
-
playSample: '试听',
|
| 67 |
-
},
|
| 68 |
-
prompt: {
|
| 69 |
-
title: '系统提示词',
|
| 70 |
-
hint: '为每种语言自定义系统提示词。',
|
| 71 |
-
},
|
| 72 |
-
applyFailed: '应用设置失败',
|
| 73 |
-
},
|
| 74 |
-
}
|
frontend/src/main.ts
CHANGED
|
@@ -9,7 +9,6 @@ import './style.scss'
|
|
| 9 |
|
| 10 |
import App from './App.vue'
|
| 11 |
import router from './router'
|
| 12 |
-
import i18n from './i18n'
|
| 13 |
|
| 14 |
|
| 15 |
// import * as Sentry from "@sentry/browser";
|
|
@@ -29,5 +28,4 @@ createApp(App)
|
|
| 29 |
.use(router)
|
| 30 |
.use(Antd)
|
| 31 |
.use(Vue3Lottie)
|
| 32 |
-
.use(i18n)
|
| 33 |
.mount('#app')
|
|
|
|
| 9 |
|
| 10 |
import App from './App.vue'
|
| 11 |
import router from './router'
|
|
|
|
| 12 |
|
| 13 |
|
| 14 |
// import * as Sentry from "@sentry/browser";
|
|
|
|
| 28 |
.use(router)
|
| 29 |
.use(Antd)
|
| 30 |
.use(Vue3Lottie)
|
|
|
|
| 31 |
.mount('#app')
|
frontend/src/stores/config.ts
CHANGED
|
@@ -8,11 +8,8 @@ export const useSettingsStore = defineStore({
|
|
| 8 |
return {
|
| 9 |
role: '',
|
| 10 |
language: 'zh',
|
| 11 |
-
uiLanguage: 'zh' as 'en' | 'zh',
|
| 12 |
sider_open: true,
|
| 13 |
echoCancel: true,
|
| 14 |
-
inputDeviceIndex: null as number | null,
|
| 15 |
-
outputDeviceIndex: null as number | null,
|
| 16 |
}
|
| 17 |
},
|
| 18 |
actions: {
|
|
|
|
| 8 |
return {
|
| 9 |
role: '',
|
| 10 |
language: 'zh',
|
|
|
|
| 11 |
sider_open: true,
|
| 12 |
echoCancel: true,
|
| 13 |
}
|
| 14 |
},
|
| 15 |
actions: {
|
frontend/src/style.scss
CHANGED
|
@@ -173,68 +173,3 @@ $FormItemWidth: 1022px;
|
|
| 173 |
.ant-layout-sider-collapsed .ant-menu-submenu-title {
|
| 174 |
display: none;
|
| 175 |
}
|
| 176 |
-
|
| 177 |
-
/* ============================================================
|
| 178 |
-
Liquid Glass: Apple-style glassmorphism (global)
|
| 179 |
-
Translucency + background blur + soft borders/shadows; border radius is unified by the theme token
|
| 180 |
-
============================================================ */
|
| 181 |
-
|
| 182 |
-
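/* Modals use Ant's built-in fade transition (pure opacity animation, no transform),
|
| 183 |
-
which avoids the flicker caused by backdrop-filter failing during transform animations; the panel and its blur fade in smoothly together */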
|
| 184 |
-
|
| 185 |
-
/* Modal: frosted-glass panel */
|
| 186 |
-
.ant-modal .ant-modal-content {
|
| 187 |
-
background: rgba(255, 255, 255, 0.62) !important;
|
| 188 |
-
backdrop-filter: blur(28px) saturate(140%);
|
| 189 |
-
-webkit-backdrop-filter: blur(28px) saturate(140%);
|
| 190 |
-
border: 1px solid rgba(255, 255, 255, 0.6);
|
| 191 |
-
border-radius: 22px !important;
|
| 192 |
-
box-shadow: 0 16px 48px rgba(31, 38, 135, 0.18);
|
| 193 |
-
}
|
| 194 |
-
.ant-modal .ant-modal-header {
|
| 195 |
-
background: transparent !important;
|
| 196 |
-
}
|
| 197 |
-
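/* Mask: full-screen frosted glass, with slight dimming + background blur.
|
| 198 |
-
The mask fades in with ant-fade (opacity), so the blur appears smoothly and the background text blurs together with the rest of the view instead of "flashing out" */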
|
| 199 |
-
.ant-modal-mask {
|
| 200 |
-
background: rgba(20, 22, 30, 0.12) !important;
|
| 201 |
-
backdrop-filter: blur(14px) saturate(120%);
|
| 202 |
-
-webkit-backdrop-filter: blur(14px) saturate(120%);
|
| 203 |
-
}
|
| 204 |
-
|
| 205 |
-
/* Input controls: translucent glass */
|
| 206 |
-
.ant-select .ant-select-selector,
|
| 207 |
-
.ant-input,
|
| 208 |
-
textarea.ant-input,
|
| 209 |
-
.ant-input-affix-wrapper {
|
| 210 |
-
background: rgba(255, 255, 255, 0.45) !important;
|
| 211 |
-
backdrop-filter: blur(8px);
|
| 212 |
-
-webkit-backdrop-filter: blur(8px);
|
| 213 |
-
border: 1px solid rgba(255, 255, 255, 0.7) !important;
|
| 214 |
-
}
|
| 215 |
-
|
| 216 |
-
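/* Buttons: unified shape (border radius from the token) + soft shadow; default buttons get a glass look, primary buttons stay solid.
|
| 217 |
-
Text/link buttons (such as the small speaker icon for voice preview) stay transparent with no shadow */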
|
| 218 |
-
.ant-btn:not(.ant-btn-text):not(.ant-btn-link) {
|
| 219 |
-
box-shadow: 0 2px 10px rgba(31, 38, 135, 0.10);
|
| 220 |
-
}
|
| 221 |
-
.ant-btn-default {
|
| 222 |
-
background: rgba(255, 255, 255, 0.5) !important;
|
| 223 |
-
border: 1px solid rgba(255, 255, 255, 0.75) !important;
|
| 224 |
-
backdrop-filter: blur(8px);
|
| 225 |
-
-webkit-backdrop-filter: blur(8px);
|
| 226 |
-
}
|
| 227 |
-
.ant-btn-text {
|
| 228 |
-
box-shadow: none !important;
|
| 229 |
-
background: transparent !important;
|
| 230 |
-
}
|
| 231 |
-
|
| 232 |
-
/* Segmented radio group (Chinese/English, etc.): round both ends to soften the boxy look */
|
| 233 |
-
.ant-radio-group-solid .ant-radio-button-wrapper:first-child {
|
| 234 |
-
border-top-left-radius: 12px;
|
| 235 |
-
border-bottom-left-radius: 12px;
|
| 236 |
-
}
|
| 237 |
-
.ant-radio-group-solid .ant-radio-button-wrapper:last-child {
|
| 238 |
-
border-top-right-radius: 12px;
|
| 239 |
-
border-bottom-right-radius: 12px;
|
| 240 |
-
}
|
|
|
|
| 173 |
.ant-layout-sider-collapsed .ant-menu-submenu-title {
|
| 174 |
display: none;
|
| 175 |
}
|
frontend/src/views/Home/Components/ChatText.vue
CHANGED
|
@@ -69,13 +69,9 @@ watch(() => props.chatContent, (newVal, oldVal) => {
|
|
| 69 |
<style lang="scss" scoped>
|
| 70 |
.talk-wrapper {
|
| 71 |
width: auto;
|
| 72 |
-
|
| 73 |
-
|
| 74 |
-
|
| 75 |
-
box-sizing: border-box;
|
| 76 |
-
height: calc(100vh - 150px);
|
| 77 |
-
overflow-y: auto;
|
| 78 |
-
padding: 20px 32px 0;
|
| 79 |
display: flex;
|
| 80 |
flex-direction: column;
|
| 81 |
align-items: flex-start;
|
|
@@ -89,15 +85,13 @@ watch(() => props.chatContent, (newVal, oldVal) => {
|
|
| 89 |
justify-content: flex-start;
|
| 90 |
align-items: flex-start;
|
| 91 |
.text-left {
|
| 92 |
-
max-width: 88%;
|
| 93 |
color: #222;
|
| 94 |
font-size: 16px;
|
| 95 |
font-weight: 400;
|
| 96 |
text-align: left;
|
| 97 |
-
line-height:
|
| 98 |
margin-left: 12px;
|
| 99 |
margin-top: 6px;
|
| 100 |
-
word-break: break-word;
|
| 101 |
}
|
| 102 |
}
|
| 103 |
|
|
@@ -109,18 +103,16 @@ watch(() => props.chatContent, (newVal, oldVal) => {
|
|
| 109 |
align-items: flex-start;
|
| 110 |
|
| 111 |
.text-right {
|
| 112 |
-
max-width: 80%;
|
| 113 |
color: #444;
|
| 114 |
font-size: 16px;
|
| 115 |
font-weight: 400;
|
| 116 |
-
text-align:
|
| 117 |
-
line-height:
|
| 118 |
margin-right: 12px;
|
| 119 |
background: #ccc;
|
| 120 |
border-radius: 8px;
|
| 121 |
border-top-right-radius: 0;
|
| 122 |
-
padding: 8px
|
| 123 |
-
word-break: break-word;
|
| 124 |
}
|
| 125 |
}
|
| 126 |
}
|
|
|
|
| 69 |
<style lang="scss" scoped>
|
| 70 |
.talk-wrapper {
|
| 71 |
width: auto;
|
| 72 |
+
height: calc(100vh - 100px);
|
| 73 |
+
overflow-y: scroll;
|
| 74 |
+
padding: 20px 240px 0 240px;
|
| 75 |
display: flex;
|
| 76 |
flex-direction: column;
|
| 77 |
align-items: flex-start;
|
|
|
|
| 85 |
justify-content: flex-start;
|
| 86 |
align-items: flex-start;
|
| 87 |
.text-left {
|
|
|
|
| 88 |
color: #222;
|
| 89 |
font-size: 16px;
|
| 90 |
font-weight: 400;
|
| 91 |
text-align: left;
|
| 92 |
+
line-height: 2;
|
| 93 |
margin-left: 12px;
|
| 94 |
margin-top: 6px;
|
|
|
|
| 95 |
}
|
| 96 |
}
|
| 97 |
|
|
|
|
| 103 |
align-items: flex-start;
|
| 104 |
|
| 105 |
.text-right {
|
|
|
|
| 106 |
color: #444;
|
| 107 |
font-size: 16px;
|
| 108 |
font-weight: 400;
|
| 109 |
+
text-align: end;
|
| 110 |
+
line-height: 2;
|
| 111 |
margin-right: 12px;
|
| 112 |
background: #ccc;
|
| 113 |
border-radius: 8px;
|
| 114 |
border-top-right-radius: 0;
|
| 115 |
+
padding: 8px;
|
|
|
|
| 116 |
}
|
| 117 |
}
|
| 118 |
}
|
frontend/src/views/Home/index.vue
CHANGED
|
@@ -387,7 +387,6 @@ const toggleText = () => {
|
|
| 387 |
.actions {
|
| 388 |
width: 100%;
|
| 389 |
height: 100px;
|
| 390 |
-
margin-bottom: 32px;
|
| 391 |
|
| 392 |
display: flex;
|
| 393 |
justify-content: space-between;
|
|
@@ -402,17 +401,7 @@ const toggleText = () => {
|
|
| 402 |
height: 96px;
|
| 403 |
display: flex;
|
| 404 |
justify-content: space-around;
|
| 405 |
-
align-items:
|
| 406 |
-
|
| 407 |
-
// Liquid Glass circular buttons (consistent with the Welcome settings button)
|
| 408 |
-
:deep(.ant-btn) {
|
| 409 |
-
border-radius: 50% !important;
|
| 410 |
-
background: rgba(255, 255, 255, 0.5) !important;
|
| 411 |
-
border: 1px solid rgba(255, 255, 255, 0.7) !important;
|
| 412 |
-
backdrop-filter: blur(10px);
|
| 413 |
-
-webkit-backdrop-filter: blur(10px);
|
| 414 |
-
box-shadow: 0 4px 16px rgba(31, 38, 135, 0.12);
|
| 415 |
-
}
|
| 416 |
}
|
| 417 |
.download-wrapper {
|
| 418 |
width: 64px;
|
|
|
|
| 387 |
.actions {
|
| 388 |
width: 100%;
|
| 389 |
height: 100px;
|
|
|
|
| 390 |
|
| 391 |
display: flex;
|
| 392 |
justify-content: space-between;
|
|
|
|
| 401 |
height: 96px;
|
| 402 |
display: flex;
|
| 403 |
justify-content: space-around;
|
| 404 |
+
align-items: flex-start;
|
| 405 |
}
|
| 406 |
.download-wrapper {
|
| 407 |
width: 64px;
|
frontend/src/views/Welcome/Components/SettingsModal.vue
DELETED
|
@@ -1,581 +0,0 @@
|
|
| 1 |
-
<script setup lang="ts">
|
| 2 |
-
import { ref, reactive, computed, watch, onUnmounted } from "vue";
|
| 3 |
-
import { Modal } from "ant-design-vue";
|
| 4 |
-
import { SoundTwoTone, SoundOutlined, TranslationOutlined, AudioOutlined } from "@ant-design/icons-vue";
|
| 5 |
-
import { useI18n } from "vue-i18n";
|
| 6 |
-
import axios from "axios";
|
| 7 |
-
import { useSettingsStore } from "@/stores/config.ts";
|
| 8 |
-
import { setUiLocale, UiLocale } from "@/i18n";
|
| 9 |
-
|
| 10 |
-
const props = defineProps({
|
| 11 |
-
open: { type: Boolean, default: false },
|
| 12 |
-
});
|
| 13 |
-
const emit = defineEmits(["update:open"]);
|
| 14 |
-
|
| 15 |
-
const { t } = useI18n();
|
| 16 |
-
const base_url = axios.defaults.baseURL;
|
| 17 |
-
const settingsStore = useSettingsStore();
|
| 18 |
-
|
| 19 |
-
const activeTab = ref<string>("main");
|
| 20 |
-
const loading = ref<boolean>(false);
|
| 21 |
-
const appVersion = "1.2.0";
|
| 22 |
-
|
| 23 |
-
// ---- Local state for each setting (synced from the store / backend on open) ----
|
| 24 |
-
const uiLanguage = ref<UiLocale>((settingsStore.$state.uiLanguage as UiLocale) ?? "en");
|
| 25 |
-
const recognitionLanguage = ref<string>(settingsStore.$state.language || "zh");
|
| 26 |
-
const echoCancel = ref<boolean>(settingsStore.$state.echoCancel ?? true);
|
| 27 |
-
const inputDeviceIndex = ref<number | null>(settingsStore.$state.inputDeviceIndex ?? null);
|
| 28 |
-
const outputDeviceIndex = ref<number | null>(settingsStore.$state.outputDeviceIndex ?? null);
|
| 29 |
-
const role = ref<string>(settingsStore.$state.role || "");
|
| 30 |
-
|
| 31 |
-
const languages = reactive<string[]>([]);
|
| 32 |
-
const inputDevices = reactive<any[]>([]);
|
| 33 |
-
const outputDevices = reactive<any[]>([]);
|
| 34 |
-
const roles = reactive<any[]>([]);
|
| 35 |
-
|
| 36 |
-
// ---- Prompt ----
|
| 37 |
-
const promptLang = ref<string>("zh");
|
| 38 |
-
const default_prompt_en = ref<string>("");
|
| 39 |
-
const default_prompt_zh = ref<string>("");
|
| 40 |
-
const current_prompt_en = ref<string>("");
|
| 41 |
-
const current_prompt_zh = ref<string>("");
|
| 42 |
-
|
| 43 |
-
const filteredRoles = computed(() => {
|
| 44 |
-
const is_chinese = recognitionLanguage.value === "zh";
|
| 45 |
-
return roles.filter((r) => r["is_chinese_voice"] === is_chinese);
|
| 46 |
-
});
|
| 47 |
-
|
| 48 |
-
// After the recognition language changes, automatically select the first matching voice
|
| 49 |
-
watch(
|
| 50 |
-
() => recognitionLanguage.value,
|
| 51 |
-
() => {
|
| 52 |
-
if (filteredRoles.value.length > 0) {
|
| 53 |
-
const exists = filteredRoles.value.find((r) => r["id"] === role.value);
|
| 54 |
-
role.value = exists ? role.value : filteredRoles.value[0]["id"];
|
| 55 |
-
} else {
|
| 56 |
-
role.value = "";
|
| 57 |
-
}
|
| 58 |
-
}
|
| 59 |
-
);
|
| 60 |
-
|
| 61 |
-
// Apply the interface language immediately (so the user sees the switch right away)
|
| 62 |
-
watch(uiLanguage, (v) => setUiLocale(v));
|
| 63 |
-
|
| 64 |
-
// ---- Data loading ----
|
| 65 |
-
const fetchASRLanguages = async () => {
|
| 66 |
-
try {
|
| 67 |
-
const res = await fetch(`${base_url}/asr/languages`);
|
| 68 |
-
const data = await res.json();
|
| 69 |
-
if (data?.languages) {
|
| 70 |
-
languages.splice(0, languages.length, ...data.languages);
|
| 71 |
-
// Prefer the locally saved/default recognition language (default: Chinese) rather than letting the backend's current value override it
|
| 72 |
-
const saved = settingsStore.$state.language;
|
| 73 |
-
recognitionLanguage.value = saved && data.languages.includes(saved)
|
| 74 |
-
? saved
|
| 75 |
-
: (data.languages.includes('zh') ? 'zh' : data.languages[0]);
|
| 76 |
-
}
|
| 77 |
-
} catch (e) {
|
| 78 |
-
console.error("Error fetching ASR languages:", e);
|
| 79 |
-
}
|
| 80 |
-
};
|
| 81 |
-
|
| 82 |
-
const fetchTTSRoles = async () => {
|
| 83 |
-
try {
|
| 84 |
-
const res = await fetch(`${base_url}/tts/models`);
|
| 85 |
-
const data = await res.json();
|
| 86 |
-
if (data?.models) {
|
| 87 |
-
roles.splice(0, roles.length, ...data.models);
|
| 88 |
-
if (data.current_model_id) role.value = data.current_model_id;
|
| 89 |
-
}
|
| 90 |
-
} catch (e) {
|
| 91 |
-
console.error("Error fetching TTS roles:", e);
|
| 92 |
-
}
|
| 93 |
-
};
|
| 94 |
-
|
| 95 |
-
const fetchInputDevices = async () => {
|
| 96 |
-
try {
|
| 97 |
-
const res = await fetch(`${base_url}/system/audio-devices`);
|
| 98 |
-
const data = await res.json();
|
| 99 |
-
if (data?.devices) {
|
| 100 |
-
inputDevices.splice(0, inputDevices.length, ...data.devices);
|
| 101 |
-
const saved = settingsStore.$state.inputDeviceIndex;
|
| 102 |
-
const exists = saved != null && data.devices.some((d: any) => d.index === saved);
|
| 103 |
-
inputDeviceIndex.value = exists ? saved : (data.current_device_index ?? null);
|
| 104 |
-
}
|
| 105 |
-
if (data?.output_devices) {
|
| 106 |
-
outputDevices.splice(0, outputDevices.length, ...data.output_devices);
|
| 107 |
-
const saved = settingsStore.$state.outputDeviceIndex;
|
| 108 |
-
const exists = saved != null && data.output_devices.some((d: any) => d.index === saved);
|
| 109 |
-
outputDeviceIndex.value = exists ? saved : (data.current_output_device_index ?? null);
|
| 110 |
-
}
|
| 111 |
-
} catch (e) {
|
| 112 |
-
console.error("Error fetching input devices:", e);
|
| 113 |
-
}
|
| 114 |
-
};
|
| 115 |
-
|
| 116 |
-
// ASR engine currently in effect (returned by the backend; distinguishes Qwen / FunASR+Whisper, etc.)
|
| 117 |
-
const asrEngineName = ref<string>("");
|
| 118 |
-
const asrEngineKeys = ref<string[]>([]);
|
| 119 |
-
const ASR_ENGINE_LINKS: Record<string, { name: string; url: string }> = {
|
| 120 |
-
qwen: { name: "Qwen3-ASR", url: "https://huggingface.co/Qwen/Qwen3-ASR-1.7B" },
|
| 121 |
-
whisper: { name: "whisper.cpp", url: "https://github.com/ggerganov/whisper.cpp" },
|
| 122 |
-
funasr: { name: "FunASR", url: "https://github.com/modelscope/FunASR" },
|
| 123 |
-
};
|
| 124 |
-
const asrEngineLinks = computed(() => {
|
| 125 |
-
const keys = asrEngineKeys.value.length ? asrEngineKeys.value : ["whisper", "funasr"];
|
| 126 |
-
return keys.map((k) => ASR_ENGINE_LINKS[k]).filter(Boolean);
|
| 127 |
-
});
|
| 128 |
-
const fetchAsrEngine = async () => {
|
| 129 |
-
try {
|
| 130 |
-
const res = await fetch(`${base_url}/system/asr-engine`);
|
| 131 |
-
const data = await res.json();
|
| 132 |
-
if (data?.display_name) asrEngineName.value = data.display_name;
|
| 133 |
-
if (data?.mappings) asrEngineKeys.value = [...new Set(Object.values(data.mappings) as string[])].sort();
|
| 134 |
-
} catch (e) {
|
| 135 |
-
console.error("Error fetching ASR engine:", e);
|
| 136 |
-
}
|
| 137 |
-
};
|
| 138 |
-
|
| 139 |
-
const fetchPrompts = async () => {
|
| 140 |
-
try {
|
| 141 |
-
const [cur, def] = await Promise.all([
|
| 142 |
-
fetch(`${base_url}/settings/settings/prompts`).then((r) => r.json()),
|
| 143 |
-
fetch(`${base_url}/settings/settings/prompts/default`).then((r) => r.json()),
|
| 144 |
-
]);
|
| 145 |
-
if (cur) {
|
| 146 |
-
current_prompt_en.value = cur.english_prompt;
|
| 147 |
-
current_prompt_zh.value = cur.chinese_prompt;
|
| 148 |
-
}
|
| 149 |
-
if (def) {
|
| 150 |
-
default_prompt_en.value = def.english_prompt;
|
| 151 |
-
default_prompt_zh.value = def.chinese_prompt;
|
| 152 |
-
}
|
| 153 |
-
} catch (e) {
|
| 154 |
-
console.error("Error fetching prompts:", e);
|
| 155 |
-
}
|
| 156 |
-
};
|
| 157 |
-
|
| 158 |
-
const resetPrompt = (lang: string) => {
|
| 159 |
-
if (lang === "en") current_prompt_en.value = default_prompt_en.value;
|
| 160 |
-
else current_prompt_zh.value = default_prompt_zh.value;
|
| 161 |
-
};
|
| 162 |
-
|
| 163 |
-
// ---- Submit / Cancel ----
|
| 164 |
-
const applySettings = async () => {
|
| 165 |
-
loading.value = true;
|
| 166 |
-
try {
|
| 167 |
-
// 1. Persist to the local store
|
| 168 |
-
settingsStore.$state.uiLanguage = uiLanguage.value;
|
| 169 |
-
settingsStore.$state.language = recognitionLanguage.value;
|
| 170 |
-
settingsStore.$state.role = role.value || "";
|
| 171 |
-
settingsStore.$state.echoCancel = echoCancel.value;
|
| 172 |
-
settingsStore.$state.inputDeviceIndex = inputDeviceIndex.value;
|
| 173 |
-
settingsStore.$state.outputDeviceIndex = outputDeviceIndex.value;
|
| 174 |
-
|
| 175 |
-
// The output device takes effect on save (a change made mid-session applies from the next utterance)
|
| 176 |
-
await fetch(`${base_url}/system/audio-output-device`, {
|
| 177 |
-
method: "POST",
|
| 178 |
-
headers: { "Content-Type": "application/json" },
|
| 179 |
-
body: JSON.stringify({ output_device_index: outputDeviceIndex.value }),
|
| 180 |
-
});
|
| 181 |
-
|
| 182 |
-
// 2. Push the TTS voice + ASR language to the backend
|
| 183 |
-
if (role.value) {
|
| 184 |
-
const r1 = await fetch(`${base_url}/tts/models/load`, {
|
| 185 |
-
method: "POST",
|
| 186 |
-
headers: { "Content-Type": "application/json" },
|
| 187 |
-
body: JSON.stringify({ model_id: role.value }),
|
| 188 |
-
});
|
| 189 |
-
if (!r1.ok) throw new Error(`TTS load failed: ${r1.status}`);
|
| 190 |
-
}
|
| 191 |
-
const r2 = await fetch(`${base_url}/asr/instance/create`, {
|
| 192 |
-
method: "POST",
|
| 193 |
-
headers: { "Content-Type": "application/json" },
|
| 194 |
-
body: JSON.stringify({ language: recognitionLanguage.value }),
|
| 195 |
-
});
|
| 196 |
-
if (!r2.ok) throw new Error(`ASR set failed: ${r2.status}`);
|
| 197 |
-
|
| 198 |
-
// 3. Save the prompts
|
| 199 |
-
await fetch(`${base_url}/settings/settings/prompts`, {
|
| 200 |
-
method: "POST",
|
| 201 |
-
headers: { "Content-Type": "application/json" },
|
| 202 |
-
body: JSON.stringify({
|
| 203 |
-
chinese_prompt: current_prompt_zh.value,
|
| 204 |
-
english_prompt: current_prompt_en.value,
|
| 205 |
-
}),
|
| 206 |
-
});
|
| 207 |
-
|
| 208 |
-
emit("update:open", false);
|
| 209 |
-
} catch (err) {
|
| 210 |
-
console.error("Error applying settings:", err);
|
| 211 |
-
Modal.error({ title: t("common.error"), content: t("settings.applyFailed") });
|
| 212 |
-
} finally {
|
| 213 |
-
loading.value = false;
|
| 214 |
-
}
|
| 215 |
-
};
|
| 216 |
-
|
| 217 |
-
const handleCancel = () => {
|
| 218 |
-
// Restore local state and the UI language
|
| 219 |
-
uiLanguage.value = (settingsStore.$state.uiLanguage as UiLocale) ?? "en";
|
| 220 |
-
setUiLocale(uiLanguage.value);
|
| 221 |
-
recognitionLanguage.value = settingsStore.$state.language || "zh";
|
| 222 |
-
echoCancel.value = settingsStore.$state.echoCancel ?? true;
|
| 223 |
-
inputDeviceIndex.value = settingsStore.$state.inputDeviceIndex ?? null;
|
| 224 |
-
outputDeviceIndex.value = settingsStore.$state.outputDeviceIndex ?? null;
|
| 225 |
-
role.value = settingsStore.$state.role || "";
|
| 226 |
-
emit("update:open", false);
|
| 227 |
-
};
|
| 228 |
-
|
| 229 |
-
watch(
|
| 230 |
-
() => props.open,
|
| 231 |
-
(isOpen) => {
|
| 232 |
-
if (isOpen) {
|
| 233 |
-
activeTab.value = "main";
|
| 234 |
-
uiLanguage.value = (settingsStore.$state.uiLanguage as UiLocale) ?? "en";
|
| 235 |
-
fetchASRLanguages();
|
| 236 |
-
fetchTTSRoles();
|
| 237 |
-
fetchInputDevices();
|
| 238 |
-
fetchPrompts();
|
| 239 |
-
fetchAsrEngine();
|
| 240 |
-
}
|
| 241 |
-
}
|
| 242 |
-
);
|
| 243 |
-
|
| 244 |
-
// ---- Voice preview ----
|
| 245 |
-
const currentPlayingId = ref<string | null>(null);
|
| 246 |
-
const currentAudio = ref<HTMLAudioElement | null>(null);
|
| 247 |
-
const isPlaying = (id: string) => currentPlayingId.value === id;
|
| 248 |
-
|
| 249 |
-
const playRefAudio = async (id: string, e: Event) => {
|
| 250 |
-
e.stopPropagation();
|
| 251 |
-
e.preventDefault();
|
| 252 |
-
try {
|
| 253 |
-
if (currentPlayingId.value === id && currentAudio.value) {
|
| 254 |
-
currentAudio.value.pause();
|
| 255 |
-
currentAudio.value = null;
|
| 256 |
-
currentPlayingId.value = null;
|
| 257 |
-
return;
|
| 258 |
-
}
|
| 259 |
-
if (currentAudio.value) {
|
| 260 |
-
currentAudio.value.pause();
|
| 261 |
-
currentAudio.value = null;
|
| 262 |
-
}
|
| 263 |
-
const audio = new Audio(`${base_url}/tts/models/${id}/reference-audio`);
|
| 264 |
-
audio.addEventListener("ended", () => {
|
| 265 |
-
currentPlayingId.value = null;
|
| 266 |
-
currentAudio.value = null;
|
| 267 |
-
});
|
| 268 |
-
await audio.play();
|
| 269 |
-
currentPlayingId.value = id;
|
| 270 |
-
currentAudio.value = audio;
|
| 271 |
-
} catch (err) {
|
| 272 |
-
currentPlayingId.value = null;
|
| 273 |
-
currentAudio.value = null;
|
| 274 |
-
}
|
| 275 |
-
};
|
| 276 |
-
|
| 277 |
-
onUnmounted(() => {
|
| 278 |
-
if (currentAudio.value) currentAudio.value.pause();
|
| 279 |
-
});
|
| 280 |
-
</script>
|
| 281 |
-
|
| 282 |
-
<template>
|
| 283 |
-
<a-modal
|
| 284 |
-
:open="props.open"
|
| 285 |
-
:title="t('settings.title')"
|
| 286 |
-
:mask-closable="false"
|
| 287 |
-
:closable="true"
|
| 288 |
-
:width="600"
|
| 289 |
-
centered
|
| 290 |
-
transition-name="ant-fade"
|
| 291 |
-
@cancel="handleCancel"
|
| 292 |
-
@update:open="(v: boolean) => emit('update:open', v)"
|
| 293 |
-
>
|
| 294 |
-
<template #footer>
|
| 295 |
-
<a-button key="back" @click="handleCancel">{{ t('common.cancel') }}</a-button>
|
| 296 |
-
<a-button key="confirm" type="primary" :loading="loading" @click="applySettings">
|
| 297 |
-
{{ t('common.confirm') }}
|
| 298 |
-
</a-button>
|
| 299 |
-
</template>
|
| 300 |
-
|
| 301 |
-
<a-tabs v-model:activeKey="activeTab" class="settings-tabs">
|
| 302 |
-
<!-- Common: input source + echo cancellation + voice (what most users care about) -->
|
| 303 |
-
<a-tab-pane key="main" :tab="t('settings.tabs.main')">
|
| 304 |
-
<div class="tab-body">
|
| 305 |
-
<div class="setting-row">
|
| 306 |
-
<label>{{ t('settings.audio.microphone') }}</label>
|
| 307 |
-
<a-select v-model:value="inputDeviceIndex" style="width: 100%;">
|
| 308 |
-
<a-select-option :value="null">{{ t('settings.audio.systemDefault') }}</a-select-option>
|
| 309 |
-
<a-select-option v-for="dev in inputDevices" :value="dev.index" :key="dev.index">
|
| 310 |
-
{{ dev.name }}<template v-if="dev.max_input_channels > 1"> ({{ dev.max_input_channels }}{{ t('settings.audio.channelsSuffix') }})</template><template v-if="dev.is_default"> · {{ t('settings.audio.defaultSuffix') }}</template>
|
| 311 |
-
</a-select-option>
|
| 312 |
-
</a-select>
|
| 313 |
-
</div>
|
| 314 |
-
<div class="setting-row">
|
| 315 |
-
<label>{{ t('settings.audio.speaker') }}</label>
|
| 316 |
-
<a-select v-model:value="outputDeviceIndex" style="width: 100%;">
|
| 317 |
-
<a-select-option :value="null">{{ t('settings.audio.systemDefault') }}</a-select-option>
|
| 318 |
-
<a-select-option v-for="dev in outputDevices" :value="dev.index" :key="dev.index">
|
| 319 |
-
{{ dev.name }}<template v-if="dev.is_default"> · {{ t('settings.audio.defaultSuffix') }}</template>
|
| 320 |
-
</a-select-option>
|
| 321 |
-
</a-select>
|
| 322 |
-
</div>
|
| 323 |
-
<div class="setting-row">
|
| 324 |
-
<div class="row-inline">
|
| 325 |
-
<label>{{ t('settings.audio.echoCancellation') }}</label>
|
| 326 |
-
<a-switch v-model:checked="echoCancel" />
|
| 327 |
-
</div>
|
| 328 |
-
</div>
|
| 329 |
-
<div class="setting-row">
|
| 330 |
-
<label>{{ t('settings.voice.role') }}</label>
|
| 331 |
-
<a-radio-group v-model:value="role" class="voice-group">
|
| 332 |
-
<a-radio v-for="r in filteredRoles" :value="r['id']" :key="r['id']" class="voice-radio">
|
| 333 |
-
<span class="voice-name">{{ r['character_name'] }}</span>
|
| 334 |
-
<a-button
|
| 335 |
-
type="text"
|
| 336 |
-
class="audio-play-btn"
|
| 337 |
-
:class="{ playing: isPlaying(r['id']) }"
|
| 338 |
-
@click="playRefAudio(r['id'], $event)"
|
| 339 |
-
>
|
| 340 |
-
<SoundTwoTone v-if="isPlaying(r['id'])" style="font-size: 16px; color: #52c41a;" />
|
| 341 |
-
<SoundOutlined v-else style="font-size: 16px; color: #1890ff;" />
|
| 342 |
-
</a-button>
|
| 343 |
-
</a-radio>
|
| 344 |
-
</a-radio-group>
|
| 345 |
-
</div>
|
| 346 |
-
</div>
|
| 347 |
-
</a-tab-pane>
|
| 348 |
-
|
| 349 |
-
<!-- Language: UI language + recognition language -->
|
| 350 |
-
<a-tab-pane key="language" :tab="t('settings.tabs.language')">
|
| 351 |
-
<div class="tab-body">
|
| 352 |
-
<div class="setting-row">
|
| 353 |
-
<label><TranslationOutlined class="label-icon" />{{ t('settings.general.interfaceLanguage') }}</label>
|
| 354 |
-
<a-select v-model:value="uiLanguage" style="width: 100%;">
|
| 355 |
-
<a-select-option value="zh">{{ t('lang.zh') }}</a-select-option>
|
| 356 |
-
<a-select-option value="en">{{ t('lang.en') }}</a-select-option>
|
| 357 |
-
</a-select>
|
| 358 |
-
<p class="hint">{{ t('settings.general.interfaceLanguageHint') }}</p>
|
| 359 |
-
</div>
|
| 360 |
-
<div class="setting-row">
|
| 361 |
-
<label><AudioOutlined class="label-icon" />{{ t('settings.recognition.language') }}</label>
|
| 362 |
-
<a-select v-model:value="recognitionLanguage" style="width: 100%;">
|
| 363 |
-
<a-select-option v-for="lan in languages" :value="lan" :key="lan">
|
| 364 |
-
{{ t('lang.' + lan) }}
|
| 365 |
-
</a-select-option>
|
| 366 |
-
</a-select>
|
| 367 |
-
<p class="hint">{{ t('settings.recognition.languageHint') }}</p>
|
| 368 |
-
</div>
|
| 369 |
-
</div>
|
| 370 |
-
</a-tab-pane>
|
| 371 |
-
|
| 372 |
-
<!-- Advanced: system prompt -->
|
| 373 |
-
<a-tab-pane key="advanced" :tab="t('settings.tabs.advanced')">
|
| 374 |
-
<div class="tab-body">
|
| 375 |
-
<div class="setting-row">
|
| 376 |
-
<label>{{ t('settings.prompt.title') }}</label>
|
| 377 |
-
<a-radio-group button-style="solid" size="small" v-model:value="promptLang" style="margin-bottom: 12px;">
|
| 378 |
-
<a-radio-button value="zh">{{ t('lang.zh') }}</a-radio-button>
|
| 379 |
-
<a-radio-button value="en">{{ t('lang.en') }}</a-radio-button>
|
| 380 |
-
</a-radio-group>
|
| 381 |
-
<div v-show="promptLang === 'zh'">
|
| 382 |
-
<a-textarea v-model:value="current_prompt_zh" :placeholder="default_prompt_zh"
|
| 383 |
-
:auto-size="{ minRows: 6, maxRows: 10 }" show-count :maxlength="2000" allow-clear />
|
| 384 |
-
<a-button size="small" @click="resetPrompt('zh')" style="margin-top: 12px;">{{ t('common.reset') }}</a-button>
|
| 385 |
-
</div>
|
| 386 |
-
<div v-show="promptLang === 'en'">
|
| 387 |
-
<a-textarea v-model:value="current_prompt_en" :placeholder="default_prompt_en"
|
| 388 |
-
:auto-size="{ minRows: 6, maxRows: 10 }" show-count :maxlength="2000" allow-clear />
|
| 389 |
-
<a-button size="small" @click="resetPrompt('en')" style="margin-top: 12px;">{{ t('common.reset') }}</a-button>
|
| 390 |
-
</div>
|
| 391 |
-
</div>
|
| 392 |
-
</div>
|
| 393 |
-
</a-tab-pane>
|
| 394 |
-
|
| 395 |
-
<!-- About -->
|
| 396 |
-
<a-tab-pane key="about" :tab="t('settings.tabs.about')">
|
| 397 |
-
<div class="tab-body about">
|
| 398 |
-
<div class="about-head">
|
| 399 |
-
<div class="about-name">Voice Dialogue</div>
|
| 400 |
-
<div class="about-ver">{{ t('settings.about.version') }} {{ appVersion }}</div>
|
| 401 |
-
<div class="about-tagline">{{ t('settings.about.tagline') }}</div>
|
| 402 |
-
</div>
|
| 403 |
-
|
| 404 |
-
<div class="about-section">
|
| 405 |
-
<div class="about-section-title">{{ t('settings.about.modelsTitle') }}</div>
|
| 406 |
-
<div class="about-item">
|
| 407 |
-
<div class="about-item-label">{{ t('settings.about.llm') }}</div>
|
| 408 |
-
<div class="about-item-desc">
|
| 409 |
-
{{ t('settings.about.llmDesc') }}
|
| 410 |
-
<a href="https://huggingface.co/Qwen/Qwen3-8B" target="_blank" rel="noopener">Qwen3 ↗</a>
|
| 411 |
-
</div>
|
| 412 |
-
</div>
|
| 413 |
-
<div class="about-item">
|
| 414 |
-
<div class="about-item-label">{{ t('settings.about.asr') }}</div>
|
| 415 |
-
<div class="about-item-desc">
|
| 416 |
-
{{ asrEngineName || t('settings.about.asrDesc') }}
|
| 417 |
-
<a v-for="link in asrEngineLinks" :key="link.url" :href="link.url" target="_blank"
|
| 418 |
-
rel="noopener">{{ link.name }} ↗</a>
|
| 419 |
-
</div>
|
| 420 |
-
</div>
|
| 421 |
-
<div class="about-item">
|
| 422 |
-
<div class="about-item-label">{{ t('settings.about.tts') }}</div>
|
| 423 |
-
<div class="about-item-desc">
|
| 424 |
-
{{ t('settings.about.ttsDesc') }}
|
| 425 |
-
<a href="https://github.com/RVC-Boss/GPT-SoVITS" target="_blank" rel="noopener">GPT-SoVITS ↗</a>
|
| 426 |
-
<a href="https://huggingface.co/hexgrad/Kokoro-82M" target="_blank" rel="noopener">Kokoro ↗</a>
|
| 427 |
-
</div>
|
| 428 |
-
</div>
|
| 429 |
-
</div>
|
| 430 |
-
|
| 431 |
-
<div class="about-section">
|
| 432 |
-
<div class="about-section-title">{{ t('settings.about.linksTitle') }}</div>
|
| 433 |
-
<div class="about-item">
|
| 434 |
-
<div class="about-item-label">{{ t('settings.about.repoApp') }}</div>
|
| 435 |
-
<a class="about-link" href="https://huggingface.co/MoYoYoTech/VoiceDialogue" target="_blank" rel="noopener">huggingface.co/MoYoYoTech/VoiceDialogue</a>
|
| 436 |
-
</div>
|
| 437 |
-
<div class="about-item">
|
| 438 |
-
<div class="about-item-label">{{ t('settings.about.repoVoices') }}</div>
|
| 439 |
-
<a class="about-link" href="https://huggingface.co/MoYoYoTech/tone-models" target="_blank" rel="noopener">huggingface.co/MoYoYoTech/tone-models</a>
|
| 440 |
-
</div>
|
| 441 |
-
</div>
|
| 442 |
-
|
| 443 |
-
<div class="about-copyright">{{ t('settings.about.copyright') }}</div>
|
| 444 |
-
</div>
|
| 445 |
-
</a-tab-pane>
|
| 446 |
-
</a-tabs>
|
| 447 |
-
</a-modal>
|
| 448 |
-
</template>
|
| 449 |
-
|
| 450 |
-
<style lang="scss" scoped>
|
| 451 |
-
// Fix the content-area height so the tab bar no longer jumps when switching tabs
|
| 452 |
-
.tab-body {
|
| 453 |
-
height: 360px;
|
| 454 |
-
overflow-y: auto;
|
| 455 |
-
padding: 4px 8px 4px 2px;
|
| 456 |
-
}
|
| 457 |
-
|
| 458 |
-
.setting-row {
|
| 459 |
-
margin-bottom: 20px;
|
| 460 |
-
|
| 461 |
-
// Apply only to field titles (direct-child label) so nested <label> elements such as radio-button are unaffected
|
| 462 |
-
> label {
|
| 463 |
-
display: block;
|
| 464 |
-
font-size: 15px;
|
| 465 |
-
font-weight: 500;
|
| 466 |
-
margin-bottom: 8px;
|
| 467 |
-
|
| 468 |
-
.label-icon {
|
| 469 |
-
margin-right: 6px;
|
| 470 |
-
color: #1890ff;
|
| 471 |
-
}
|
| 472 |
-
}
|
| 473 |
-
|
| 474 |
-
.hint {
|
| 475 |
-
font-size: 12px;
|
| 476 |
-
color: #999;
|
| 477 |
-
margin: 8px 0 0;
|
| 478 |
-
}
|
| 479 |
-
|
| 480 |
-
.row-inline {
|
| 481 |
-
display: flex;
|
| 482 |
-
align-items: center;
|
| 483 |
-
justify-content: space-between;
|
| 484 |
-
}
|
| 485 |
-
}
|
| 486 |
-
|
| 487 |
-
.voice-group {
|
| 488 |
-
display: flex;
|
| 489 |
-
flex-direction: column;
|
| 490 |
-
margin-top: 8px;
|
| 491 |
-
}
|
| 492 |
-
|
| 493 |
-
/* About page */
|
| 494 |
-
.about {
|
| 495 |
-
.about-head {
|
| 496 |
-
text-align: center;
|
| 497 |
-
margin-bottom: 24px;
|
| 498 |
-
|
| 499 |
-
.about-name {
|
| 500 |
-
font-size: 20px;
|
| 501 |
-
font-weight: 600;
|
| 502 |
-
}
|
| 503 |
-
.about-ver {
|
| 504 |
-
font-size: 13px;
|
| 505 |
-
color: #888;
|
| 506 |
-
margin-top: 2px;
|
| 507 |
-
}
|
| 508 |
-
.about-tagline {
|
| 509 |
-
font-size: 12px;
|
| 510 |
-
color: #999;
|
| 511 |
-
margin-top: 4px;
|
| 512 |
-
}
|
| 513 |
-
}
|
| 514 |
-
|
| 515 |
-
.about-section {
|
| 516 |
-
margin-bottom: 20px;
|
| 517 |
-
|
| 518 |
-
.about-section-title {
|
| 519 |
-
font-size: 13px;
|
| 520 |
-
font-weight: 600;
|
| 521 |
-
color: #666;
|
| 522 |
-
margin-bottom: 10px;
|
| 523 |
-
}
|
| 524 |
-
}
|
| 525 |
-
|
| 526 |
-
.about-item {
|
| 527 |
-
margin-bottom: 12px;
|
| 528 |
-
|
| 529 |
-
.about-item-label {
|
| 530 |
-
font-size: 14px;
|
| 531 |
-
font-weight: 500;
|
| 532 |
-
}
|
| 533 |
-
.about-item-desc {
|
| 534 |
-
font-size: 12px;
|
| 535 |
-
color: #777;
|
| 536 |
-
margin-top: 2px;
|
| 537 |
-
line-height: 1.6;
|
| 538 |
-
|
| 539 |
-
a { margin-left: 6px; }
|
| 540 |
-
}
|
| 541 |
-
}
|
| 542 |
-
|
| 543 |
-
a {
|
| 544 |
-
color: #1677ff;
|
| 545 |
-
text-decoration: none;
|
| 546 |
-
&:hover { text-decoration: underline; }
|
| 547 |
-
}
|
| 548 |
-
|
| 549 |
-
.about-link {
|
| 550 |
-
font-size: 13px;
|
| 551 |
-
word-break: break-all;
|
| 552 |
-
}
|
| 553 |
-
|
| 554 |
-
.about-copyright {
|
| 555 |
-
margin-top: 16px;
|
| 556 |
-
font-size: 11px;
|
| 557 |
-
color: #aaa;
|
| 558 |
-
text-align: center;
|
| 559 |
-
}
|
| 560 |
-
}
|
| 561 |
-
|
| 562 |
-
.voice-radio {
|
| 563 |
-
display: flex;
|
| 564 |
-
align-items: center;
|
| 565 |
-
height: 40px;
|
| 566 |
-
line-height: 40px;
|
| 567 |
-
|
| 568 |
-
.voice-name {
|
| 569 |
-
margin-right: 8px;
|
| 570 |
-
}
|
| 571 |
-
}
|
| 572 |
-
|
| 573 |
-
.audio-play-btn {
|
| 574 |
-
padding: 0 6px;
|
| 575 |
-
border-radius: 4px;
|
| 576 |
-
|
| 577 |
-
&.playing {
|
| 578 |
-
background-color: #f6ffed;
|
| 579 |
-
}
|
| 580 |
-
}
|
| 581 |
-
</style>
|
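The fallback rules used by the fetch helpers in the removed component can be pulled out as pure functions, which makes them easier to check in isolation. The following is a minimal sketch, not code from this commit: the function names `pickRecognitionLanguage`, `resolveDeviceIndex`, and `uniqueEngineKeys` and the `AudioDevice` shape are hypothetical, and only the selection rules themselves are taken from the diff above.

```typescript
// Hypothetical helpers; names and types are assumptions made for illustration.
interface AudioDevice {
  index: number;
  name: string;
}

// Keep the saved language if the backend still offers it,
// otherwise fall back to "zh", otherwise to the first offered language.
function pickRecognitionLanguage(
  saved: string | undefined,
  available: string[]
): string | undefined {
  if (saved && available.indexOf(saved) !== -1) return saved;
  return available.indexOf("zh") !== -1 ? "zh" : available[0];
}

// Keep the saved device index only if that device is still present,
// otherwise use the backend's current index (or null for the system default).
function resolveDeviceIndex(
  saved: number | null | undefined,
  devices: AudioDevice[],
  backendCurrent?: number | null
): number | null {
  const exists = saved != null && devices.some((d) => d.index === saved);
  return exists ? (saved as number) : (backendCurrent ?? null);
}

// Collapse a language-to-engine mapping into a sorted list of distinct engine keys.
function uniqueEngineKeys(mappings: Record<string, string>): string[] {
  const values = Object.keys(mappings).map((lang) => mappings[lang]);
  return values.filter((v, i) => values.indexOf(v) === i).sort();
}
```

Keeping these rules free of `fetch` and reactive state means each one can be exercised with plain values, for example `pickRecognitionLanguage("fr", ["zh", "en"])` falling back to `"zh"`.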
frontend/src/views/Welcome/index.vue
CHANGED
|
@@ -2,66 +2,303 @@
|
|
| 2 |
|
| 3 |
import router from "@/router.ts";
|
| 4 |
import { useSettingsStore } from "@/stores/config.ts";
|
| 5 |
-
import { ref,
|
| 6 |
import { Modal } from 'ant-design-vue';
|
| 7 |
-
import {
|
| 8 |
-
import { useI18n } from "vue-i18n";
|
| 9 |
import axios from "axios";
|
| 10 |
-
import
|
| 11 |
-
import setting from "@/assets/setting.png";
|
| 12 |
|
| 13 |
-
const
|
| 14 |
-
|
| 15 |
-
const settingsStore = useSettingsStore()
|
| 16 |
|
| 17 |
-
const settingsOpen = ref<boolean>(false);
|
| 18 |
-
const chatLoading = ref<boolean>(false);
|
| 19 |
|
| 20 |
-
// ASR engine currently in effect, shown to the left of the settings button
|
| 21 |
-
const asrEngineName = ref<string>("");
|
| 22 |
onMounted(async () => {
|
| 23 |
-
|
| 24 |
-
|
| 25 |
-
const data = await res.json();
|
| 26 |
-
if (data?.display_name) asrEngineName.value = data.display_name;
|
| 27 |
-
} catch (e) {
|
| 28 |
-
console.error("Error fetching ASR engine:", e);
|
| 29 |
-
}
|
| 30 |
});
|
| 31 |
|
| 32 |
const startAudioChat = async () => {
|
| 33 |
try {
|
| 34 |
chatLoading.value = true;
|
| 35 |
const response = await fetch(`${base_url}/system/start`, {
|
| 36 |
method: 'POST',
|
| 37 |
-
headers: {
|
| 38 |
body: JSON.stringify({
|
| 39 |
-
enable_echo_cancellation:
|
| 40 |
-
input_device_index: settingsStore.$state.inputDeviceIndex ?? null,
|
| 41 |
-
output_device_index: settingsStore.$state.outputDeviceIndex ?? null
|
| 42 |
})
|
| 43 |
});
|
| 44 |
if (!response.ok) {
|
| 45 |
throw new Error(`HTTP error! status: ${response.status}`);
|
| 46 |
}
|
| 47 |
-
await response.json();
|
| 48 |
return true;
|
| 49 |
} catch (error) {
|
| 50 |
-
console.error('Error starting
|
| 51 |
return false;
|
| 52 |
} finally {
|
| 53 |
chatLoading.value = false;
|
| 54 |
}
|
| 55 |
};
|
| 56 |
|
| 57 |
-
const
|
| 58 |
-
|
| 59 |
-
|
| 60 |
-
|
| 61 |
-
|
| 62 |
}
|
| 63 |
-
router.replace('/home');
|
| 64 |
};
|
| 65 |
</script>
|
| 66 |
|
| 67 |
<template>
|
|
@@ -69,67 +306,178 @@ const chatAction = async () => {
|
|
| 69 |
<div class="content">
|
| 70 |
<div class="inner-content">
|
| 71 |
<div class="text-box">
|
| 72 |
-
<div class="title">
|
| 73 |
-
|
| 74 |
</div>
|
| 75 |
<div class="btn-box">
|
| 76 |
<a-button @click="chatAction" block :loading="chatLoading" type="primary" size="large">
|
| 77 |
-
<span>
|
| 78 |
</a-button>
|
| 79 |
</div>
|
| 80 |
</div>
|
| 81 |
</div>
|
| 82 |
|
| 83 |
<div class="actions">
|
| 84 |
-
<
|
| 85 |
-
|
| 86 |
-
|
| 87 |
-
|
| 88 |
-
<a-button type="text" @click="settingsOpen = true" class="settings-btn"
|
| 89 |
-
:title="t('settings.entry')">
|
| 90 |
<template #icon>
|
| 91 |
<img :src="setting" width="28" height="28" alt="settings" />
|
| 92 |
</template>
|
| 93 |
</a-button>
|
| 94 |
</div>
|
| 95 |
|
| 96 |
-
<
|
| 97 |
</div>
|
| 98 |
</template>
|
| 99 |
|
| 100 |
<style lang="scss" scoped>
|
| 101 |
-
|
| 102 |
-
|
| 103 |
-
|
| 104 |
-
|
| 105 |
-
|
| 106 |
-
|
| 107 |
-
|
| 108 |
-
|
| 109 |
-
|
| 110 |
-
|
| 111 |
-
background: rgba(255, 255, 255, 0.5);
|
| 112 |
-
border: 1px solid rgba(255, 255, 255, 0.7);
|
| 113 |
-
backdrop-filter: blur(10px);
|
| 114 |
-
-webkit-backdrop-filter: blur(10px);
|
| 115 |
-
box-shadow: 0 4px 16px rgba(31, 38, 135, 0.12);
|
| 116 |
}
|
| 117 |
|
| 118 |
-
.
|
| 119 |
-
|
| 120 |
-
|
| 121 |
-
|
| 122 |
-
|
| 123 |
-
|
| 124 |
-
|
| 125 |
-
|
| 126 |
-
|
| 127 |
-
|
| 128 |
display: flex;
|
| 129 |
align-items: center;
|
| 130 |
-
justify-content: center;
|
| 131 |
}
|
| 132 |
|
| 133 |
.welcome-wrapper {
|
| 134 |
width: 100%;
|
| 135 |
height: 100%;
|
|
@@ -175,7 +523,6 @@ const chatAction = async () => {
|
|
| 175 |
margin-top: 10px;
|
| 176 |
}
|
| 177 |
}
|
| 178 |
-
|
| 179 |
.btn-box {
|
| 180 |
width: 224px;
|
| 181 |
height: 80px;
|
|
@@ -184,11 +531,10 @@ const chatAction = async () => {
|
|
| 184 |
}
|
| 185 |
|
| 186 |
.actions {
|
| 187 |
-
width: 100%;
|
| 188 |
-
height:
|
| 189 |
-
|
| 190 |
display: flex;
|
| 191 |
-
align-items: center;
|
| 192 |
justify-content: flex-end;
|
| 193 |
}
|
| 194 |
}
|
|
|
|
| 2 |
|
| 3 |
import router from "@/router.ts";
|
| 4 |
import { useSettingsStore } from "@/stores/config.ts";
|
| 5 |
+
import { onMounted, onUnmounted, ref, reactive, computed, watch, h } from "vue";
|
| 6 |
import { Modal } from 'ant-design-vue';
|
| 7 |
+
import { SoundTwoTone, SoundOutlined } from "@ant-design/icons-vue";
|
| 8 |
import axios from "axios";
|
| 9 |
+
import PromptText from "./Components/PromptText.vue";
|
| 10 |
|
| 11 |
+
const base_url = axios.defaults.baseURL
|
| 12 |
+
|
| 13 |
+
const settingsStore = useSettingsStore()
|
| 14 |
+
|
| 15 |
+
import setting from "@/assets/setting.png"
|
| 16 |
|
| 17 |
|
| 18 |
onMounted(async () => {
|
| 19 |
+
await fetchASRLanguages();
|
| 20 |
+
await fetchTTSRoles();
|
| 21 |
});
|
| 22 |
|
| 23 |
+
const chatAction = async () => {
|
| 24 |
+
const state = await startAudioChat();
|
| 25 |
+
if (!state) {
|
| 26 |
+
console.error('Failed to start audio chat system service');
|
| 27 |
+
|
| 28 |
+
Modal.error({
|
| 29 |
+
title: 'Error',
|
| 30 |
+
content: 'Failed to start audio chat system service',
|
| 31 |
+
});
|
| 32 |
+
return;
|
| 33 |
+
}
|
| 34 |
+
router.replace('/home')
|
| 35 |
+
}
|
| 36 |
+
const chatLoading = ref<boolean>(false);
|
| 37 |
+
|
| 38 |
const startAudioChat = async () => {
|
| 39 |
try {
|
| 40 |
chatLoading.value = true;
|
| 41 |
const response = await fetch(`${base_url}/system/start`, {
|
| 42 |
method: 'POST',
|
| 43 |
+
headers: {
|
| 44 |
+
'Content-Type': 'application/json',
|
| 45 |
+
},
|
| 46 |
body: JSON.stringify({
|
| 47 |
+
enable_echo_cancellation: echoCancel.value
|
| 48 |
})
|
| 49 |
});
|
| 50 |
if (!response.ok) {
|
| 51 |
throw new Error(`HTTP error! status: ${response.status}`);
|
| 52 |
}
|
| 53 |
+
const data = await response.json();
|
| 54 |
+
console.log('ASR Instance started successfully:', data);
|
| 55 |
return true;
|
| 56 |
} catch (error) {
|
| 57 |
+
console.error('Error starting ASR instance:', error);
|
| 58 |
return false;
|
| 59 |
} finally {
|
| 60 |
chatLoading.value = false;
|
| 61 |
}
|
| 62 |
+
}
|
| 63 |
+
|
| 64 |
+
|
| 65 |
+
const voiceModelOpen = ref<boolean>(false);
|
| 66 |
+
const modalLoading = ref<boolean>(false);
|
| 67 |
+
|
| 68 |
+
const handleVoiceModalCancel = () => {
|
| 69 |
+
voiceModelOpen.value = false;
|
| 70 |
+
role.value = settingsStore.$state.role;
|
| 71 |
+
language.value = settingsStore.$state.language;
|
| 72 |
};
|
| 73 |
|
| 74 |
+
const handleVoiceModalSubmit = async () => {
|
| 75 |
+
console.log('Selected Language:', language.value);
|
| 76 |
+
console.log('Selected Role:', role.value);
|
| 77 |
+
console.log('Echo Cancel:', echoCancel.value);
|
| 78 |
+
settingsStore.$state.language = language.value;
|
| 79 |
+
settingsStore.$state.role = role.value || '';
|
| 80 |
+
settingsStore.$state.echoCancel = echoCancel.value;
|
| 81 |
+
|
| 82 |
+
await pushConfig(settingsStore.$state.role);
|
| 83 |
+
};
|
| 84 |
+
|
| 85 |
+
const pushConfig = async (model_id: string) => {
|
| 86 |
+
try {
|
| 87 |
+
modalLoading.value = true;
|
| 88 |
+
const response = await fetch(`${base_url}/tts/models/load`, {
|
| 89 |
+
method: 'POST',
|
| 90 |
+
headers: {
|
| 91 |
+
'Content-Type': 'application/json',
|
| 92 |
+
},
|
| 93 |
+
body: JSON.stringify({
|
| 94 |
+
"model_id": model_id,
|
| 95 |
+
})
|
| 96 |
+
});
|
| 97 |
+
if (!response.ok) {
|
| 98 |
+
throw new Error(`HTTP error! status: ${response.status}`);
|
| 99 |
+
}
|
| 100 |
+
const data = await response.json();
|
| 101 |
+
console.log('Config pushed successfully:', data);
|
| 102 |
+
|
| 103 |
+
const response2 = await fetch(`${base_url}/asr/instance/create`, {
|
| 104 |
+
method: 'POST',
|
| 105 |
+
headers: {
|
| 106 |
+
'Content-Type': 'application/json',
|
| 107 |
+
},
|
| 108 |
+
body: JSON.stringify({
|
| 109 |
+
"language": language.value,
|
| 110 |
+
})
|
| 111 |
+
});
|
| 112 |
+
if (!response2.ok) {
|
| 113 |
+
throw new Error(`HTTP error! status: ${response2.status}`);
|
| 114 |
+
}
|
| 115 |
+
const data2 = await response2.json();
|
| 116 |
+
console.log('ASR Language set successfully:', data2);
|
| 117 |
+
|
| 118 |
+
} catch (err) {
|
| 119 |
+
console.error('Error pushing config:', err);
|
| 120 |
+
Modal.error({
|
| 121 |
+
title: 'Error',
|
| 122 |
+
content: "Error config: " + JSON.stringify(err),
|
| 123 |
+
});
|
| 124 |
+
} finally {
|
| 125 |
+
modalLoading.value = false;
|
| 126 |
+
voiceModelOpen.value = false;
|
| 127 |
+
}
|
| 128 |
+
|
| 129 |
+
console.log('Selected Language:', language.value);
|
| 130 |
+
console.log('Selected Role:', role.value);
|
| 131 |
+
}
|
| 132 |
+
|
| 133 |
+
|
| 134 |
+
const language = ref<string>(settingsStore.$state.language || 'zh');
|
| 135 |
+
const languages = reactive([]);
|
| 136 |
+
const languageOptions = {
|
| 137 |
+
'zh': 'Chinese',
|
| 138 |
+
'en': 'English',
|
| 139 |
+
'auto': 'Auto',
|
| 140 |
+
};
|
| 141 |
+
const role = ref<string>(settingsStore.$state.role || '');
|
| 142 |
+
const roles = reactive([])
|
| 143 |
+
const echoCancel = ref<boolean>(settingsStore.$state.echoCancel ?? true);
|
| 144 |
+
|
| 145 |
+
const radioStyle = reactive({
|
| 146 |
+
display: 'flex',
|
| 147 |
+
height: '40px',
|
| 148 |
+
lineHeight: '40px',
|
| 149 |
+
fontSize: '16px',
|
| 150 |
+
marginBottom: '8px',
|
| 151 |
+
});
|
| 152 |
+
|
| 153 |
+
const filteredRoles = computed(() => {
|
| 154 |
+
const is_chinese = language.value == 'zh';
|
| 155 |
+
return roles.filter(ro => ro['is_chinese_voice'] == is_chinese);
|
| 156 |
+
});
|
| 157 |
+
|
| 158 |
+
watch(
|
| 159 |
+
() => language.value,
|
| 160 |
+
(newLang) => {
|
| 161 |
+
// After a language switch, keep the saved role if it is still available; otherwise select the first available role
|
| 162 |
+
if (filteredRoles.value.length > 0) {
|
| 163 |
+
const current_role_id = settingsStore.$state.role;
|
| 164 |
+
const current_role = filteredRoles.value.find(ro => ro['id'] == current_role_id);
|
| 165 |
+
if (current_role) {
|
| 166 |
+
role.value = current_role_id;
|
| 167 |
+
} else {
|
| 168 |
+
role.value = filteredRoles.value[0]['id'];
|
| 169 |
+
}
|
| 170 |
+
} else {
|
| 171 |
+
role.value = "";
|
| 172 |
+
}
|
| 173 |
+
}
|
| 174 |
+
);
|
| 175 |
+
|
| 176 |
+
|
| 177 |
+
const fetchTTSRoles = async () => {
|
| 178 |
+
try {
|
| 179 |
+
const response = await fetch(`${base_url}/tts/models`);
|
| 180 |
+
const data = await response.json()
|
| 181 |
+
if (data && data.models) {
|
| 182 |
+
// @ts-ignore
|
| 183 |
+
roles.splice(0, roles.length, ...data.models)
|
| 184 |
+
console.log('Fetched TTS Roles:', roles);
|
| 185 |
+
|
| 186 |
+
if (data.current_model_id) {
|
| 187 |
+
role.value = data.current_model_id;
|
| 188 |
+
}
|
| 189 |
+
}
|
| 190 |
+
} catch (error) {
|
| 191 |
+
console.error('Error fetching TTS roles:', error);
|
| 192 |
+
}
|
| 193 |
+
};
|
| 194 |
+
|
| 195 |
+
const fetchASRLanguages = async () => {
|
| 196 |
+
try {
|
| 197 |
+
const response = await fetch(`${base_url}/asr/languages`);
|
| 198 |
+
const data = await response.json();
|
| 199 |
+
if (data && data.languages) {
|
| 200 |
+
// @ts-ignore
|
| 201 |
+
languages.splice(0, languages.length, ...data.languages);
|
| 202 |
+
console.log('Fetched ASR Languages:', data.languages);
|
| 203 |
+
|
| 204 |
+
if (data.current_asr_language) {
|
| 205 |
+
language.value = data.current_asr_language;
|
| 206 |
+
}
|
| 207 |
+
}
|
| 208 |
+
} catch (error) {
|
| 209 |
+
console.error('Error fetching ASR languages:', error);
|
| 210 |
+
}
|
| 211 |
+
};
|
| 212 |
+
|
| 213 |
+
const togglePopover = (item: string) => {
|
| 214 |
+
popoverVisible.value = !popoverVisible.value;
|
| 215 |
+
if (item == 'voice') {
|
| 216 |
+
voiceModelOpen.value = true;
|
| 217 |
+
} else if (item == 'prompt') {
|
| 218 |
+
promptModelOpen.value = true;
|
| 219 |
+
}
|
| 220 |
+
};
|
| 221 |
+
|
| 222 |
+
const popoverVisible = ref<boolean>(false);
|
| 223 |
+
const promptModelOpen = ref<boolean>(false);
|
| 224 |
+
|
| 225 |
+
// Audio playback state
|
| 226 |
+
const currentPlayingId = ref<string | null>(null);
|
| 227 |
+
const currentAudio = ref<HTMLAudioElement | null>(null);
|
| 228 |
+
|
| 229 |
+
// Revised audio playback logic
|
| 230 |
+
const playRefAudio = async (id: string, e: Event) => {
|
| 231 |
+
console.log('Playing reference audio for role:', id);
|
| 232 |
+
|
| 233 |
+
e.stopPropagation();
|
| 234 |
+
e.preventDefault();
|
| 235 |
+
|
| 236 |
+
try {
|
| 237 |
+
// If the clicked audio is the one currently playing, stop it
|
| 238 |
+
if (currentPlayingId.value === id && currentAudio.value) {
|
| 239 |
+
currentAudio.value.pause();
|
| 240 |
+
currentAudio.value = null;
|
| 241 |
+
currentPlayingId.value = null;
|
| 242 |
+
console.log('Audio stopped');
|
| 243 |
+
return;
|
| 244 |
+
}
|
| 245 |
+
|
| 246 |
+
// If another audio clip is playing, stop it first
|
| 247 |
+
if (currentAudio.value) {
|
| 248 |
+
currentAudio.value.pause();
|
| 249 |
+
currentAudio.value = null;
|
| 250 |
+
}
|
| 251 |
+
|
| 252 |
+
// Create a new audio instance
|
| 253 |
+
const audio = new Audio(`${base_url}/tts/models/${id}/reference-audio`);
|
| 254 |
+
|
| 255 |
+
// Register audio event listeners
|
| 256 |
+
audio.addEventListener('ended', () => {
|
| 257 |
+
currentPlayingId.value = null;
|
| 258 |
+
currentAudio.value = null;
|
| 259 |
+
});
|
| 260 |
+
|
| 261 |
+
audio.addEventListener('error', (error) => {
|
| 262 |
+
console.error('Audio playback error:', error);
|
| 263 |
+
currentPlayingId.value = null;
|
| 264 |
+
currentAudio.value = null;
|
| 265 |
+
Modal.error({
|
| 266 |
+
title: 'Error',
|
| 267 |
+
content: 'Failed to play reference audio',
|
| 268 |
+
});
|
| 269 |
+
});
|
| 270 |
+
|
| 271 |
+
// Start playback
|
| 272 |
+
await audio.play();
|
| 273 |
+
currentPlayingId.value = id;
|
| 274 |
+
currentAudio.value = audio;
|
| 275 |
+
console.log('Audio played successfully');
|
| 276 |
+
|
| 277 |
+
} catch (error) {
|
| 278 |
+
console.error('Error playing audio:', error);
|
| 279 |
+
currentPlayingId.value = null;
|
| 280 |
+
currentAudio.value = null;
|
| 281 |
+
Modal.error({
|
| 282 |
+
title: 'Error',
|
| 283 |
+
content: 'Failed to play reference audio',
|
| 284 |
+
});
|
| 285 |
}
|
|
|
|
| 286 |
};
|
| 287 |
+
|
| 288 |
+
// Clean up audio when the component unmounts
|
| 289 |
+
onUnmounted(() => {
|
| 290 |
+
if (currentAudio.value) {
|
| 291 |
+
currentAudio.value.pause();
|
| 292 |
+
currentAudio.value = null;
|
| 293 |
+
}
|
| 294 |
+
currentPlayingId.value = null;
|
| 295 |
+
});
|
| 296 |
+
|
| 297 |
+
// Helper: check whether the given role's audio is currently playing
|
| 298 |
+
const isPlaying = (id: string) => {
|
| 299 |
+
return currentPlayingId.value === id;
|
| 300 |
+
};
|
| 301 |
+
|
| 302 |
</script>
|
| 303 |
|
| 304 |
<template>
|
|
|
|
| 306 |
<div class="content">
|
| 307 |
<div class="inner-content">
|
| 308 |
<div class="text-box">
|
| 309 |
+
<div class="title">
|
| 310 |
+
欢迎使用
|
| 311 |
+
</div>
|
| 312 |
+
<div class="sub-title">
|
| 313 |
+
点击下方按钮开始对话
|
| 314 |
+
</div>
|
| 315 |
</div>
|
| 316 |
<div class="btn-box">
|
| 317 |
<a-button @click="chatAction" block :loading="chatLoading" type="primary" size="large">
|
| 318 |
+
<span>开始对话</span>
|
| 319 |
</a-button>
|
| 320 |
</div>
|
| 321 |
</div>
|
| 322 |
</div>
|
| 323 |
|
| 324 |
<div class="actions">
|
| 325 |
+
<!-- <a-button type="text" @click="toggleSider">sider</a-button> -->
|
| 326 |
+
|
| 327 |
+
<a-button v-if="false" type="text" @click="voiceModelOpen = true"
|
| 328 |
+
style="width:44px; height: 44px; margin-right:24px;margin-bottom: 24px;">
|
|
|
|
|
|
|
| 329 |
<template #icon>
|
| 330 |
<img :src="setting" width="28" height="28" alt="settings" />
|
| 331 |
</template>
|
| 332 |
</a-button>
|
| 333 |
+
<a-popover v-if="true" v-model:open="popoverVisible" trigger="click" ok-text="Yes" cancel-text="No" placement="bottomRight">
|
| 334 |
+
<template #content>
|
| 335 |
+
<div class="custom-popover-list">
|
| 336 |
+
<div class="custom-popover-item" @click="togglePopover('voice')">
|
| 337 |
+
选择音色</div>
|
| 338 |
+
<div class="custom-popover-item" @click="togglePopover('prompt')">Prompt调试</div>
|
| 339 |
+
</div>
|
| 340 |
+
</template>
|
| 341 |
+
<img :src="setting" alt="item actions" style="width: 28px; height: 28px; margin-right:24px;margin-top: 16px;">
|
| 342 |
+
</a-popover>
|
| 343 |
</div>
|
| 344 |
|
| 345 |
+
<a-modal v-model:open="voiceModelOpen" :title="null" :mask-closable="false" :closable="false" centered>
|
| 346 |
+
<template #footer>
|
| 347 |
+
<a-button key="back" @click="handleVoiceModalCancel">Cancel</a-button>
|
| 348 |
+
<a-button key="submit" type="primary" :loading="modalLoading" @click="handleVoiceModalSubmit">Submit</a-button>
|
| 349 |
+
</template>
|
| 350 |
+
<div class="languages">
|
| 351 |
+
<div class="echo-cancel-item">
|
| 352 |
+
<div style="display: flex; justify-content: space-between; align-items: center;">
|
| 353 |
+
<p style="margin: 0;">Enable Echo Cancellation:</p>
|
| 354 |
+
<a-switch v-model:checked="echoCancel" />
|
| 355 |
+
</div>
|
| 356 |
+
</div>
|
| 357 |
+
</div>
|
| 358 |
+
<div class="languages">
|
| 359 |
+
<div class="language-item">
|
| 360 |
+
<p>Select Language:</p>
|
| 361 |
+
<a-select v-model:value="language" style="width: 100%;">
|
| 362 |
+
<a-select-option v-for="lan in languages" :value="lan" :key="lan">
|
| 363 |
+
{{ languageOptions[lan] }}
|
| 364 |
+
</a-select-option>
|
| 365 |
+
</a-select>
|
| 366 |
+
</div>
|
| 367 |
+
</div>
|
| 368 |
+
<div class="languages">
|
| 369 |
+
<div class="role-item">
|
| 370 |
+
<p>Select voice Role:</p>
|
| 371 |
+
<a-radio-group size="large" v-model:value="role">
|
| 372 |
+
<a-radio v-for="r in filteredRoles" :style="radioStyle" :value="r['id']" :key="r['id']">
|
| 373 |
+
<div style="display: flex; justify-content: space-between; align-items: center; width:450px;">
|
| 374 |
+
{{ r['character_name'] }}
|
| 375 |
+
<a-button
|
| 376 |
+
:key="r['id']"
|
| 377 |
+
type="text"
|
| 378 |
+
@click="playRefAudio(r['id'], $event)"
|
| 379 |
+
class="audio-play-btn"
|
| 380 |
+
:class="{ 'playing': isPlaying(r['id']) }"
|
| 381 |
+
>
|
| 382 |
+
<SoundTwoTone
|
| 383 |
+
v-if="isPlaying(r['id'])"
|
| 384 |
+
style="font-size: 18px; color: #52c41a;"
|
| 385 |
+
class="playing-icon"
|
| 386 |
+
/>
|
| 387 |
+
<SoundOutlined
|
| 388 |
+
v-else
|
| 389 |
+
style="font-size: 18px; color: #1890ff;"
|
| 390 |
+
/>
|
| 391 |
+
</a-button>
|
| 392 |
+
</div>
|
| 393 |
+
|
| 394 |
+
</a-radio>
|
| 395 |
+
</a-radio-group>
|
| 396 |
+
|
| 397 |
+
</div>
|
| 398 |
+
</div>
|
| 399 |
+
</a-modal>
|
| 400 |
+
|
| 401 |
+
<PromptText v-model:open="promptModelOpen" />
|
| 402 |
</div>
|
| 403 |
</template>
|
| 404 |
|
| 405 |
<style lang="scss" scoped>
|
| 406 |
+
|
| 407 |
+
.languages {
|
| 408 |
+
margin-top: 24px;
|
| 409 |
+
margin-bottom: 24px;
|
| 410 |
+
|
| 411 |
+
p {
|
| 412 |
+
font-size: 16px;
|
| 413 |
+
font-weight: 500;
|
| 414 |
+
margin-bottom: 8px;
|
| 415 |
+
}
|
| 416 |
}
|
| 417 |
|
| 418 |
+
.audio-play-btn {
|
| 419 |
+
padding: 0px 8px;
|
| 420 |
+
padding-top:2px;
|
| 421 |
+
border-radius: 4px;
|
| 422 |
+
transition: all 0.2s;
|
| 423 |
+
height: 40px;
|
| 424 |
+
|
| 425 |
+
&:hover {
|
| 426 |
+
background-color: #f0f0f0;
|
| 427 |
+
}
|
| 428 |
+
|
| 429 |
+
&.playing {
|
| 430 |
+
background-color: #f6ffed;
|
| 431 |
+
border-color: #1890ff;
|
| 432 |
+
|
| 433 |
+
.playing-icon {
|
| 434 |
+
animation: pulse 1.5s infinite;
|
| 435 |
+
}
|
| 436 |
+
}
|
| 437 |
+
}
|
| 438 |
+
|
| 439 |
+
@keyframes pulse {
|
| 440 |
+
0% {
|
| 441 |
+
opacity: 1;
|
| 442 |
+
transform: scale(1);
|
| 443 |
+
}
|
| 444 |
+
50% {
|
| 445 |
+
opacity: 0.7;
|
| 446 |
+
transform: scale(1.1);
|
| 447 |
+
}
|
| 448 |
+
100% {
|
| 449 |
+
opacity: 1;
|
| 450 |
+
transform: scale(1);
|
| 451 |
+
}
|
| 452 |
+
}
|
| 453 |
+
|
| 454 |
+
.btn-groups {
|
| 455 |
+
margin-top: 36px;
|
| 456 |
display: flex;
|
| 457 |
+
justify-content: space-between;
|
| 458 |
align-items: center;
|
|
|
|
| 459 |
}
|
| 460 |
|
| 461 |
+
.custom-popover-list {
|
| 462 |
+
width: 92px;
|
| 463 |
+
margin: 0;
|
| 464 |
+
.custom-popover-item {
|
| 465 |
+
font-size: 14px;
|
| 466 |
+
line-height: 36px;
|
| 467 |
+
font-weight: 500;
|
| 468 |
+
color: #1e1e1e;
|
| 469 |
+
cursor: pointer;
|
| 470 |
+
border-radius: 4px;
|
| 471 |
+
padding: 0 8px;
|
| 472 |
+
margin: 0px -8px;
|
| 473 |
+
transition: background 0.2s;
|
| 474 |
+
}
|
| 475 |
+
.custom-popover-item:hover, .custom-popover-item:focus {
|
| 476 |
+
background: #e5e7eb;
|
| 477 |
+
}
|
| 478 |
+
}
|
| 479 |
+
|
| 480 |
+
|
| 481 |
.welcome-wrapper {
|
| 482 |
width: 100%;
|
| 483 |
height: 100%;
|
|
|
|
| 523 |
margin-top: 10px;
|
| 524 |
}
|
| 525 |
}
|
|
|
|
| 526 |
.btn-box {
|
| 527 |
width: 224px;
|
| 528 |
height: 80px;
|
|
|
|
| 531 |
}
|
| 532 |
|
| 533 |
.actions {
|
| 534 |
+
width: 100%;
|
| 535 |
+
height: 64px;
|
| 536 |
+
|
| 537 |
display: flex;
|
|
|
|
| 538 |
justify-content: flex-end;
|
| 539 |
}
|
| 540 |
}
|
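Reviewer note: the language/role interaction in the component above (the `filteredRoles` computed plus the `watch` on `language`) can be read as two pure functions. The following is a minimal sketch under assumptions, not the component's actual code: the `TtsRole` interface and the names `filterRoles` and `pickRole` are hypothetical, and only the fields the template reads (`id`, `character_name`, `is_chinese_voice`) are modelled.

```typescript
// Hypothetical shape of one entry returned by GET /tts/models.
// Only the fields the component reads are listed; the types are assumptions.
interface TtsRole {
  id: string;
  character_name: string;
  is_chinese_voice: boolean;
}

// Keep only the voices that match the selected ASR language.
// As in the component, every language other than 'zh' maps to non-Chinese voices.
function filterRoles(roles: TtsRole[], language: string): TtsRole[] {
  const isChinese = language === "zh";
  return roles.filter((role) => role.is_chinese_voice === isChinese);
}

// Prefer the saved role when it is still valid for the language,
// otherwise fall back to the first candidate, otherwise to "".
function pickRole(roles: TtsRole[], language: string, savedRoleId: string): string {
  const candidates = filterRoles(roles, language);
  if (candidates.length === 0) {
    return "";
  }
  const saved = candidates.find((role) => role.id === savedRoleId);
  return saved ? saved.id : candidates[0].id;
}
```

Keeping the selection rule in a pure function like this would let it be unit-tested without mounting the component; whether that refactor is worthwhile is a judgment call for the maintainers.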
main.py
CHANGED
|
@@ -63,19 +63,6 @@ def main():
|
|
| 63 |
parser = create_argument_parser()
|
| 64 |
args = parser.parse_args()
|
| 65 |
|
| 66 |
-
# List audio input devices, then exit
|
| 67 |
-
if getattr(args, 'list_audio_devices', False):
|
| 68 |
-
from voice_dialogue.audio.devices import list_input_devices
|
| 69 |
-
devices = list_input_devices()
|
| 70 |
-
print(f"\n可用音频输入设备 ({len(devices)}):")
|
| 71 |
-
print(f"{'索引':>4} {'通道':>4} {'采样率':>7} {'默认':>4} 名称")
|
| 72 |
-
for d in devices:
|
| 73 |
-
default_mark = '✓' if d['is_default'] else ''
|
| 74 |
-
print(f"{d['index']:>4} {d['max_input_channels']:>4} "
|
| 75 |
-
f"{d['default_sample_rate']:>7} {default_mark:>4} {d['name']}")
|
| 76 |
-
print("\n使用 --input-device <索引> 选择设备。")
|
| 77 |
-
sys.exit(0)
|
| 78 |
-
|
| 79 |
set_debug_mode(args.debug)
|
| 80 |
|
| 81 |
print(f"""
|
|
@@ -91,10 +78,8 @@ VoiceDialogue - 语音对话系统
|
|
| 91 |
if args.mode == 'cli':
|
| 92 |
print(f"语言设置: {args.language}")
|
| 93 |
print(f"说话人: {args.speaker}")
|
| 94 |
-
if args.input_device is not None:
|
| 95 |
-
print(f"输入设备索引: {args.input_device}")
|
| 96 |
print("正在启动命令行语音对话系统...")
|
| 97 |
-
launch_system(args.language, args.speaker, args.disable_echo_cancellation
|
| 98 |
|
| 99 |
elif args.mode == 'api':
|
| 100 |
launch_api_server(
|
|
|
|
| 63 |
parser = create_argument_parser()
|
| 64 |
args = parser.parse_args()
|
| 65 |
|
| 66 |
set_debug_mode(args.debug)
|
| 67 |
|
| 68 |
print(f"""
|
|
|
|
| 78 |
if args.mode == 'cli':
|
| 79 |
print(f"语言设置: {args.language}")
|
| 80 |
print(f"说话人: {args.speaker}")
|
|
|
|
|
|
|
| 81 |
print("正在启动命令行语音对话系统...")
|
| 82 |
+
launch_system(args.language, args.speaker, args.disable_echo_cancellation)
|
| 83 |
|
| 84 |
elif args.mode == 'api':
|
| 85 |
launch_api_server(
|
pyproject.toml
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
[project]
|
| 2 |
name = "voice_dialogue"
|
| 3 |
-
version = "1.
|
| 4 |
description = "一个基于AI的智能语音对话系统,支持实时语音识别、自然语言处理和语音合成"
|
| 5 |
readme = "README.md"
|
| 6 |
requires-python = ">=3.11"
|
|
@@ -8,11 +8,11 @@ dependencies = [
|
|
| 8 |
"cn2an>=0.5.23",
|
| 9 |
"einops>=0.8.1",
|
| 10 |
"en-core-web-sm",
|
| 11 |
-
"fastapi==0.
|
| 12 |
"ffmpeg-python>=0.2.0",
|
| 13 |
"funasr-onnx==0.4.1",
|
| 14 |
"g2p-en>=2.1.0",
|
| 15 |
-
"huggingface-hub==0.
|
| 16 |
"jieba>=0.42.1",
|
| 17 |
"jieba-fast>=0.53",
|
| 18 |
"langchain==0.2.17",
|
|
@@ -29,11 +29,10 @@ dependencies = [
|
|
| 29 |
"pypinyin>=0.54.0",
|
| 30 |
"pytorch-lightning==2.3.1",
|
| 31 |
"pywhispercpp",
|
| 32 |
-
"qwen-asr>=0.0.6",
|
| 33 |
"silero-vad==5.1.2",
|
| 34 |
"soundfile==0.13.1",
|
| 35 |
"torch==2.3.1",
|
| 36 |
-
"transformers==4.
|
| 37 |
"uvicorn==0.34.3",
|
| 38 |
"websockets>=15.0.1",
|
| 39 |
"wordsegment>=1.3.1",
|
|
|
|
| 1 |
[project]
|
| 2 |
name = "voice_dialogue"
|
| 3 |
+
version = "1.0.0"
|
| 4 |
description = "一个基于AI的智能语音对话系统,支持实时语音识别、自然语言处理和语音合成"
|
| 5 |
readme = "README.md"
|
| 6 |
requires-python = ">=3.11"
|
|
|
|
| 8 |
"cn2an>=0.5.23",
|
| 9 |
"einops>=0.8.1",
|
| 10 |
"en-core-web-sm",
|
| 11 |
+
"fastapi==0.115.12",
|
| 12 |
"ffmpeg-python>=0.2.0",
|
| 13 |
"funasr-onnx==0.4.1",
|
| 14 |
"g2p-en>=2.1.0",
|
| 15 |
+
"huggingface-hub==0.32.4",
|
| 16 |
"jieba>=0.42.1",
|
| 17 |
"jieba-fast>=0.53",
|
| 18 |
"langchain==0.2.17",
|
|
|
|
| 29 |
"pypinyin>=0.54.0",
|
| 30 |
"pytorch-lightning==2.3.1",
|
| 31 |
"pywhispercpp",
|
|
|
|
| 32 |
"silero-vad==5.1.2",
|
| 33 |
"soundfile==0.13.1",
|
| 34 |
"torch==2.3.1",
|
| 35 |
+
"transformers==4.41.2",
|
| 36 |
"uvicorn==0.34.3",
|
| 37 |
"websockets>=15.0.1",
|
| 38 |
"wordsegment>=1.3.1",
|
scripts/convert_tts_weights_to_safetensors.py
DELETED
|
@@ -1,47 +0,0 @@
|
|
| 1 |
-
"""Convert TTS pretrained weights (.bin) to safetensors.
|
| 2 |
-
|
| 3 |
-
The qwen-asr branch upgrades transformers to 4.57+, whose security policy (CVE-2025-32434)
|
| 4 |
-
refuses to load pytorch_model.bin on torch < 2.6. When loading, transformers prefers
|
| 5 |
-
model.safetensors, so converting once locally is enough and torch does not need upgrading.
|
| 6 |
-
|
| 7 |
-
Usage: python scripts/convert_tts_weights_to_safetensors.py
|
| 8 |
-
"""
|
| 9 |
-
from pathlib import Path
|
| 10 |
-
|
| 11 |
-
import torch
|
| 12 |
-
from safetensors.torch import save_file
|
| 13 |
-
|
| 14 |
-
MOYOYO_PRETRAINED_PATH = Path(__file__).parent.parent / "assets" / "models" / "tts" / "moyoyo"
|
| 15 |
-
|
| 16 |
-
PRETRAINED_DIRS = [
|
| 17 |
-
"chinese-roberta-wwm-ext-large",
|
| 18 |
-
"chinese-hubert-base",
|
| 19 |
-
]
|
| 20 |
-
|
| 21 |
-
|
| 22 |
-
def main():
|
| 23 |
-
for dirname in PRETRAINED_DIRS:
|
| 24 |
-
model_dir = MOYOYO_PRETRAINED_PATH / dirname
|
| 25 |
-
bin_path = model_dir / "pytorch_model.bin"
|
| 26 |
-
st_path = model_dir / "model.safetensors"
|
| 27 |
-
|
| 28 |
-
if st_path.exists():
|
| 29 |
-
print(f"已存在,跳过: {st_path}")
|
| 30 |
-
continue
|
| 31 |
-
if not bin_path.exists():
|
| 32 |
-
print(f"找不到权重文件: {bin_path}")
|
| 33 |
-
continue
|
| 34 |
-
|
| 35 |
-
state_dict = torch.load(bin_path, map_location="cpu", weights_only=True)
|
| 36 |
-
# clone() breaks shared memory; safetensors does not allow tensors to share storage
|
| 37 |
-
state_dict = {
|
| 38 |
-
key: value.clone().contiguous()
|
| 39 |
-
for key, value in state_dict.items()
|
| 40 |
-
if isinstance(value, torch.Tensor)
|
| 41 |
-
}
|
| 42 |
-
save_file(state_dict, st_path, metadata={"format": "pt"})
|
| 43 |
-
print(f"{dirname}: {len(state_dict)} tensors -> {st_path.stat().st_size // 1024 ** 2} MB")
|
| 44 |
-
|
| 45 |
-
|
| 46 |
-
if __name__ == "__main__":
|
| 47 |
-
main()
|
src/voice_dialogue/api/app.py
CHANGED
|
@@ -59,8 +59,7 @@ def _register_routes(app: FastAPI):
|
|
| 59 |
v1_router.include_router(settings_routes.router, prefix="/settings", tags=["设置管理"])
|
| 60 |
app.include_router(v1_router)
|
| 61 |
|
| 62 |
-
|
| 63 |
-
app.include_router(websocket_routes.ws)
|
| 64 |
|
| 65 |
# Root path and health check
|
| 66 |
_register_health_routes(app)
|
|
|
|
| 59 |
v1_router.include_router(settings_routes.router, prefix="/settings", tags=["设置管理"])
|
| 60 |
app.include_router(v1_router)
|
| 61 |
|
| 62 |
+
app.add_websocket_route("/api/v1/ws", websocket_routes.ws)
|
|
|
|
| 63 |
|
| 64 |
# Root path and health check
|
| 65 |
_register_health_routes(app)
|
src/voice_dialogue/api/core/lifespan.py
CHANGED
|
@@ -24,8 +24,8 @@ class LifespanManager:
|
|
| 24 |
startup_start_time = time.time()
|
| 25 |
|
| 26 |
try:
|
| 27 |
-
# Initialize the system language
|
| 28 |
-
system_language =
|
| 29 |
logger.info(f"系统默认语言: {system_language}")
|
| 30 |
|
| 31 |
# Initialize the TTS configuration
|
|
|
|
| 24 |
startup_start_time = time.time()
|
| 25 |
|
| 26 |
try:
|
| 27 |
+
# Initialize the system language
|
| 28 |
+
system_language = get_system_language()
|
| 29 |
logger.info(f"系统默认语言: {system_language}")
|
| 30 |
|
| 31 |
# Initialize the TTS configuration
|
src/voice_dialogue/api/core/service_factories.py
CHANGED
|
@@ -12,15 +12,11 @@ class ServiceFactories:
|
|
| 12 |
"""Service factory class that encapsulates the creation logic for all services."""
|
| 13 |
|
| 14 |
@staticmethod
|
| 15 |
-
def create_audio_capture(
|
| 16 |
-
enable_echo_cancellation: bool = True,
|
| 17 |
-
input_device_index: int = None,
|
| 18 |
-
) -> AudioCapture:
|
| 19 |
"""Create the audio capture service."""
|
| 20 |
return AudioCapture(
|
| 21 |
audio_frames_queue=audio_frames_queue,
|
| 22 |
-
enable_echo_cancellation=enable_echo_cancellation
|
| 23 |
-
input_device_index=input_device_index,
|
| 24 |
)
|
| 25 |
|
| 26 |
@staticmethod
|
|
@@ -134,14 +130,11 @@ def get_core_voice_service_definitions(system_language: str, tts_config: BaseTTS
|
|
| 134 |
]
|
| 135 |
|
| 136 |
|
| 137 |
-
def get_audio_capture_service_definition(
|
| 138 |
-
enable_echo_cancellation: bool = True,
|
| 139 |
-
input_device_index: int = None,
|
| 140 |
-
) -> ServiceDefinition:
|
| 141 |
"""Get the audio capture service definition."""
|
| 142 |
return ServiceDefinition(
|
| 143 |
name="audio_capture",
|
| 144 |
-
factory=lambda: ServiceFactories.create_audio_capture(enable_echo_cancellation
|
| 145 |
dependencies=[],
|
| 146 |
health_check=lambda service: hasattr(service, 'is_ready') and service.is_ready
|
| 147 |
)
|
|
|
|
| 12 |
"""Service factory class that encapsulates the creation logic for all services."""
|
| 13 |
|
| 14 |
@staticmethod
|
| 15 |
+
def create_audio_capture(enable_echo_cancellation: bool = True) -> AudioCapture:
|
|
|
|
|
|
|
|
|
|
| 16 |
"""Create the audio capture service."""
|
| 17 |
return AudioCapture(
|
| 18 |
audio_frames_queue=audio_frames_queue,
|
| 19 |
+
enable_echo_cancellation=enable_echo_cancellation
|
|
|
|
| 20 |
)
|
| 21 |
|
| 22 |
@staticmethod
|
|
|
|
| 130 |
]
|
| 131 |
|
| 132 |
|
| 133 |
+
def get_audio_capture_service_definition(enable_echo_cancellation: bool = True) -> ServiceDefinition:
|
|
|
|
|
|
|
|
|
|
| 134 |
"""Get the audio capture service definition."""
|
| 135 |
return ServiceDefinition(
|
| 136 |
name="audio_capture",
|
| 137 |
+
factory=lambda: ServiceFactories.create_audio_capture(enable_echo_cancellation),
|
| 138 |
dependencies=[],
|
| 139 |
health_check=lambda service: hasattr(service, 'is_ready') and service.is_ready
|
| 140 |
)
|
src/voice_dialogue/api/routes/system_routes.py
CHANGED
|
@@ -3,33 +3,15 @@ import time
|
|
| 3 |
|
| 4 |
from fastapi import APIRouter, HTTPException, BackgroundTasks, Request
|
| 5 |
|
| 6 |
-
from voice_dialogue.audio.capture import resolves_to_native_aec
|
| 7 |
-
from voice_dialogue.audio.devices import (
|
| 8 |
-
list_input_devices, get_default_input_device_index, is_valid_input_device,
|
| 9 |
-
list_output_devices, get_default_output_device_index, is_valid_output_device,
|
| 10 |
-
)
|
| 11 |
-
from voice_dialogue.config.audio_config import (
|
| 12 |
-
get_input_device_index, save_input_device_index,
|
| 13 |
-
get_output_device_index, save_output_device_index,
|
| 14 |
-
)
|
| 15 |
from voice_dialogue.core.constants import session_manager
|
| 16 |
from voice_dialogue.utils.logger import logger
|
| 17 |
from ..core.service_factories import get_audio_capture_service_definition, get_speech_monitor_service_definition
|
| 18 |
from ..schemas.system_schemas import (
|
| 19 |
-
SystemStatusResponse, SystemResponse, SystemStartRequest
|
| 20 |
-
AudioInputDevicesResponse, AudioInputDevice, AudioOutputDevice, ASREngineResponse,
|
| 21 |
-
OutputDeviceRequest
|
| 22 |
)
|
| 23 |
|
| 24 |
router = APIRouter()
|
| 25 |
|
| 26 |
-
# ASR engine registry name -> display name
|
| 27 |
-
ASR_ENGINE_DISPLAY_NAMES = {
|
| 28 |
-
'qwen': 'Qwen3-ASR-1.7B',
|
| 29 |
-
'funasr': 'FunASR Paraformer',
|
| 30 |
-
'whisper': 'Whisper medium',
|
| 31 |
-
}
|
| 32 |
-
|
| 33 |
# Global system status
|
| 34 |
_system_status = {
|
| 35 |
"status": "stopped",
|
|
@@ -78,59 +60,6 @@ async def get_system_status(request: Request):
|
|
| 78 |
raise HTTPException(status_code=500, detail=f"获取系统状态失败: {str(e)}")
|
| 79 |
|
| 80 |
|
| 81 |
-
@router.get("/audio-devices", response_model=AudioInputDevicesResponse, summary="获取可用音频输入设备")
|
| 82 |
-
async def get_audio_devices():
|
| 83 |
-
"""
|
| 84 |
-
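List all available audio input devices on the system (including external microphones and microphone arrays),
|
| 85 |
-
so the frontend can choose a capture device.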
|
| 86 |
-
"""
|
| 87 |
-
try:
|
| 88 |
-
devices = [AudioInputDevice(**d) for d in list_input_devices()]
|
| 89 |
-
output_devices = [AudioOutputDevice(**d) for d in list_output_devices()]
|
| 90 |
-
return AudioInputDevicesResponse(
|
| 91 |
-
devices=devices,
|
| 92 |
-
current_device_index=get_input_device_index(),
|
| 93 |
-
default_device_index=get_default_input_device_index(),
|
| 94 |
-
output_devices=output_devices,
|
| 95 |
-
current_output_device_index=get_output_device_index(),
|
| 96 |
-
default_output_device_index=get_default_output_device_index(),
|
| 97 |
-
)
|
| 98 |
-
except Exception as e:
|
| 99 |
-
logger.error(f"获取音频输入设备失败: {e}", exc_info=True)
|
| 100 |
-
raise HTTPException(status_code=500, detail=f"获取音频输入设备失败: {str(e)}")
|
| 101 |
-
|
| 102 |
-
|
| 103 |
-
@router.post("/audio-output-device", response_model=SystemResponse, summary="设置音频输出设备")
|
| 104 |
-
async def set_audio_output_device(request: OutputDeviceRequest):
|
| 105 |
-
"""
|
| 106 |
-
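Save the output device selection. The playback service reads this setting on every playback,
|
| 107 |
-
so a change made mid-session takes effect on the next utterance without a restart.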
|
| 108 |
-
"""
|
| 109 |
-
output_device_index = request.output_device_index
|
| 110 |
-
if not is_valid_output_device(output_device_index):
|
| 111 |
-
raise HTTPException(status_code=400, detail=f"无效的输出设备索引: {output_device_index}")
|
| 112 |
-
if not save_output_device_index(output_device_index):
|
| 113 |
-
raise HTTPException(status_code=500, detail="保存输出设备设置失败")
|
| 114 |
-
return SystemResponse(success=True, message="输出设备已更新")
|
| 115 |
-
|
| 116 |
-
|
| 117 |
-
@router.get("/asr-engine", response_model=ASREngineResponse, summary="获取当前 ASR 引擎")
|
| 118 |
-
async def get_asr_engine():
|
| 119 |
-
"""
|
| 120 |
-
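Return the currently active ASR engine (language mappings + display name),
|
| 121 |
-
so the frontend can show the recognition model actually in use on the home and about pages.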
|
| 122 |
-
"""
|
| 123 |
-
try:
|
| 124 |
-
from voice_dialogue.asr import asr_manager
|
| 125 |
-
mappings = asr_manager.get_asr_statistics()['language_mappings']
|
| 126 |
-
engines = sorted(set(mappings.values()))
|
| 127 |
-
display_name = ' + '.join(ASR_ENGINE_DISPLAY_NAMES.get(engine, engine) for engine in engines)
|
| 128 |
-
return ASREngineResponse(mappings=mappings, display_name=display_name)
|
| 129 |
-
except Exception as e:
|
| 130 |
-
logger.error(f"获取ASR引擎信息失败: {e}", exc_info=True)
|
| 131 |
-
raise HTTPException(status_code=500, detail=f"获取ASR引擎信息失败: {str(e)}")
|
| 132 |
-
|
| 133 |
-
|
| 134 |
@router.post("/start", response_model=SystemResponse, summary="启动系统")
|
| 135 |
async def start_system(
|
| 136 |
request: SystemStartRequest,
|
|
@@ -147,30 +76,6 @@ async def start_system(
|
|
| 147 |
message="系统已经在运行中或正在启动"
|
| 148 |
)
|
| 149 |
|
| 150 |
-
# Resolve the input device: fall back to the saved device when the request does not specify one
|
| 151 |
-
input_device_index = request.input_device_index
|
| 152 |
-
if input_device_index is None:
|
| 153 |
-
input_device_index = get_input_device_index()
|
| 154 |
-
|
| 155 |
-
if not is_valid_input_device(input_device_index):
|
| 156 |
-
logger.warning(f"请求的输入设备 {input_device_index} 无效,回退到系统默认设备")
|
| 157 |
-
input_device_index = None
|
| 158 |
-
|
| 159 |
-
# Persist the user's choice for reuse on the next start
|
| 160 |
-
save_input_device_index(input_device_index)
|
| 161 |
-
|
| 162 |
-
# Resolve the output device: fall back to the saved device when the request does not specify one
|
| 163 |
-
output_device_index = request.output_device_index
|
| 164 |
-
if output_device_index is None:
|
| 165 |
-
output_device_index = get_output_device_index()
|
| 166 |
-
|
| 167 |
-
if not is_valid_output_device(output_device_index):
|
| 168 |
-
logger.warning(f"请求的输出设备 {output_device_index} 无效,回退到系统默认设备")
|
| 169 |
-
output_device_index = None
|
| 170 |
-
|
| 171 |
-
# The playback service reads this setting on every playback, so saving it takes effect immediately
|
| 172 |
-
save_output_device_index(output_device_index)
|
| 173 |
-
|
| 174 |
# Update status
|
| 175 |
_system_status["status"] = "starting"
|
| 176 |
session_manager.reset_id()
|
|
@@ -179,8 +84,7 @@ async def start_system(
|
|
| 179 |
background_tasks.add_task(
|
| 180 |
_start_system_background,
|
| 181 |
fastapi_request,
|
| 182 |
-
request.enable_echo_cancellation
|
| 183 |
-
input_device_index,
|
| 184 |
)
|
| 185 |
|
| 186 |
return SystemResponse(
|
|
@@ -310,11 +214,7 @@ async def restart_system(
|
|
| 310 |
raise HTTPException(status_code=500, detail=f"系统重启失败: {str(e)}")
|
| 311 |
|
| 312 |
|
| 313 |
-
async def _start_system_background(
|
| 314 |
-
request: Request,
|
| 315 |
-
enable_echo_cancellation: bool = True,
|
| 316 |
-
input_device_index: int = None,
|
| 317 |
-
):
|
| 318 |
"""
|
| 319 |
Actual logic for starting the system in the background: create and start the audio_capture service
|
| 320 |
"""
|
|
@@ -357,9 +257,7 @@ async def _start_system_background(
|
|
| 357 |
logger.info("语音监控服务已在运行")
|
| 358 |
else:
|
| 359 |
# Create the speech monitor service definition
|
| 360 |
-
|
| 361 |
-
# When an external device is selected and capture goes through PyAudio, software VAD must be enabled.
|
| 362 |
-
enable_vad = not resolves_to_native_aec(enable_echo_cancellation, input_device_index)
|
| 363 |
speech_monitor_def = get_speech_monitor_service_definition(enable_vad)
|
| 364 |
|
| 365 |
# Start the speech monitor service
|
|
@@ -373,7 +271,7 @@ async def _start_system_background(
|
|
| 373 |
logger.info("音频捕获服务已在运行")
|
| 374 |
else:
|
| 375 |
# Create the audio_capture service definition
|
| 376 |
-
audio_capture_def = get_audio_capture_service_definition(enable_echo_cancellation
|
| 377 |
|
| 378 |
# Start the audio_capture service
|
| 379 |
success = service_manager.start_service(audio_capture_def)
|
|
|
|
| 3 |
|
| 4 |
from fastapi import APIRouter, HTTPException, BackgroundTasks, Request
|
| 5 |
|
| 6 |
from voice_dialogue.core.constants import session_manager
|
| 7 |
from voice_dialogue.utils.logger import logger
|
| 8 |
from ..core.service_factories import get_audio_capture_service_definition, get_speech_monitor_service_definition
|
| 9 |
from ..schemas.system_schemas import (
|
| 10 |
+
SystemStatusResponse, SystemResponse, SystemStartRequest
|
|
|
|
|
|
|
| 11 |
)
|
| 12 |
|
| 13 |
router = APIRouter()
|
| 14 |
|
| 15 |
# Global system status
|
| 16 |
_system_status = {
|
| 17 |
"status": "stopped",
|
|
|
|
| 60 |
raise HTTPException(status_code=500, detail=f"获取系统状态失败: {str(e)}")
|
| 61 |
|
| 62 |
|
| 63 |
@router.post("/start", response_model=SystemResponse, summary="启动系统")
|
| 64 |
async def start_system(
|
| 65 |
request: SystemStartRequest,
|
|
|
|
| 76 |
message="系统已经在运行中或正在启动"
|
| 77 |
)
|
| 78 |
|
| 79 |
# Update status
|
| 80 |
_system_status["status"] = "starting"
|
| 81 |
session_manager.reset_id()
|
|
|
|
| 84 |
background_tasks.add_task(
|
| 85 |
_start_system_background,
|
| 86 |
fastapi_request,
|
| 87 |
+
request.enable_echo_cancellation
|
|
|
|
| 88 |
)
|
| 89 |
|
| 90 |
return SystemResponse(
|
|
|
|
| 214 |
raise HTTPException(status_code=500, detail=f"系统重启失败: {str(e)}")
|
| 215 |
|
| 216 |
|
| 217 |
+
async def _start_system_background(request: Request, enable_echo_cancellation: bool = True):
|
| 218 |
"""
|
| 219 |
Actual logic for starting the system in the background: create and start the audio_capture service
|
| 220 |
"""
|
|
|
|
| 257 |
logger.info("语音监控服务已在运行")
|
| 258 |
else:
|
| 259 |
# Create the speech monitor service definition
|
| 260 |
+
enable_vad = not enable_echo_cancellation
|
| 261 |
speech_monitor_def = get_speech_monitor_service_definition(enable_vad)
|
| 262 |
|
| 263 |
# Start the speech monitor service
|
|
|
|
| 271 |
logger.info("音频捕获服务已在运行")
|
| 272 |
else:
|
| 273 |
# Create the audio_capture service definition
|
| 274 |
+
audio_capture_def = get_audio_capture_service_definition(enable_echo_cancellation)
|
| 275 |
|
| 276 |
# Start the audio_capture service
|
| 277 |
success = service_manager.start_service(audio_capture_def)
|
src/voice_dialogue/api/schemas/system_schemas.py
CHANGED
|
@@ -1,4 +1,4 @@
|
|
| 1 |
-
from typing import Optional, Literal, Dict, Any
|
| 2 |
|
| 3 |
from pydantic import BaseModel, Field
|
| 4 |
|
|
@@ -17,51 +17,10 @@ class SystemStatusResponse(BaseModel):
|
|
| 17 |
|
| 18 |
class SystemStartRequest(BaseModel):
|
| 19 |
"""System start request"""
|
| 20 |
-
enable_echo_cancellation: bool = Field(default=True, description="是否启用回声消除
|
| 21 |
-
input_device_index: Optional[int] = Field(default=None, description="输入设备索引(如外置麦克风阵列);为空则使用系统默认设备")
|
| 22 |
-
output_device_index: Optional[int] = Field(default=None, description="输出设备索引(如外置扬声器);为空则使用系统默认设备")
|
| 23 |
-
|
| 24 |
-
|
| 25 |
-
class AudioInputDevice(BaseModel):
|
| 26 |
-
"""Audio input device information"""
|
| 27 |
-
index: int = Field(..., description="设备索引")
|
| 28 |
-
name: str = Field(..., description="设备名称")
|
| 29 |
-
max_input_channels: int = Field(..., description="最大输入通道数")
|
| 30 |
-
default_sample_rate: int = Field(..., description="设备默认采样率")
|
| 31 |
-
is_default: bool = Field(default=False, description="是否为系统默认输入设备")
|
| 32 |
-
|
| 33 |
-
|
| 34 |
-
class AudioOutputDevice(BaseModel):
|
| 35 |
-
"""Audio output device information"""
|
| 36 |
-
index: int = Field(..., description="设备索引")
|
| 37 |
-
name: str = Field(..., description="设备名称")
|
| 38 |
-
max_output_channels: int = Field(..., description="最大输出通道数")
|
| 39 |
-
default_sample_rate: int = Field(..., description="设备默认采样率")
|
| 40 |
-
is_default: bool = Field(default=False, description="是否为系统默认输出设备")
|
| 41 |
-
|
| 42 |
-
|
| 43 |
-
class AudioInputDevicesResponse(BaseModel):
|
| 44 |
-
"""Audio device list response (input and output devices)"""
|
| 45 |
-
devices: List[AudioInputDevice] = Field(default_factory=list, description="可用输入设备列表")
|
| 46 |
-
current_device_index: Optional[int] = Field(default=None, description="当前已选择/保存的输入设备索引")
|
| 47 |
-
default_device_index: Optional[int] = Field(default=None, description="系统默认输入设备索引")
|
| 48 |
-
output_devices: List[AudioOutputDevice] = Field(default_factory=list, description="可用输出设备列表")
|
| 49 |
-
current_output_device_index: Optional[int] = Field(default=None, description="当前已选择/保存的输出设备索引")
|
| 50 |
-
default_output_device_index: Optional[int] = Field(default=None, description="系统默认输出设备索引")
|
| 51 |
|
| 52 |
|
| 53 |
class SystemResponse(BaseModel):
|
| 54 |
"""System operation response"""
|
| 55 |
success: bool = Field(..., description="操作是否成功")
|
| 56 |
message: str = Field(..., description="响应消息")
|
| 57 |
-
|
| 58 |
-
|
| 59 |
-
class OutputDeviceRequest(BaseModel):
|
| 60 |
-
"""Request to set the output device"""
|
| 61 |
-
output_device_index: Optional[int] = Field(default=None, description="输出设备索引;为空则使用系统默认设备")
|
| 62 |
-
|
| 63 |
-
|
| 64 |
-
class ASREngineResponse(BaseModel):
|
| 65 |
-
"""Current ASR engine information"""
|
| 66 |
-
mappings: Dict[str, str] = Field(default_factory=dict, description="语言到 ASR 引擎的映射,如 {'zh': 'qwen'}")
|
| 67 |
-
display_name: str = Field(..., description="当前 ASR 引擎的展示名称")
|
|
|
|
| 1 |
+
from typing import Optional, Literal, Dict, Any
|
| 2 |
|
| 3 |
from pydantic import BaseModel, Field
|
| 4 |
|
|
|
|
| 17 |
|
| 18 |
class SystemStartRequest(BaseModel):
|
| 19 |
"""System start request"""
|
| 20 |
+
enable_echo_cancellation: bool = Field(default=True, description="是否启用回声消除")
|
| 21 |
|
| 22 |
|
| 23 |
class SystemResponse(BaseModel):
|
| 24 |
"""System operation response"""
|
| 25 |
success: bool = Field(..., description="操作是否成功")
|
| 26 |
message: str = Field(..., description="响应消息")
|
src/voice_dialogue/asr/manager.py
CHANGED
|
@@ -1,6 +1,5 @@
|
|
| 1 |
import importlib.util
|
| 2 |
import inspect
|
| 3 |
-
import os
|
| 4 |
import re
|
| 5 |
from dataclasses import dataclass
|
| 6 |
from typing import Dict, Type, List, Literal, Optional
|
|
@@ -93,26 +92,11 @@ class ASRManager:
|
|
| 93 |
|
| 94 |
def __init__(self):
|
| 95 |
self._asr_instances: Dict[str, ASRInterface] = {}
|
| 96 |
-
|
| 97 |
-
|
| 98 |
-
|
| 99 |
-
|
| 100 |
-
|
| 101 |
-
}
|
| 102 |
-
else:
|
| 103 |
-
self._language_to_asr_mapping = {
|
| 104 |
-
'zh': 'qwen',
|
| 105 |
-
'en': 'qwen',
|
| 106 |
-
}
|
| 107 |
-
|
| 108 |
-
def _resolve_unregistered(self, language: str, asr_type: str) -> str:
|
| 109 |
-
"""Fall back to the legacy engine when the selected engine is not registered (e.g. qwen-asr is not installed)."""
|
| 110 |
-
fallback = {'zh': 'funasr', 'en': 'whisper'}.get(language)
|
| 111 |
-
if fallback and fallback in asr_tables.asr_classes:
|
| 112 |
-
logger.warning(f"ASR引擎 '{asr_type}' 未注册,回退到 '{fallback}'")
|
| 113 |
-
self._language_to_asr_mapping[language] = fallback
|
| 114 |
-
return fallback
|
| 115 |
-
return asr_type
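The fallback rule that this commit removes can be sketched in isolation. This is a minimal illustration, not the project's code: `REGISTERED` is a hypothetical stand-in for `asr_tables.asr_classes`, and `resolve_engine` folds the "is it registered?" check and the fallback lookup into one function, which the removed code kept separate.

```python
from typing import Dict, Set

# Hypothetical stand-in for asr_tables.asr_classes: names of registered engines.
REGISTERED: Set[str] = {"funasr", "whisper"}

# Per-language legacy engines, mirroring the mapping in the removed helper.
FALLBACKS: Dict[str, str] = {"zh": "funasr", "en": "whisper"}


def resolve_engine(language: str, asr_type: str, registered: Set[str] = REGISTERED) -> str:
    """Return asr_type if it is registered, else the registered per-language fallback."""
    if asr_type in registered:
        return asr_type
    fallback = FALLBACKS.get(language)
    if fallback and fallback in registered:
        return fallback
    # Nothing usable: return the original name so the caller can raise an error.
    return asr_type
```

With this set of registered engines, `resolve_engine("zh", "qwen")` yields `"funasr"`, while a language without a fallback entry keeps the unregistered name and leaves the error to the caller.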
|
| 116 |
|
| 117 |
def create_asr(self, language: Literal['auto', 'zh', 'en']) -> ASRInterface:
|
| 118 |
"""
|
|
@@ -131,9 +115,6 @@ class ASRManager:
|
|
| 131 |
# Select the appropriate ASR engine for the language
|
| 132 |
asr_type = self._get_asr_type_for_language(language)
|
| 133 |
|
| 134 |
-
if asr_type not in asr_tables.asr_classes:
|
| 135 |
-
asr_type = self._resolve_unregistered(language, asr_type)
|
| 136 |
-
|
| 137 |
if asr_type not in asr_tables.asr_classes:
|
| 138 |
raise ValueError(f"ASR类型 '{asr_type}' 未注册")
|
| 139 |
|
|
|
|
| 1 |
import importlib.util
|
| 2 |
import inspect
|
|
|
|
| 3 |
import re
|
| 4 |
from dataclasses import dataclass
|
| 5 |
from typing import Dict, Type, List, Literal, Optional
|
|
|
|
| 92 |
|
| 93 |
def __init__(self):
|
| 94 |
self._asr_instances: Dict[str, ASRInterface] = {}
|
| 95 |
+
self._language_to_asr_mapping = {
|
| 96 |
+
'zh': 'funasr', # Prefer FunASR for Chinese
|
| 97 |
+
'en': 'whisper', # Prefer Whisper for English
|
| 98 |
+
# 'auto': 'whisper', # Auto-detection defaults to Whisper
|
| 99 |
+
}
|
| 100 |
|
| 101 |
def create_asr(self, language: Literal['auto', 'zh', 'en']) -> ASRInterface:
|
| 102 |
"""
|
|
|
|
| 115 |
# Select the appropriate ASR engine for the language
|
| 116 |
asr_type = self._get_asr_type_for_language(language)
|
| 117 |
|
| 118 |
if asr_type not in asr_tables.asr_classes:
|
| 119 |
raise ValueError(f"ASR类型 '{asr_type}' 未注册")
|
| 120 |
|
src/voice_dialogue/asr/models/__init__.py
CHANGED
|
@@ -19,12 +19,3 @@ except ImportError as e:
|
|
| 19 |
from voice_dialogue.utils.logger import logger
|
| 20 |
|
| 21 |
logger.warning(f"Failed to import some Whisper implementations: {e}")
|
| 22 |
-
|
| 23 |
-
try:
|
| 24 |
-
from .qwen import QwenASRClient
|
| 25 |
-
|
| 26 |
-
__all__.append('QwenASRClient')
|
| 27 |
-
except ImportError as e:
|
| 28 |
-
from voice_dialogue.utils.logger import logger
|
| 29 |
-
|
| 30 |
-
logger.warning(f"Failed to import some Qwen ASR implementations: {e}")
|
|
|
|
| 19 |
from voice_dialogue.utils.logger import logger
|
| 20 |
|
| 21 |
logger.warning(f"Failed to import some Whisper implementations: {e}")
|
src/voice_dialogue/asr/models/qwen.py
DELETED
|
@@ -1,76 +0,0 @@
|
|
| 1 |
-
import os
|
| 2 |
-
import typing
|
| 3 |
-
|
| 4 |
-
import numpy as np
|
| 5 |
-
import torch
|
| 6 |
-
from qwen_asr import Qwen3ASRModel
|
| 7 |
-
|
| 8 |
-
from voice_dialogue.asr.manager import asr_tables
|
| 9 |
-
from voice_dialogue.asr.models.base import ASRInterface
|
| 10 |
-
from voice_dialogue.asr.utils import ensure_minimum_audio_duration
|
| 11 |
-
from voice_dialogue.config import paths
|
| 12 |
-
from voice_dialogue.utils.logger import logger
|
| 13 |
-
|
| 14 |
-
# Built-in model directory (shipped with the app when packaged; loaded offline if present)
|
| 15 |
-
BUILTIN_QWEN_ASR_MODEL_PATH = paths.ASR_MODELS_PATH / 'qwen3-asr-1.7b'
|
| 16 |
-
|
| 17 |
-
TARGET_SAMPLE_RATE = 16000
|
| 18 |
-
|
| 19 |
-
|
| 20 |
-
def resolve_model_path() -> str:
|
| 21 |
-
"""Model source priority: environment variable > built-in directory > automatic download from HuggingFace."""
|
| 22 |
-
env_model = os.environ.get('QWEN_ASR_MODEL')
|
| 23 |
-
if env_model:
|
| 24 |
-
return env_model
|
| 25 |
-
if (BUILTIN_QWEN_ASR_MODEL_PATH / 'config.json').exists():
|
| 26 |
-
return BUILTIN_QWEN_ASR_MODEL_PATH.as_posix()
|
| 27 |
-
return 'Qwen/Qwen3-ASR-1.7B'
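The three-step priority in the removed `resolve_model_path` can be exercised without the real model directory. The sketch below is an assumption-laden variant: the `builtin_dir`, `env`, and `hub_id` parameters are introduced here purely so the function can be tested, whereas the removed code read `os.environ` and a module-level path directly.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def resolve_model_path(
    builtin_dir: Path,
    env: Optional[Mapping[str, str]] = None,
    hub_id: str = "Qwen/Qwen3-ASR-1.7B",
) -> str:
    """Pick a model source: env override, then bundled directory, then hub id."""
    env = os.environ if env is None else env
    override = env.get("QWEN_ASR_MODEL")
    if override:
        return override
    # A bundled model counts as present only if its config.json exists.
    if (builtin_dir / "config.json").exists():
        return builtin_dir.as_posix()
    return hub_id
```

Passing an explicit mapping for `env` keeps the check independent of whatever `QWEN_ASR_MODEL` happens to be set to in the calling process.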
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
@asr_tables.register('asr_classes', 'qwen')
|
| 31 |
-
class QwenASRClient(ASRInterface):
|
| 32 |
-
"""Qwen3-ASR client (transformers backend; uses MPS acceleration on macOS)"""
|
| 33 |
-
supported_langs = ['zh', 'en']
|
| 34 |
-
|
| 35 |
-
def __init__(self):
|
| 36 |
-
super().__init__()
|
| 37 |
-
self.model: typing.Optional[Qwen3ASRModel] = None
|
| 38 |
-
|
| 39 |
-
def setup(self, **kwargs) -> None:
|
| 40 |
-
model_name = kwargs.get('model') or resolve_model_path()
|
| 41 |
-
|
| 42 |
-
if torch.backends.mps.is_available():
|
| 43 |
-
device_map, dtype = 'mps', torch.bfloat16
|
| 44 |
-
elif torch.cuda.is_available():
|
| 45 |
-
device_map, dtype = 'cuda:0', torch.bfloat16
|
| 46 |
-
else:
|
| 47 |
-
device_map, dtype = 'cpu', torch.float32
|
| 48 |
-
|
| 49 |
-
logger.info(f'[INFO] Loading Qwen3-ASR model: {model_name} (device={device_map}, dtype={dtype})')
|
| 50 |
-
self.model = Qwen3ASRModel.from_pretrained(
|
| 51 |
-
model_name,
|
| 52 |
-
dtype=dtype,
|
| 53 |
-
device_map=device_map,
|
| 54 |
-
max_inference_batch_size=1,
|
| 55 |
-
max_new_tokens=256,
|
| 56 |
-
)
|
| 57 |
-
|
| 58 |
-
def warmup(self) -> None:
|
| 59 |
-
logger.info('[INFO] Warming up Qwen3-ASR model...')
|
| 60 |
-
try:
|
| 61 |
-
self.transcribe(self.warmup_audiodata)
|
| 62 |
-
logger.info('[INFO] Qwen3-ASR model warmed up.')
|
| 63 |
-
except Exception as e:
|
| 64 |
-
logger.warning(f'[WARNING] Qwen3-ASR model warmup failed: {e}')
|
| 65 |
-
|
| 66 |
-
def transcribe(self, audio_array: np.ndarray, language: str = None) -> str:
|
| 67 |
-
audio_array = ensure_minimum_audio_duration(audio_array)
|
| 68 |
-
|
| 69 |
-
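# Always use automatic language detection: specifying a language forces the model to "output transcript text only",
|
| 70 |
-
# so silent or noisy segments get hallucinated text; in auto mode, non-speech segments return an empty string,
|
| 71 |
-
# which is discarded upstream, eliminating hallucinations at the root.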
|
| 72 |
-
results = self.model.transcribe(
|
| 73 |
-
audio=(audio_array, TARGET_SAMPLE_RATE),
|
| 74 |
-
language=None,
|
| 75 |
-
)
|
| 76 |
-
return ' '.join(result.text for result in results).strip()
|
src/voice_dialogue/audio/capture/__init__.py
CHANGED
|
@@ -4,43 +4,12 @@
|
|
| 4 |
Selects and manages the concrete audio capture strategy based on configuration.
|
| 5 |
"""
|
| 6 |
from multiprocessing import Queue
|
| 7 |
-
from typing import Optional
|
| 8 |
|
| 9 |
from voice_dialogue.utils.logger import logger
|
| 10 |
from .aec_capture import AecCapture
|
| 11 |
from .pyaudio_capture import PyAudioCapture
|
| 12 |
|
| 13 |
|
| 14 |
-
def resolves_to_native_aec(
|
| 15 |
-
enable_echo_cancellation: bool,
|
| 16 |
-
input_device_index: Optional[int] = None,
|
| 17 |
-
) -> bool:
|
| 18 |
-
"""
|
| 19 |
-
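Determine whether the macOS native AEC capture strategy will be used under the given configuration.
|
| 20 |
-
|
| 21 |
-
The native AEC library operates on the system default input device and has built-in VAD. Native AEC is therefore used when either of the following holds:
|
| 22 |
-
- echo cancellation is enabled and no specific input device is given (the default device is used implicitly);
|
| 23 |
-
- echo cancellation is enabled and the selected device happens to be the system default input device
|
| 24 |
-
(native AEC already captures the default device, so the result is equivalent).
|
| 25 |
-
|
| 26 |
-
Only when a "non-default" input device (such as an external microphone array) is selected does capture fall back to the PyAudio strategy;
|
| 27 |
-
in that case echo cancellation relies on the device's own hardware, and voice activity detection switches to software VAD.
|
| 28 |
-
|
| 29 |
-
The upper layer uses this to decide whether SpeechStateMonitor needs software VAD
|
| 30 |
-
(enable_vad = not resolves_to_native_aec(...)).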
|
| 31 |
-
"""
|
| 32 |
-
if not enable_echo_cancellation:
|
| 33 |
-
return False
|
| 34 |
-
if input_device_index is None:
|
| 35 |
-
return True
|
| 36 |
-
# When the selected device is the system default device, native AEC can still be used
|
| 37 |
-
try:
|
| 38 |
-
from voice_dialogue.audio.devices import get_default_input_device_index
|
| 39 |
-
return input_device_index == get_default_input_device_index()
|
| 40 |
-
except Exception:
|
| 41 |
-
return False
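The decision made by the removed `resolves_to_native_aec` can be checked case by case. In the sketch below, the `get_default_index` parameter is an assumption added for illustration so that no audio hardware is needed; the removed code imported `get_default_input_device_index` from `voice_dialogue.audio.devices` instead. `needs_software_vad` is a hypothetical helper name for the `enable_vad = not resolves_to_native_aec(...)` rule.

```python
from typing import Callable, Optional


def resolves_to_native_aec(
    enable_echo_cancellation: bool,
    input_device_index: Optional[int] = None,
    get_default_index: Callable[[], Optional[int]] = lambda: 0,
) -> bool:
    """True when capture would use native AEC rather than PyAudio."""
    if not enable_echo_cancellation:
        return False
    if input_device_index is None:
        return True  # implicit default device
    try:
        # An explicitly chosen device still qualifies if it is the default one.
        return input_device_index == get_default_index()
    except Exception:
        return False


def needs_software_vad(enable_echo_cancellation: bool,
                       input_device_index: Optional[int] = None,
                       get_default_index: Callable[[], Optional[int]] = lambda: 0) -> bool:
    """Software VAD is needed exactly when native AEC (with built-in VAD) is not used."""
    return not resolves_to_native_aec(enable_echo_cancellation, input_device_index, get_default_index)
```

After this commit the route uses the simpler rule `enable_vad = not enable_echo_cancellation`, which agrees with the sketch whenever no input device index is given.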
|
| 42 |
-
|
| 43 |
-
|
| 44 |
class AudioCapture:
|
| 45 |
"""
|
| 46 |
Audio capture facade.
|
|
@@ -54,44 +23,29 @@ class AudioCapture:
|
|
| 54 |
self,
|
| 55 |
audio_frames_queue: Queue,
|
| 56 |
enable_echo_cancellation: bool = True,
|
| 57 |
-
input_device_index: Optional[int] = None,
|
| 58 |
-
channels: Optional[int] = None,
|
| 59 |
):
|
| 60 |
"""
|
| 61 |
Initialize the audio capturer.
|
| 62 |
|
| 63 |
Args:
|
| 64 |
audio_frames_queue (Queue): Queue that receives captured audio frames.
|
| 65 |
-
enable_echo_cancellation (bool): Whether to enable echo cancellation.
|
| 66 |
-
|
| 67 |
-
|
| 68 |
-
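input_device_index (Optional[int]): Index of the selected input device (such as an external microphone array).
|
| 69 |
-
Once specified, that device is captured with the PyAudio strategy,
|
| 70 |
-
and echo cancellation relies on the device hardware.
|
| 71 |
-
channels (Optional[int]): Number of capture channels (PyAudio strategy only; multi-channel input is downmixed to mono).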
|
| 72 |
"""
|
| 73 |
-
use_native_aec = resolves_to_native_aec(enable_echo_cancellation, input_device_index)
|
| 74 |
self._strategy = None
|
| 75 |
try:
|
| 76 |
-
if
|
| 77 |
self._strategy = AecCapture(audio_frames_queue=audio_frames_queue)
|
| 78 |
else:
|
| 79 |
-
self._strategy = PyAudioCapture(
|
| 80 |
-
audio_frames_queue=audio_frames_queue,
|
| 81 |
-
input_device_index=input_device_index,
|
| 82 |
-
channels=channels,
|
| 83 |
-
)
|
| 84 |
logger.info(f"音频捕获策略已选择: {self._strategy.__class__.__name__}")
|
| 85 |
except Exception as e:
|
| 86 |
logger.error(
|
| 87 |
-
f"初始化 {AecCapture.__name__ if
|
| 88 |
# Only fall back when the AEC attempt failed
|
| 89 |
if not isinstance(self._strategy, PyAudioCapture):
|
| 90 |
-
self._strategy = PyAudioCapture(
|
| 91 |
-
audio_frames_queue=audio_frames_queue,
|
| 92 |
-
input_device_index=input_device_index,
|
| 93 |
-
channels=channels,
|
| 94 |
-
)
|
| 95 |
logger.info(f"已回退到音频捕获策略: {self._strategy.__class__.__name__}")
|
| 96 |
|
| 97 |
def start(self):
|
|
|
|
| 4 |
Selects and manages the concrete audio capture strategy based on configuration.
|
| 5 |
"""
|
| 6 |
from multiprocessing import Queue
|
|
|
|
| 7 |
|
| 8 |
from voice_dialogue.utils.logger import logger
|
| 9 |
from .aec_capture import AecCapture
|
| 10 |
from .pyaudio_capture import PyAudioCapture
|
| 11 |
|
| 12 |
|
| 13 |
class AudioCapture:
|
| 14 |
"""
|
| 15 |
Audio capture facade.
|
|
|
|
| 23 |
self,
|
| 24 |
audio_frames_queue: Queue,
|
| 25 |
enable_echo_cancellation: bool = True,
|
| 26 |
):
|
| 27 |
"""
|
| 28 |
Initialize the audio capturer.
|
| 29 |
|
| 30 |
Args:
|
| 31 |
audio_frames_queue (Queue): Queue that receives captured audio frames.
|
| 32 |
+
enable_echo_cancellation (bool): Whether to enable echo cancellation.
|
| 33 |
+
If True, use the native AEC library;
|
| 34 |
+
otherwise, use PyAudio.
|
| 35 |
"""
|
|
|
|
| 36 |
self._strategy = None
|
| 37 |
try:
|
| 38 |
+
if enable_echo_cancellation:
|
| 39 |
self._strategy = AecCapture(audio_frames_queue=audio_frames_queue)
|
| 40 |
else:
|
| 41 |
+
self._strategy = PyAudioCapture(audio_frames_queue=audio_frames_queue)
|
| 42 |
logger.info(f"音频捕获策略已选择: {self._strategy.__class__.__name__}")
|
| 43 |
except Exception as e:
|
| 44 |
logger.error(
|
| 45 |
+
f"初始化 {AecCapture.__name__ if enable_echo_cancellation else PyAudioCapture.__name__} 失败: {e}, 将回退到 PyAudio。")
|
| 46 |
# Only fall back when the AEC attempt failed
|
| 47 |
if not isinstance(self._strategy, PyAudioCapture):
|
| 48 |
+
self._strategy = PyAudioCapture(audio_frames_queue=audio_frames_queue)
|
| 49 |
logger.info(f"已回退到音频捕获策略: {self._strategy.__class__.__name__}")
|
| 50 |
|
| 51 |
def start(self):
|
src/voice_dialogue/audio/capture/pyaudio_capture.py
CHANGED
|
@@ -1,130 +1,41 @@
|
|
| 1 |
from multiprocessing import Queue
|
| 2 |
-
from typing import Optional
|
| 3 |
|
| 4 |
-
import numpy as np
|
| 5 |
import pyaudio
|
| 6 |
|
| 7 |
from voice_dialogue.utils.logger import logger
|
| 8 |
from .base_capture import BaseCapture
|
| 9 |
|
| 10 |
-
# Downstream ASR / VAD uniformly require 16 kHz mono int16 audio
|
| 11 |
-
TARGET_SAMPLE_RATE = 16000
|
| 12 |
-
|
| 13 |
|
| 14 |
class PyAudioCapture(BaseCapture):
|
| 15 |
"""
|
| 16 |
Standard audio capture strategy using PyAudio.
|
| 17 |
-
|
| 18 |
-
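Supports selecting a specific input device (such as an external microphone array) and automatically downmixes
|
| 19 |
-
and resamples multi-channel, non-16 kHz input into the 16 kHz mono int16 data required downstream.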
|
| 20 |
"""
|
| 21 |
|
| 22 |
-
def __init__(
|
| 23 |
-
self,
|
| 24 |
-
audio_frames_queue: Queue,
|
| 25 |
-
input_device_index: Optional[int] = None,
|
| 26 |
-
channels: Optional[int] = None,
|
| 27 |
-
**kwargs
|
| 28 |
-
):
|
| 29 |
-
"""
|
| 30 |
-
Args:
|
| 31 |
-
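audio_frames_queue (Queue): Queue that receives captured audio frames.
|
| 32 |
-
input_device_index (Optional[int]): Input device index; None means the system default device.
|
| 33 |
-
channels (Optional[int]): Number of capture channels; None means use the maximum channel count the device supports
|
| 34 |
-
(microphone arrays are usually multi-channel and are downmixed to mono after capture).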
|
| 35 |
-
"""
|
| 36 |
super().__init__(audio_frames_queue=audio_frames_queue, **kwargs)
|
| 37 |
-
self.input_device_index = input_device_index
|
| 38 |
-
self.requested_channels = channels
|
| 39 |
-
|
| 40 |
-
def _resolve_device_params(self, p: pyaudio.PyAudio):
|
| 41 |
-
"""Resolve the capture channel count and capture sample rate for the selected device."""
|
| 42 |
-
# Default parameters (system default device, mono, 16 kHz)
|
| 43 |
-
device_index = self.input_device_index
|
| 44 |
-
channels = self.requested_channels or 1
|
| 45 |
-
sample_rate = TARGET_SAMPLE_RATE
|
| 46 |
-
|
| 47 |
-
try:
|
| 48 |
-
if device_index is None:
|
| 49 |
-
device_index = int(p.get_default_input_device_info().get("index"))
|
| 50 |
-
info = p.get_device_info_by_index(device_index)
|
| 51 |
-
max_channels = int(info.get("maxInputChannels", 1)) or 1
|
| 52 |
-
# When no channel count is given explicitly, capture all device channels and then downmix (suits microphone arrays)
|
| 53 |
-
if self.requested_channels is None:
|
| 54 |
-
channels = max_channels
|
| 55 |
-
else:
|
| 56 |
-
channels = min(self.requested_channels, max_channels)
|
| 57 |
-
|
| 58 |
-
# Try 16 kHz first; if the device does not support it, use the device default sample rate and resample later
|
| 59 |
-
device_rate = int(info.get("defaultSampleRate", TARGET_SAMPLE_RATE))
|
| 60 |
-
if not p.is_format_supported(
|
| 61 |
-
rate=TARGET_SAMPLE_RATE,
|
| 62 |
-
input_device=device_index,
|
| 63 |
-
input_channels=channels,
|
| 64 |
-
input_format=pyaudio.paInt16,
|
| 65 |
-
):
|
| 66 |
-
sample_rate = device_rate
|
| 67 |
-
except Exception as e:
|
| 68 |
-
logger.warning(f"解析输入设备参数失败,回退到默认设备/单声道/16kHz: {e}")
|
| 69 |
-
device_index = self.input_device_index
|
| 70 |
-
channels = 1
|
| 71 |
-
sample_rate = TARGET_SAMPLE_RATE
|
| 72 |
-
|
| 73 |
-
return device_index, channels, sample_rate
|
| 74 |
|
| 75 |
def _init_pyaudio(self):
|
| 76 |
"""Initialize PyAudio and return the instance and configuration."""
|
| 77 |
p = pyaudio.PyAudio()
|
| 78 |
-
|
| 79 |
-
|
| 80 |
-
|
| 81 |
-
logger.info(
|
| 82 |
-
f"PyAudio 采集配置: device_index={device_index}, channels={channels}, "
|
| 83 |
-
f"sample_rate={sample_rate} -> {TARGET_SAMPLE_RATE}, chunk={chunk}"
|
| 84 |
-
)
|
| 85 |
-
return p, chunk, sample_rate, channels, device_index
|
| 86 |
|
| 87 |
-
def _open_stream(self, p, chunk, sample_rate
|
| 88 |
"""Open the PyAudio audio stream."""
|
| 89 |
return p.open(
|
| 90 |
format=pyaudio.paInt16,
|
| 91 |
-
channels=
|
| 92 |
rate=sample_rate,
|
| 93 |
input=True,
|
| 94 |
-
input_device_index=device_index,
|
| 95 |
frames_per_buffer=chunk,
|
| 96 |
)
|
| 97 |
|
| 98 |
-
def
|
| 99 |
-
"""Downmix and resample raw multi-channel, arbitrary-rate int16 data to 16 kHz mono int16."""
|
| 100 |
-
samples = np.frombuffer(data, dtype=np.int16)
|
| 101 |
-
if samples.size == 0:
|
| 102 |
-
return None
|
| 103 |
-
|
| 104 |
-
# Downmix multi-channel to mono (average across channels)
|
| 105 |
-
if channels > 1:
|
| 106 |
-
frame_count = samples.size // channels
|
| 107 |
-
if frame_count == 0:
|
| 108 |
-
return None
|
| 109 |
-
samples = samples[:frame_count * channels].reshape(-1, channels)
|
| 110 |
-
mono = samples.astype(np.float32).mean(axis=1)
|
| 111 |
-
else:
|
| 112 |
-
mono = samples.astype(np.float32)
|
| 113 |
-
|
| 114 |
-
# Resample to 16 kHz
|
| 115 |
-
if sample_rate != TARGET_SAMPLE_RATE:
|
| 116 |
-
import soxr
|
| 117 |
-
mono = soxr.resample(mono, sample_rate, TARGET_SAMPLE_RATE)
|
| 118 |
-
|
| 119 |
-
return np.clip(mono, -32768, 32767).astype(np.int16).tobytes()
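The downmix-then-resample step removed above can be illustrated with NumPy alone. This is a sketch under stated assumptions: it substitutes plain linear interpolation (`np.interp`) for the `soxr.resample` call used in the removed code, so its audio quality will be lower, and it returns empty bytes where the removed method returned `None`.

```python
import numpy as np

TARGET_SAMPLE_RATE = 16000


def to_mono_16k(data: bytes, channels: int, sample_rate: int) -> bytes:
    """Convert interleaved int16 PCM to 16 kHz mono int16 PCM."""
    samples = np.frombuffer(data, dtype=np.int16)
    frame_count = samples.size // channels
    if frame_count == 0:
        return b""
    # De-interleave into (frames, channels) and average the channels.
    frames = samples[:frame_count * channels].reshape(-1, channels)
    mono = frames.astype(np.float32).mean(axis=1)
    if sample_rate != TARGET_SAMPLE_RATE:
        # Linear interpolation on a shared time axis (stand-in for soxr).
        out_len = int(round(frame_count * TARGET_SAMPLE_RATE / sample_rate))
        src_t = np.arange(frame_count) / sample_rate
        dst_t = np.arange(out_len) / TARGET_SAMPLE_RATE
        mono = np.interp(dst_t, src_t, mono)
    return np.clip(mono, -32768, 32767).astype(np.int16).tobytes()
```

For example, 480 stereo frames captured at 48 kHz come out as 160 mono samples, which matches the 3:1 rate ratio.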
|
| 120 |
-
|
| 121 |
-
def _capture_loop(self, stream, chunk, channels, sample_rate):
|
| 122 |
"""Main loop of PyAudio audio capture."""
|
| 123 |
logger.info("使用 PyAudio 开始音频采集...")
|
| 124 |
self.is_ready = True
|
| 125 |
|
| 126 |
-
needs_processing = channels > 1 or sample_rate != TARGET_SAMPLE_RATE
|
| 127 |
-
|
| 128 |
while not self.is_exited:
|
| 129 |
data = stream.read(chunk, exception_on_overflow=False)
|
| 130 |
if data is None:
|
|
@@ -133,11 +44,6 @@ class PyAudioCapture(BaseCapture):
|
|
| 133 |
if self.is_paused:
|
| 134 |
continue
|
| 135 |
|
| 136 |
-
if needs_processing:
|
| 137 |
-
data = self._to_mono_16k(data, channels, sample_rate)
|
| 138 |
-
if data is None:
|
| 139 |
-
continue
|
| 140 |
-
|
| 141 |
self.audio_frames_queue.put(data)
|
| 142 |
|
| 143 |
def _cleanup(self, stream, p):
|
|
@@ -151,11 +57,11 @@ class PyAudioCapture(BaseCapture):
|
|
| 151 |
"""
|
| 152 |
Thread main loop; runs PyAudio audio capture.
|
| 153 |
"""
|
| 154 |
-
p, chunk, sample_rate
|
| 155 |
stream = None
|
| 156 |
try:
|
| 157 |
-
stream = self._open_stream(p, chunk, sample_rate
|
| 158 |
-
self._capture_loop(stream, chunk
|
| 159 |
except Exception as e:
|
| 160 |
logger.error(f'PyAudio 音频捕获器运行时发生错误: {e}')
|
| 161 |
finally:
|
|
|
|
| 1 |
from multiprocessing import Queue
|
|
|
|
| 2 |
|
|
|
|
| 3 |
import pyaudio
|
| 4 |
|
| 5 |
from voice_dialogue.utils.logger import logger
|
| 6 |
from .base_capture import BaseCapture
|
| 7 |
|
| 8 |
|
| 9 |
class PyAudioCapture(BaseCapture):
|
| 10 |
"""
|
| 11 |
Standard audio capture strategy using PyAudio.
|
| 12 |
"""
|
| 13 |
|
| 14 |
+
def __init__(self, audio_frames_queue: Queue, **kwargs):
|
| 15 |
super().__init__(audio_frames_queue=audio_frames_queue, **kwargs)
|
| 16 |
|
| 17 |
def _init_pyaudio(self):
|
| 18 |
"""Initialize PyAudio and return the instance and configuration."""
|
| 19 |
p = pyaudio.PyAudio()
|
| 20 |
+
chunk = 1024
|
| 21 |
+
sample_rate = 16000
|
| 22 |
+
return p, chunk, sample_rate
|
| 23 |
|
| 24 |
+
def _open_stream(self, p, chunk, sample_rate):
|
| 25 |
"""Open the PyAudio audio stream."""
|
| 26 |
return p.open(
|
| 27 |
format=pyaudio.paInt16,
|
| 28 |
+
channels=1,
|
| 29 |
rate=sample_rate,
|
| 30 |
input=True,
|
|
|
|
| 31 |
frames_per_buffer=chunk,
|
| 32 |
)
|
| 33 |
|
| 34 |
+
def _capture_loop(self, stream, chunk):
|
| 35 |
"""Main loop of PyAudio audio capture."""
|
| 36 |
logger.info("使用 PyAudio 开始音频采集...")
|
| 37 |
self.is_ready = True
|
| 38 |
|
| 39 |
while not self.is_exited:
|
| 40 |
data = stream.read(chunk, exception_on_overflow=False)
|
| 41 |
if data is None:
|
|
|
|
| 44 |
if self.is_paused:
|
| 45 |
continue
|
| 46 |
|
| 47 |
self.audio_frames_queue.put(data)
|
| 48 |
|
| 49 |
def _cleanup(self, stream, p):
|
|
|
|
| 57 |
"""
|
| 58 |
Thread main loop; runs PyAudio audio capture.
|
| 59 |
"""
|
| 60 |
+
p, chunk, sample_rate = self._init_pyaudio()
|
| 61 |
stream = None
|
| 62 |
try:
|
| 63 |
+
stream = self._open_stream(p, chunk, sample_rate)
|
| 64 |
+
self._capture_loop(stream, chunk)
|
| 65 |
except Exception as e:
|
| 66 |
logger.error(f'PyAudio 音频捕获器运行时发生错误: {e}')
|
| 67 |
finally:
|
src/voice_dialogue/audio/devices.py
DELETED
|
@@ -1,167 +0,0 @@
|
|
| 1 |
-
"""
|
| 2 |
-
Audio device enumeration utilities.
|
| 3 |
-
|
| 4 |
-
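Lists the input and output devices available on the system (including external microphone arrays and external speakers)
|
| 5 |
-
so that the CLI, the API, and the frontend can offer device selection.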
|
| 6 |
-
"""
|
| 7 |
-
from typing import List, Optional, TypedDict
|
| 8 |
-
|
| 9 |
-
import pyaudio
|
| 10 |
-
|
| 11 |
-
from voice_dialogue.utils.logger import logger
|
| 12 |
-
|
| 13 |
-
|
| 14 |
-
class InputDeviceInfo(TypedDict):
|
| 15 |
-
"""Input device information."""
|
| 16 |
-
index: int
|
| 17 |
-
name: str
|
| 18 |
-
max_input_channels: int
|
| 19 |
-
default_sample_rate: int
|
| 20 |
-
is_default: bool
|
| 21 |
-
|
| 22 |
-
|
| 23 |
-
class OutputDeviceInfo(TypedDict):
|
| 24 |
-
"""Output device information."""
|
| 25 |
-
index: int
|
| 26 |
-
name: str
|
| 27 |
-
max_output_channels: int
|
| 28 |
-
default_sample_rate: int
|
| 29 |
-
is_default: bool
|
| 30 |
-
|
| 31 |
-
|
| 32 |
-
def _get_default_input_index(p: pyaudio.PyAudio) -> Optional[int]:
|
| 33 |
-
"""Get the system default input device index; return None on failure."""
|
| 34 |
-
try:
|
| 35 |
-
return int(p.get_default_input_device_info().get("index"))
|
| 36 |
-
except Exception:
|
| 37 |
-
return None
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
def list_input_devices() -> List[InputDeviceInfo]:
|
| 41 |
-
"""
|
| 42 |
-
List all available audio input devices.
|
| 43 |
-
|
| 44 |
-
Returns:
|
| 45 |
-
List[InputDeviceInfo]: Input devices (only devices with maxInputChannels > 0).
|
| 46 |
-
"""
|
| 47 |
-
devices: List[InputDeviceInfo] = []
|
| 48 |
-
p = pyaudio.PyAudio()
|
| 49 |
-
try:
|
| 50 |
-
default_index = _get_default_input_index(p)
|
| 51 |
-
for i in range(p.get_device_count()):
|
| 52 |
-
try:
|
| 53 |
-
info = p.get_device_info_by_index(i)
|
| 54 |
-
except Exception as e:
|
| 55 |
-
logger.warning(f"读取音频设备 {i} 信息失败: {e}")
|
| 56 |
-
continue
|
| 57 |
-
|
| 58 |
-
max_input_channels = int(info.get("maxInputChannels", 0))
|
| 59 |
-
if max_input_channels <= 0:
|
| 60 |
-
continue
|
| 61 |
-
|
| 62 |
-
devices.append(
|
| 63 |
-
InputDeviceInfo(
|
| 64 |
-
index=int(info.get("index", i)),
|
| 65 |
-
name=str(info.get("name", f"device-{i}")),
|
| 66 |
-
max_input_channels=max_input_channels,
|
| 67 |
-
default_sample_rate=int(info.get("defaultSampleRate", 16000)),
|
| 68 |
-
is_default=(int(info.get("index", i)) == default_index),
|
| 69 |
-
)
|
| 70 |
-
)
|
| 71 |
-
finally:
|
| 72 |
-
p.terminate()
|
| 73 |
-
|
| 74 |
-
return devices
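The filtering step inside the removed `list_input_devices` can be tried without audio hardware by feeding it plain dictionaries. In the sketch below, `select_input_devices` and `SAMPLE_INFOS` are hypothetical names and made-up data; only the dictionary keys (`index`, `name`, `maxInputChannels`, `defaultSampleRate`) follow what the removed code reads from PyAudio's device info.

```python
from typing import Any, Dict, Iterable, List, Optional


def select_input_devices(
    infos: Iterable[Dict[str, Any]],
    default_index: Optional[int],
) -> List[Dict[str, Any]]:
    """Keep only devices that can record, normalizing the fields of interest."""
    devices: List[Dict[str, Any]] = []
    for position, info in enumerate(infos):
        max_input_channels = int(info.get("maxInputChannels", 0))
        if max_input_channels <= 0:
            continue  # output-only device
        index = int(info.get("index", position))
        devices.append({
            "index": index,
            "name": str(info.get("name", f"device-{position}")),
            "max_input_channels": max_input_channels,
            "default_sample_rate": int(info.get("defaultSampleRate", 16000)),
            "is_default": index == default_index,
        })
    return devices


# Made-up device table for illustration only.
SAMPLE_INFOS = [
    {"index": 0, "name": "Built-in Microphone", "maxInputChannels": 1, "defaultSampleRate": 48000.0},
    {"index": 1, "name": "Built-in Output", "maxInputChannels": 0, "defaultSampleRate": 48000.0},
    {"index": 2, "name": "USB Mic Array", "maxInputChannels": 4, "defaultSampleRate": 16000.0},
]
```

Calling `select_input_devices(SAMPLE_INFOS, 0)` drops the output-only entry and marks device 0 as the default.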
|
| 75 |
-
|
| 76 |
-
|
| 77 |
-
def get_default_input_device_index() -> Optional[int]:
|
| 78 |
-
"""Get the system default input device index."""
|
| 79 |
-
p = pyaudio.PyAudio()
|
| 80 |
-
try:
|
| 81 |
-
return _get_default_input_index(p)
|
| 82 |
-
finally:
|
| 83 |
-
p.terminate()
|
| 84 |
-
|
| 85 |
-
|
| 86 |
-
def is_valid_input_device(index: Optional[int]) -> bool:
|
| 87 |
-
"""
|
| 88 |
-
Check whether the given index is a valid input device.
|
| 89 |
-
|
| 90 |
-
Args:
|
| 91 |
-
index: Device index; None means the system default device and is treated as valid.
|
| 92 |
-
|
| 93 |
-
Returns:
|
| 94 |
-
bool: Whether the index is valid.
|
| 95 |
-
"""
|
| 96 |
-
if index is None:
|
| 97 |
-
return True
|
| 98 |
-
return any(d["index"] == index for d in list_input_devices())
|
| 99 |
-
|
| 100 |
-
|
| 101 |
-
def _get_default_output_index(p: pyaudio.PyAudio) -> Optional[int]:
|
| 102 |
-
"""Get the system default output device index; return None on failure."""
|
| 103 |
-
try:
|
| 104 |
-
return int(p.get_default_output_device_info().get("index"))
|
| 105 |
-
except Exception:
|
| 106 |
-
return None
|
| 107 |
-
|
| 108 |
-
|
| 109 |
-
def list_output_devices() -> List[OutputDeviceInfo]:
|
| 110 |
-
"""
|
| 111 |
-
List all available audio output devices.
|
| 112 |
-
|
| 113 |
-
Returns:
|
| 114 |
-
List[OutputDeviceInfo]: Output devices (only devices with maxOutputChannels > 0).
|
| 115 |
-
"""
|
| 116 |
-
devices: List[OutputDeviceInfo] = []
|
| 117 |
-
p = pyaudio.PyAudio()
|
| 118 |
-
try:
|
| 119 |
-
default_index = _get_default_output_index(p)
|
| 120 |
-
for i in range(p.get_device_count()):
|
| 121 |
-
try:
|
| 122 |
-
info = p.get_device_info_by_index(i)
|
| 123 |
-
except Exception as e:
|
| 124 |
-
logger.warning(f"读取音频设备 {i} 信息失败: {e}")
|
| 125 |
-
continue
|
| 126 |
-
|
| 127 |
-
max_output_channels = int(info.get("maxOutputChannels", 0))
|
| 128 |
-
if max_output_channels <= 0:
|
| 129 |
-
continue
|
| 130 |
-
|
| 131 |
-
devices.append(
|
| 132 |
-
OutputDeviceInfo(
|
| 133 |
-
index=int(info.get("index", i)),
|
| 134 |
-
name=str(info.get("name", f"device-{i}")),
|
| 135 |
-
max_output_channels=max_output_channels,
|
| 136 |
-
default_sample_rate=int(info.get("defaultSampleRate", 48000)),
|
| 137 |
-
is_default=(int(info.get("index", i)) == default_index),
|
| 138 |
-
)
|
| 139 |
-
)
|
| 140 |
-
finally:
|
| 141 |
-
p.terminate()
|
| 142 |
-
|
| 143 |
-
return devices
|
| 144 |
-
|
| 145 |
-
|
| 146 |
-
def get_default_output_device_index() -> Optional[int]:
|
| 147 |
-
"""Get the system default output device index."""
|
| 148 |
-
p = pyaudio.PyAudio()
|
| 149 |
-
try:
|
| 150 |
-
return _get_default_output_index(p)
|
| 151 |
-
finally:
|
| 152 |
-
p.terminate()
|
| 153 |
-
|
| 154 |
-
|
| 155 |
-
def is_valid_output_device(index: Optional[int]) -> bool:
|
| 156 |
-
"""
|
| 157 |
-
Check whether the given index is a valid output device.
|
| 158 |
-
|
| 159 |
-
Args:
|
| 160 |
-
index: Device index; None means the system default device and is treated as valid.
|
| 161 |
-
|
| 162 |
-
Returns:
|
| 163 |
-
bool: Whether the index is valid.
|
| 164 |
-
"""
|
| 165 |
-
if index is None:
|
| 166 |
-
return True
|
| 167 |
-
return any(d["index"] == index for d in list_output_devices())
|
src/voice_dialogue/audio/player.py
CHANGED
|
@@ -1,78 +1,10 @@
|
|
| 1 |
import tempfile
|
| 2 |
-
from typing import Optional
|
| 3 |
|
| 4 |
-
import numpy as np
|
| 5 |
import soundfile as sf
|
| 6 |
from playsound import playsound
|
| 7 |
|
| 8 |
-
from voice_dialogue.utils.logger import logger
|
| 9 |
-
|
| 10 |
-
|
| 11 |
-
def _to_int16(audio_data) -> np.ndarray:
|
| 12 |
-
"""Normalize audio data to one-dimensional int16."""
|
| 13 |
-
audio = np.asarray(audio_data)
|
| 14 |
-
if audio.ndim > 1:
|
| 15 |
-
audio = audio.mean(axis=-1)
|
| 16 |
-
if audio.dtype != np.int16:
|
| 17 |
-
audio = np.clip(audio, -1.0, 1.0)
|
| 18 |
-
audio = (audio * 32767.0).astype(np.int16)
|
| 19 |
-
return audio
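As read from the diff, the removed `_to_int16` averages channels before checking the dtype, so multi-channel int16 input would appear to become float and then be clipped to [-1, 1] as if it were normalized. The sketch below is one possible variant, not the project's code: it records the dtype first so that integer PCM keeps its scale, and the name `to_int16` is chosen here for illustration.

```python
import numpy as np


def to_int16(audio_data) -> np.ndarray:
    """Return one-dimensional int16 samples from int16 PCM or normalized float audio."""
    audio = np.asarray(audio_data)
    is_int16 = audio.dtype == np.int16  # remember before averaging changes the dtype
    if audio.ndim > 1:
        # Downmix by averaging the last axis (channels).
        audio = audio.astype(np.float32).mean(axis=-1)
    if is_int16:
        # Already in PCM range: only round back to integers.
        return np.round(audio).astype(np.int16)
    # Float input is assumed to be normalized to [-1.0, 1.0].
    audio = np.clip(audio, -1.0, 1.0)
    return (audio * 32767.0).astype(np.int16)
```

Float samples outside [-1.0, 1.0] are clipped before scaling, so a value of 2.0 maps to 32767 rather than overflowing.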
|
| 20 |
-
|
| 21 |
-
|
| 22 |
-
def _play_via_pyaudio(audio_data, sample_rate: int, output_device_index: int):
|
| 23 |
-
"""Play through a PyAudio output stream, with support for a specific output device."""
|
| 24 |
-
import pyaudio
|
| 25 |
-
|
| 26 |
-
audio = _to_int16(audio_data)
|
| 27 |
-
|
| 28 |
-
p = pyaudio.PyAudio()
|
| 29 |
-
try:
|
| 30 |
-
# 设备不支持该采样率时,重采样到设备默认采样率
|
| 31 |
-
try:
|
| 32 |
-
p.is_format_supported(
|
| 33 |
-
rate=sample_rate,
|
| 34 |
-
output_device=output_device_index,
|
| 35 |
-
output_channels=1,
|
| 36 |
-
output_format=pyaudio.paInt16,
|
| 37 |
-
)
|
| 38 |
-
except Exception:
|
| 39 |
-
device_rate = int(p.get_device_info_by_index(output_device_index).get("defaultSampleRate", 48000))
|
| 40 |
-
logger.info(f"输出设备不支持 {sample_rate}Hz,重采样到 {device_rate}Hz")
|
| 41 |
-
import soxr
|
| 42 |
-
audio = soxr.resample(audio, sample_rate, device_rate).astype(np.int16)
|
| 43 |
-
sample_rate = device_rate
|
| 44 |
-
|
| 45 |
-
stream = p.open(
|
| 46 |
-
format=pyaudio.paInt16,
|
| 47 |
-
channels=1,
|
| 48 |
-
rate=sample_rate,
|
| 49 |
-
output=True,
|
| 50 |
-
output_device_index=output_device_index,
|
| 51 |
-
)
|
| 52 |
-
try:
|
| 53 |
-
stream.write(audio.tobytes())
|
| 54 |
-
finally:
|
| 55 |
-
stream.stop_stream()
|
| 56 |
-
stream.close()
|
| 57 |
-
finally:
|
| 58 |
-
p.terminate()
|
| 59 |
-
|
| 60 |
-
|
| 61 |
-
def play_audio(audio_data, sample_rate=16000, output_device_index: Optional[int] = None):
|
| 62 |
-
"""播放音频。
|
| 63 |
-
|
| 64 |
-
Args:
|
| 65 |
-
audio_data: 音频数据
|
| 66 |
-
sample_rate: 采样率
|
| 67 |
-
output_device_index: 输出设备索引;None 表示系统默认设备
|
| 68 |
-
"""
|
| 69 |
-
if output_device_index is not None:
|
| 70 |
-
try:
|
| 71 |
-
_play_via_pyaudio(audio_data, sample_rate, output_device_index)
|
| 72 |
-
return
|
| 73 |
-
except Exception as e:
|
| 74 |
-
logger.warning(f"指定输出设备 {output_device_index} 播放失败,回退到系统默认设备: {e}")
|
| 75 |
|
|
|
|
| 76 |
with tempfile.NamedTemporaryFile('w+b', suffix='.wav') as soundfile:
|
| 77 |
sf.write(soundfile, audio_data, samplerate=sample_rate, subtype='PCM_16', closefd=False)
|
| 78 |
playsound(soundfile.name, block=True)
|
|
|
|
| 1 |
import tempfile
|
|
|
|
| 2 |
|
|
|
|
| 3 |
import soundfile as sf
|
| 4 |
from playsound import playsound
|
| 5 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 6 |
|
| 7 |
+
def play_audio(audio_data, sample_rate=16000):
|
| 8 |
with tempfile.NamedTemporaryFile('w+b', suffix='.wav') as soundfile:
|
| 9 |
sf.write(soundfile, audio_data, samplerate=sample_rate, subtype='PCM_16', closefd=False)
|
| 10 |
playsound(soundfile.name, block=True)
|
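The removed `_to_int16` helper in `player.py` clips float samples to [-1.0, 1.0], scales them by 32767, and averages multi-channel audio down to mono. As a rough illustration of that arithmetic only, here is a NumPy-free sketch; the names `to_int16_pcm` and `downmix_to_mono` are hypothetical and do not appear in the repository.

```python
from array import array


def downmix_to_mono(frames):
    """Average the channels of each frame, e.g. [(left, right), ...] -> [mono, ...]."""
    return [sum(frame) / len(frame) for frame in frames]


def to_int16_pcm(samples):
    """Clip float samples to [-1.0, 1.0] and scale them to signed 16-bit PCM."""
    pcm = array("h")  # "h" is a signed 16-bit integer
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))
        # int() truncates toward zero, as a float -> int16 cast does
        pcm.append(int(clipped * 32767.0))
    return pcm


mono = downmix_to_mono([(1.0, 0.0), (0.5, 0.5)])
pcm = to_int16_pcm(mono + [2.0, -1.0])
```

Clipping before scaling is what keeps out-of-range samples (such as `2.0` above) from wrapping around when they are stored as 16-bit integers.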
src/voice_dialogue/cli/args.py
CHANGED
@@ -74,20 +74,6 @@ def create_argument_parser():
         default=False,
         help='Disable echo cancellation (default: not disabled)'
     )
-    cli_group.add_argument(
-        '--input-device', '-i',
-        type=int,
-        default=None,
-        metavar='INDEX',
-        help='Input device index (e.g. an external microphone array). Multi-channel input is automatically downmixed to mono; '
-             'when set, echo cancellation relies on the device hardware. Use --list-audio-devices to see available indices.'
-    )
-    cli_group.add_argument(
-        '--list-audio-devices',
-        action='store_true',
-        default=False,
-        help='List available audio input devices and their indices, then exit'
-    )
 
     # API server mode arguments
     api_group = parser.add_argument_group('API server mode arguments')
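For readers unfamiliar with how the two removed flags behaved, the following is a minimal standalone sketch using only `argparse`. The flag names, types, and defaults are taken from the removed lines; the parser name, the group title, and the `build_parser` function are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser and group names; only the two flags mirror the diff.
    parser = argparse.ArgumentParser(prog="voice-dialogue-sketch")
    cli_group = parser.add_argument_group("CLI mode arguments")
    cli_group.add_argument(
        "--input-device", "-i",
        type=int,
        default=None,          # None means "use the system default device"
        metavar="INDEX",
        help="Input device index (e.g. an external microphone array).",
    )
    cli_group.add_argument(
        "--list-audio-devices",
        action="store_true",
        default=False,
        help="List available audio input devices and their indices, then exit",
    )
    return parser


args = build_parser().parse_args(["-i", "3"])
print(args.input_device, args.list_audio_devices)
```

With this commit applied, neither flag exists any more, so passing `--input-device` to the real CLI would presumably be rejected by `argparse` as an unrecognized argument.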
src/voice_dialogue/config/audio_config.py
DELETED
@@ -1,77 +0,0 @@
-"""Audio device configuration management module.
-
-Persists the user's selected input device (e.g. an external microphone array) so it is reused automatically after a restart.
-"""
-import json
-from typing import Optional, TypedDict
-
-from voice_dialogue.utils.logger import logger
-from .paths import AUDIO_SETTINGS_PATH
-
-
-class AudioSettings(TypedDict, total=False):
-    """Audio settings."""
-    input_device_index: Optional[int]
-    output_device_index: Optional[int]
-
-
-_audio_settings_cache: Optional[AudioSettings] = None
-
-
-def get_audio_settings() -> AudioSettings:
-    """Load the user's audio settings (with an in-memory cache)."""
-    global _audio_settings_cache
-    if _audio_settings_cache is not None:
-        return _audio_settings_cache
-
-    if not AUDIO_SETTINGS_PATH.exists():
-        _audio_settings_cache = {}
-        return _audio_settings_cache
-
-    try:
-        with open(AUDIO_SETTINGS_PATH, "r", encoding="utf-8") as f:
-            _audio_settings_cache = json.load(f)
-    except (json.JSONDecodeError, IOError) as e:
-        logger.error(f"Failed to load audio settings; using an empty configuration: {e}")
-        _audio_settings_cache = {}
-    return _audio_settings_cache
-
-
-def get_input_device_index() -> Optional[int]:
-    """Return the saved input device index; None (system default device) if not configured."""
-    value = get_audio_settings().get("input_device_index")
-    return int(value) if value is not None else None
-
-
-def _save_audio_setting(key: str, value: Optional[int]) -> bool:
-    """Save a single audio setting and refresh the cache."""
-    global _audio_settings_cache
-    settings = dict(get_audio_settings())
-    settings[key] = value
-    try:
-        if not AUDIO_SETTINGS_PATH.parent.exists():
-            AUDIO_SETTINGS_PATH.parent.mkdir(parents=True, exist_ok=True)
-        with open(AUDIO_SETTINGS_PATH, "w", encoding="utf-8") as f:
-            json.dump(settings, f, ensure_ascii=False, indent=4)
-        _audio_settings_cache = settings  # type: ignore[assignment]
-        logger.info(f"Audio setting saved: {key}={value}")
-        return True
-    except IOError as e:
-        logger.error(f"Failed to save audio settings: {e}")
-        return False
-
-
-def save_input_device_index(input_device_index: Optional[int]) -> bool:
-    """Save the user's selected input device index."""
-    return _save_audio_setting("input_device_index", input_device_index)
-
-
-def get_output_device_index() -> Optional[int]:
-    """Return the saved output device index; None (system default device) if not configured."""
-    value = get_audio_settings().get("output_device_index")
-    return int(value) if value is not None else None
-
-
-def save_output_device_index(output_device_index: Optional[int]) -> bool:
-    """Save the user's selected output device index."""
-    return _save_audio_setting("output_device_index", output_device_index)
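The deleted `audio_config.py` follows a common pattern: read a JSON file once, cache the result in memory, fall back to an empty configuration on a missing or corrupt file, and update the cache only after a successful write. A self-contained sketch of that pattern follows. The `JsonSettingsStore` class is hypothetical; it takes the file path as a parameter instead of relying on the module-level `AUDIO_SETTINGS_PATH` and global cache used in the original.

```python
import json
from pathlib import Path
from typing import Optional


class JsonSettingsStore:
    """Hypothetical path-parameterized variant of the cached JSON settings pattern."""

    def __init__(self, path: Path):
        self._path = path
        self._cache: Optional[dict] = None

    def load(self) -> dict:
        if self._cache is not None:          # serve from the in-memory cache
            return self._cache
        if not self._path.exists():          # nothing saved yet
            self._cache = {}
            return self._cache
        try:
            with open(self._path, "r", encoding="utf-8") as f:
                self._cache = json.load(f)
        except (json.JSONDecodeError, OSError):
            self._cache = {}                 # unreadable file: use empty settings
        return self._cache

    def save(self, key: str, value: Optional[int]) -> bool:
        settings = dict(self.load())         # copy so a failed write leaves the cache intact
        settings[key] = value
        try:
            self._path.parent.mkdir(parents=True, exist_ok=True)
            with open(self._path, "w", encoding="utf-8") as f:
                json.dump(settings, f, ensure_ascii=False, indent=4)
        except OSError:
            return False
        self._cache = settings               # refresh the cache only after a successful write
        return True
```

Passing the path in makes the store easy to exercise against a temporary directory, which the module-level globals in the original make harder.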
src/voice_dialogue/config/paths.py
CHANGED
@@ -46,7 +46,6 @@ APP_DATA_PATH = get_app_data_path()
 if not APP_DATA_PATH.exists():
     APP_DATA_PATH.mkdir(parents=True, exist_ok=True)
 USER_PROMPTS_PATH = APP_DATA_PATH / "user_prompts.json"
-AUDIO_SETTINGS_PATH = APP_DATA_PATH / "audio_settings.json"
 
 
 def load_third_party():
src/voice_dialogue/core/launcher.py
CHANGED
@@ -6,7 +6,7 @@
 
 import time
 
-from voice_dialogue.audio.capture import AudioCapture
+from voice_dialogue.audio.capture import AudioCapture
 from voice_dialogue.config.speaker_config import get_tts_config_by_speaker_name, get_available_speaker_names
 from voice_dialogue.core.constants import (
     audio_frames_queue,
@@ -23,7 +23,6 @@ def launch_system(
         user_language: str,
         speaker: str,
         disable_echo_cancellation: bool = False,
-        input_device_index: int = None,
 ) -> None:
     """
     Launch the complete voice dialogue system
@@ -101,10 +100,7 @@ def launch_system(
     threads.append(audio_player)
 
     # Speech state monitoring
-
-    # When a specified external device goes through PyAudio, software VAD must be enabled.
-    enable_echo_cancellation = not disable_echo_cancellation
-    enable_vad = not resolves_to_native_aec(enable_echo_cancellation, input_device_index)
+    enable_vad = disable_echo_cancellation
     speech_monitor = SpeechStateMonitor(
         audio_frame_queue=audio_frames_queue,
         user_voice_queue=user_voice_queue,
@@ -115,10 +111,10 @@ def launch_system(
     threads.append(speech_monitor)
 
     # Audio capture
+    enable_echo_cancellation = not disable_echo_cancellation
     audio_capture = AudioCapture(
         audio_frames_queue=audio_frames_queue,
-        enable_echo_cancellation=enable_echo_cancellation
-        input_device_index=input_device_index,
+        enable_echo_cancellation=enable_echo_cancellation
     )
     audio_capture.daemon = True
     audio_capture.start()
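On the added side of the `launcher.py` diff, both flags are derived from the single `disable_echo_cancellation` argument: software VAD is enabled exactly when echo cancellation is disabled. A small sketch of that relationship is shown below. The function `resolve_audio_flags` is hypothetical and exists only to make the two assignments easy to inspect side by side.

```python
from typing import Tuple


def resolve_audio_flags(disable_echo_cancellation: bool = False) -> Tuple[bool, bool]:
    """Return (enable_echo_cancellation, enable_vad) as the added lines compute them."""
    enable_vad = disable_echo_cancellation
    enable_echo_cancellation = not disable_echo_cancellation
    return enable_echo_cancellation, enable_vad


for flag in (False, True):
    print(flag, resolve_audio_flags(flag))
```

In this reading the two flags are always complementary, whereas the removed lines also consulted `input_device_index` through `resolves_to_native_aec`.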
src/voice_dialogue/services/asr_service.py
CHANGED
@@ -42,7 +42,7 @@ class ASRService(BaseThread, PerformanceLogMixin):
                 voice_task.whisper_start_time = time.time()
 
                 user_voice: np.array = voice_task.user_voice
-                transcribed_text = self.client.transcribe(user_voice
+                transcribed_text = self.client.transcribe(user_voice)
                 if not transcribed_text.strip():
                     voice_state_manager.reset_task_id()
                     continue
src/voice_dialogue/services/audio_player_service.py
CHANGED
@@ -4,7 +4,6 @@ from queue import Empty
 from typing import Optional
 
 from voice_dialogue.audio.player import play_audio
-from voice_dialogue.config.audio_config import get_output_device_index
 from voice_dialogue.core.base import BaseThread
 from voice_dialogue.core.constants import voice_state_manager, silence_over_threshold_event
 from voice_dialogue.models.voice_task import VoiceTask, AnswerDisplayMessage
@@ -65,8 +64,7 @@ class AudioPlayerService(BaseThread, TaskStatusMixin, HistoryMixin, PerformanceL
 
                 if not self.is_stopped:
                     audio_data, sample_rate = voice_task.tts_generated_sentence_audio
-
-                    play_audio(audio_data, sample_rate, output_device_index=get_output_device_index())
+                    play_audio(audio_data, sample_rate)
 
                 # Task finished; break out of the inner loop
                 break
src/voice_dialogue/tts/runtime/moyoyo.py
CHANGED
@@ -34,9 +34,6 @@ class MoYoYoTTS(TTSInterface):
 
     def setup(self, **kwargs) -> None:
         """Set up the TTS module"""
-        from voice_dialogue.tts.weights_migration import ensure_safetensors_weights
-        ensure_safetensors_weights()
-
         tts_config = TTS_Config(self.config.get_runtime_config())
         self.tts_module = TTSModule(tts_config)
         self.tts_module.setup_inference_params(
src/voice_dialogue/tts/weights_migration.py
DELETED
@@ -1,45 +0,0 @@
-"""Safetensors migration for TTS pretrained weights.
-
-The security policy in transformers >= 4.56 (CVE-2025-32434) refuses to load
-pytorch_model.bin on torch < 2.6. transformers prefers model.safetensors when loading, so
-converting the .bin once on first launch is enough; no torch upgrade is needed.
-"""
-from pathlib import Path
-
-from voice_dialogue.config import paths
-from voice_dialogue.utils.logger import logger
-
-PRETRAINED_DIRS = [
-    "chinese-roberta-wwm-ext-large",
-    "chinese-hubert-base",
-]
-
-
-def ensure_safetensors_weights() -> None:
-    """Ensure safetensors versions of the MoYoYo TTS pretrained weights exist; convert from .bin when missing."""
-    moyoyo_path = Path(paths.TTS_MODELS_PATH) / "moyoyo"
-
-    for dirname in PRETRAINED_DIRS:
-        model_dir = moyoyo_path / dirname
-        bin_path = model_dir / "pytorch_model.bin"
-        st_path = model_dir / "model.safetensors"
-
-        if st_path.exists() or not bin_path.exists():
-            continue
-
-        logger.info(f"[INFO] First launch: converting {dirname} weights to safetensors...")
-        try:
-            import torch
-            from safetensors.torch import save_file
-
-            state_dict = torch.load(bin_path, map_location="cpu", weights_only=True)
-            # clone breaks shared memory; safetensors does not allow tensors to share storage
-            state_dict = {
-                key: value.clone().contiguous()
-                for key, value in state_dict.items()
-                if hasattr(value, "clone")
-            }
-            save_file(state_dict, st_path, metadata={"format": "pt"})
-            logger.info(f"[INFO] {dirname} conversion complete: {st_path.stat().st_size // 1024 ** 2} MB")
-        except Exception as e:
-            logger.error(f"[ERROR] Failed to convert {dirname} weights: {e}")
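The deleted `ensure_safetensors_weights` converts a directory only when `pytorch_model.bin` is present and `model.safetensors` is not, which makes the migration a one-time step. The sketch below isolates just that selection logic with the standard library, leaving out the actual `torch` and `safetensors` conversion. The function name `dirs_needing_conversion` and its parameters are assumptions for illustration.

```python
from pathlib import Path
from typing import Iterable, List


def dirs_needing_conversion(root: Path, dirnames: Iterable[str]) -> List[str]:
    """Return the subdirectories that have a .bin checkpoint but no safetensors file."""
    pending = []
    for dirname in dirnames:
        model_dir = root / dirname
        bin_path = model_dir / "pytorch_model.bin"
        st_path = model_dir / "model.safetensors"
        # Same skip condition as the removed code: already converted, or nothing to convert
        if st_path.exists() or not bin_path.exists():
            continue
        pending.append(dirname)
    return pending
```

Because the check runs before any heavy import, a later launch where every directory already has `model.safetensors` would do no conversion work.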
uv.lock
CHANGED
The diff for this file is too large to render.