Merge branch 'main' of hf.co:MoYoYoTech/VoiceDialogue
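* 'main' of hf.co:MoYoYoTech/VoiceDialogue:
Update the API documentation
Cache user_prompts to improve performance
Adjust the prompt logic so that displayed prompts do not contain the /no_think directive
Add settings-related API routes
Add import checks for the Chinese and English system prompts in the test file so the configuration loads correctly when they are undefined.
Move where the LLM prompts are stored
Strengthen task validity checks and add logging for easier debugging.
Always log generation info when generating TTS audio.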
- src/voice_dialogue/api/app.py +2 -1
- src/voice_dialogue/api/core/config.py +19 -13
- src/voice_dialogue/api/routes/__init__.py +2 -2
- src/voice_dialogue/api/routes/settings_routes.py +102 -0
- src/voice_dialogue/config/llm_config.py +15 -1
- src/voice_dialogue/config/paths.py +22 -0
- src/voice_dialogue/config/user_config.py +128 -0
- src/voice_dialogue/services/audio/generator.py +3 -3
- src/voice_dialogue/services/audio/player.py +2 -1
- src/voice_dialogue/services/mixins.py +10 -2
- src/voice_dialogue/services/text/generator.py +9 -22
- tests/test_llm_dialogue.py +6 -0
src/voice_dialogue/api/app.py
CHANGED
|
@@ -11,7 +11,7 @@ from .core.config import AppConfig
|
|
| 11 |
from .core.lifespan import lifespan
|
| 12 |
from .middleware.logging import LoggingMiddleware
|
| 13 |
from .middleware.rate_limit import RateLimitMiddleware
|
| 14 |
-
from .routes import tts_routes, asr_routes, system_routes, websocket_routes
|
| 15 |
|
| 16 |
|
| 17 |
def create_app() -> FastAPI:
|
|
@@ -56,6 +56,7 @@ def _register_routes(app: FastAPI):
|
|
| 56 |
v1_router.include_router(tts_routes.router, prefix="/tts", tags=["TTS模型管理"])
|
| 57 |
v1_router.include_router(asr_routes.router, prefix="/asr", tags=["ASR模型管理"])
|
| 58 |
v1_router.include_router(system_routes.router, prefix="/system", tags=["系统管理"])
|
|
|
|
| 59 |
app.include_router(v1_router)
|
| 60 |
|
| 61 |
app.add_websocket_route("/api/v1/ws", websocket_routes.ws)
|
|
|
|
| 11 |
from .core.lifespan import lifespan
|
| 12 |
from .middleware.logging import LoggingMiddleware
|
| 13 |
from .middleware.rate_limit import RateLimitMiddleware
|
| 14 |
+
from .routes import tts_routes, asr_routes, system_routes, websocket_routes, settings_routes
|
| 15 |
|
| 16 |
|
| 17 |
def create_app() -> FastAPI:
|
|
|
|
| 56 |
v1_router.include_router(tts_routes.router, prefix="/tts", tags=["TTS模型管理"])
|
| 57 |
v1_router.include_router(asr_routes.router, prefix="/asr", tags=["ASR模型管理"])
|
| 58 |
v1_router.include_router(system_routes.router, prefix="/system", tags=["系统管理"])
|
| 59 |
+
v1_router.include_router(settings_routes.router, prefix="/settings", tags=["设置管理"])
|
| 60 |
app.include_router(v1_router)
|
| 61 |
|
| 62 |
app.add_websocket_route("/api/v1/ws", websocket_routes.ws)
|
src/voice_dialogue/api/core/config.py
CHANGED
|
@@ -64,13 +64,12 @@ class AppConfig:
|
|
| 64 |
* **动态语言切换**: 运行时创建和切换不同语言的ASR实例
|
| 65 |
|
| 66 |
### 🤖 智能对话
|
| 67 |
-
* **大语言模型集成**: 基于
|
| 68 |
* **上下文理解**: 支持多轮对话和上下文记忆
|
| 69 |
-
* **自定义系统提示**: 可配置AI
|
| 70 |
|
| 71 |
### 🎭 高质量语音合成 (TTS)
|
| 72 |
-
*
|
| 73 |
-
* **英文角色**: 基于Kokoro TTS技术,支持Heart、Bella、Nicole等自然语音
|
| 74 |
* **智能引擎选择**: 根据内容语言自动选择最适合的TTS引擎
|
| 75 |
* **动态角色管理**: 运行时加载、切换和管理语音角色
|
| 76 |
|
|
@@ -79,40 +78,47 @@ class AppConfig:
|
|
| 79 |
* **状态监控**: 实时监控系统和模型状态
|
| 80 |
* **会话管理**: 智能的会话ID管理和消息路由
|
| 81 |
|
| 82 |
-
### 🔧
|
| 83 |
* **服务生命周期**: 完整的系统启动、停止、重启控制
|
| 84 |
* **音频捕获**: 高质量的音频输入处理和回声消除
|
| 85 |
* **状态监控**: 详细的服务状态和性能指标
|
|
|
|
| 86 |
|
| 87 |
## 📋 主要API端点
|
| 88 |
|
| 89 |
-
###
|
| 90 |
* `GET /api/v1/tts/models` - 获取所有可用的TTS模型列表
|
| 91 |
* `POST /api/v1/tts/models/load` - 加载指定的TTS模型
|
| 92 |
* `GET /api/v1/tts/models/{model_id}/status` - 查看模型下载和加载状态
|
| 93 |
* `DELETE /api/v1/tts/models/{model_id}` - 删除已下载的模型
|
| 94 |
|
| 95 |
-
### 语音识别管理
|
| 96 |
* `GET /api/v1/asr/languages` - 获取支持的识别语言列表
|
| 97 |
* `POST /api/v1/asr/instance/create` - 创建指定语言的ASR实例
|
| 98 |
|
| 99 |
-
### 系统控制
|
| 100 |
* `GET /api/v1/system/status` - 获取系统整体状态
|
| 101 |
* `POST /api/v1/system/start` - 启动语音对话系统
|
| 102 |
* `POST /api/v1/system/stop` - 停止语音对话系统
|
| 103 |
* `POST /api/v1/system/restart` - 重启语音对话系统
|
| 104 |
|
| 105 |
-
### 实时通信
|
| 106 |
* `WebSocket /api/v1/ws` - WebSocket连接,接收实时系统消息
|
| 107 |
|
| 108 |
## 🛠️ 技术特性
|
| 109 |
|
| 110 |
* **异步处理**: 基于FastAPI的高性能异步架构
|
| 111 |
* **后台任务**: 模型下载和加载在后台执行,不阻塞API响应
|
| 112 |
-
*
|
| 113 |
-
*
|
| 114 |
-
*
|
| 115 |
-
* **API文档**: 自动生成的交互式API文档
|
| 116 |
|
| 117 |
## 💡 使用场景
|
| 118 |
|
|
|
|
| 64 |
* **动态语言切换**: 运行时创建和切换不同语言的ASR实例
|
| 65 |
|
| 66 |
### 🤖 智能对话
|
| 67 |
+
* **大语言模型集成**: 基于Qwen等先进模型
|
| 68 |
* **上下文理解**: 支持多轮对话和上下文记忆
|
| 69 |
+
* **自定义系统提示**: 可配置AI助手的行为和角色,支持用户自定义
|
| 70 |
|
| 71 |
### 🎭 高质量语音合成 (TTS)
|
| 72 |
+
* **多角色支持**: 集成多种高质量TTS引擎,支持丰富的中英文角色
|
|
|
|
| 73 |
* **智能引擎选择**: 根据内容语言自动选择最适合的TTS引擎
|
| 74 |
* **动态角色管理**: 运行时加载、切换和管理语音角色
|
| 75 |
|
|
|
|
| 78 |
* **状态监控**: 实时监控系统和模型状态
|
| 79 |
* **会话管理**: 智能的会话ID管理和消息路由
|
| 80 |
|
| 81 |
+
### 🔧 系统管理与设置
|
| 82 |
* **服务生命周期**: 完整的系统启动、停止、重启控制
|
| 83 |
* **音频捕获**: 高质量的音频输入处理和回声消除
|
| 84 |
* **状态监控**: 详细的服务状态和性能指标
|
| 85 |
+
* **用户配置**: 支持用户通过API自定义和持久化应用设置
|
| 86 |
|
| 87 |
## 📋 主要API端点
|
| 88 |
|
| 89 |
+
### 设置管理 (Settings)
|
| 90 |
+
* `GET /api/v1/settings/prompts` - 获取当前生效的系统Prompt
|
| 91 |
+
* `POST /api/v1/settings/prompts` - 更新并保存用户自定义的Prompt
|
| 92 |
+
* `DELETE /api/v1/settings/prompts` - 重置Prompt为系统默认值
|
| 93 |
+
* `GET /api/v1/settings/prompts/default` - 获取系统默认的Prompt
|
| 94 |
+
|
| 95 |
+
### TTS模型管理 (TTS)
|
| 96 |
* `GET /api/v1/tts/models` - 获取所有可用的TTS模型列表
|
| 97 |
* `POST /api/v1/tts/models/load` - 加载指定的TTS模型
|
| 98 |
* `GET /api/v1/tts/models/{model_id}/status` - 查看模型下载和加载状态
|
| 99 |
* `DELETE /api/v1/tts/models/{model_id}` - 删除已下载的模型
|
| 100 |
|
| 101 |
+
### 语音识别管理 (ASR)
|
| 102 |
* `GET /api/v1/asr/languages` - 获取支持的识别语言列表
|
| 103 |
* `POST /api/v1/asr/instance/create` - 创建指定语言的ASR实例
|
| 104 |
|
| 105 |
+
### 系统控制 (System)
|
| 106 |
* `GET /api/v1/system/status` - 获取系统整体状态
|
| 107 |
* `POST /api/v1/system/start` - 启动语音对话系统
|
| 108 |
* `POST /api/v1/system/stop` - 停止语音对话系统
|
| 109 |
* `POST /api/v1/system/restart` - 重启语音对话系统
|
| 110 |
|
| 111 |
+
### 实时通信 (WebSocket)
|
| 112 |
* `WebSocket /api/v1/ws` - WebSocket连接,接收实时系统消息
|
| 113 |
|
| 114 |
## 🛠️ 技术特性
|
| 115 |
|
| 116 |
* **异步处理**: 基于FastAPI的高性能异步架构
|
| 117 |
* **后台任务**: 模型下载和加载在后台执行,不阻塞API响应
|
| 118 |
+
* **可配置性**: 支持用户通过API和配置文件自定义核心行为
|
| 119 |
+
* **持久化存储**: 用户设置可被持久化,重启应用后依然生效
|
| 120 |
+
* **内存缓存**: 缓存常用配置,减少磁盘I/O,提升性能
|
| 121 |
+
* **API文档**: 自动生成的交互式API文档(Swagger & ReDoc)
|
| 122 |
|
| 123 |
## 💡 使用场景
|
| 124 |
|
src/voice_dialogue/api/routes/__init__.py
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
-
from . import tts_routes, asr_routes, system_routes, websocket_routes
|
| 2 |
|
| 3 |
-
__all__ = ["tts_routes", "asr_routes", "system_routes", "websocket_routes"]
|
|
|
|
| 1 |
+
from . import tts_routes, asr_routes, system_routes, websocket_routes, settings_routes
|
| 2 |
|
| 3 |
+
__all__ = ["tts_routes", "asr_routes", "system_routes", "websocket_routes", "settings_routes"]
|
src/voice_dialogue/api/routes/settings_routes.py
ADDED
|
@@ -0,0 +1,102 @@
|
|
| 1 |
+
"""Settings-related API routes"""
|
| 2 |
+
from typing import Optional
|
| 3 |
+
|
| 4 |
+
from fastapi import APIRouter, HTTPException
|
| 5 |
+
from pydantic import BaseModel, Field
|
| 6 |
+
|
| 7 |
+
from voice_dialogue.config.llm_config import CHINESE_SYSTEM_PROMPT, ENGLISH_SYSTEM_PROMPT
|
| 8 |
+
from voice_dialogue.config.user_config import (
|
| 9 |
+
get_user_prompts, save_user_prompts, get_raw_prompt, reset_prompts_to_default
|
| 10 |
+
)
|
| 11 |
+
|
| 12 |
+
router = APIRouter()
|
| 13 |
+
|
| 14 |
+
|
| 15 |
+
class PromptsResponse(BaseModel):
|
| 16 |
+
"""Response model for fetching prompts"""
|
| 17 |
+
chinese_prompt: str = Field(..., description="中文系统提示词")
|
| 18 |
+
english_prompt: str = Field(..., description="英文系统提示词")
|
| 19 |
+
is_custom: bool = Field(..., description="是否为用户自定义")
|
| 20 |
+
|
| 21 |
+
|
| 22 |
+
class UpdatePromptsRequest(BaseModel):
|
| 23 |
+
"""Request model for updating prompts"""
|
| 24 |
+
chinese_prompt: Optional[str] = Field(None, description="中文系统提示词")
|
| 25 |
+
english_prompt: Optional[str] = Field(None, description="英文系统提示词")
|
| 26 |
+
|
| 27 |
+
|
| 28 |
+
class DefaultPromptsResponse(BaseModel):
|
| 29 |
+
"""Response model for the default prompts"""
|
| 30 |
+
chinese_prompt: str = Field(..., description="默认中文系统提示词")
|
| 31 |
+
english_prompt: str = Field(..., description="默认英文系统提示词")
|
| 32 |
+
|
| 33 |
+
|
| 34 |
+
@router.get("/settings/prompts", response_model=PromptsResponse, summary="获取当前生效的 Prompt")
|
| 35 |
+
async def get_current_prompts():
|
| 36 |
+
"""
|
| 37 |
+
Get the Chinese and English system prompts currently in use.
|
| 38 |
+
Returns the raw content, without the /no_think directive that the system appends automatically.
|
| 39 |
+
"""
|
| 40 |
+
user_prompts = get_user_prompts()
|
| 41 |
+
is_custom = bool(user_prompts) # treated as custom if any user-defined configuration exists
|
| 42 |
+
|
| 43 |
+
return PromptsResponse(
|
| 44 |
+
chinese_prompt=get_raw_prompt("zh"),
|
| 45 |
+
english_prompt=get_raw_prompt("en"),
|
| 46 |
+
is_custom=is_custom
|
| 47 |
+
)
|
| 48 |
+
|
| 49 |
+
|
| 50 |
+
@router.get("/settings/prompts/default", response_model=DefaultPromptsResponse, summary="获取默认 Prompt")
|
| 51 |
+
async def get_default_prompts():
|
| 52 |
+
"""Get the system default prompts (raw content, without /no_think)"""
|
| 53 |
+
return DefaultPromptsResponse(
|
| 54 |
+
chinese_prompt=CHINESE_SYSTEM_PROMPT,
|
| 55 |
+
english_prompt=ENGLISH_SYSTEM_PROMPT
|
| 56 |
+
)
|
| 57 |
+
|
| 58 |
+
|
| 59 |
+
@router.post("/settings/prompts", summary="更新并保存用户的 Prompt 设置")
|
| 60 |
+
async def update_user_prompts(request: UpdatePromptsRequest):
|
| 61 |
+
"""
|
| 62 |
+
Update the user-defined prompts.
|
| 63 |
+
Only the fields provided in the request body are updated; fields not provided stay unchanged.
|
| 64 |
+
"""
|
| 65 |
+
try:
|
| 66 |
+
# Get the current user configuration
|
| 67 |
+
current_prompts = get_user_prompts()
|
| 68 |
+
|
| 69 |
+
# Build the update data
|
| 70 |
+
update_data = request.model_dump(exclude_unset=True)
|
| 71 |
+
|
| 72 |
+
if not update_data:
|
| 73 |
+
raise HTTPException(status_code=400, detail="请求体不能为空")
|
| 74 |
+
|
| 75 |
+
# Apply the update
|
| 76 |
+
current_prompts.update(update_data)
|
| 77 |
+
|
| 78 |
+
# Save the configuration
|
| 79 |
+
if not save_user_prompts(current_prompts):
|
| 80 |
+
raise HTTPException(status_code=500, detail="保存配置失败")
|
| 81 |
+
|
| 82 |
+
return {"message": "用户 Prompt 更新成功", "updated_fields": list(update_data.keys())}
|
| 83 |
+
|
| 84 |
+
except HTTPException:
|
| 85 |
+
raise
|
| 86 |
+
except Exception as e:
|
| 87 |
+
raise HTTPException(status_code=500, detail=f"更新 Prompt 失败: {str(e)}")
|
| 88 |
+
|
| 89 |
+
|
| 90 |
+
@router.delete("/settings/prompts", summary="重置 Prompt 为默认值")
|
| 91 |
+
async def reset_prompts():
|
| 92 |
+
"""Reset the user-defined prompts to the system defaults"""
|
| 93 |
+
try:
|
| 94 |
+
if not reset_prompts_to_default():
|
| 95 |
+
raise HTTPException(status_code=500, detail="重置失败")
|
| 96 |
+
|
| 97 |
+
return {"message": "Prompt 已重置为默认值"}
|
| 98 |
+
|
| 99 |
+
except HTTPException:
|
| 100 |
+
raise
|
| 101 |
+
except Exception as e:
|
| 102 |
+
raise HTTPException(status_code=500, detail=f"重置 Prompt 失败: {str(e)}")
|
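The POST handler above applies only the fields present in the request body on top of the stored prompts. The following is a minimal sketch of that merge, with plain dicts standing in for the Pydantic model. `merge_prompt_update` is a hypothetical helper that is not part of this commit, and two choices here are assumptions of the sketch rather than the commit's behavior: it drops fields explicitly set to `None`, and it copies the stored dict instead of updating it in place.

```python
from typing import Dict, Optional


def merge_prompt_update(
    current: Dict[str, str],
    update: Dict[str, Optional[str]],
) -> Dict[str, str]:
    """Return a new dict with the provided fields applied on top of `current`."""
    # Drop fields explicitly set to None (a defensive choice of this sketch).
    provided = {key: value for key, value in update.items() if value is not None}
    if not provided:
        # Mirrors the handler's "request body must not be empty" check.
        raise ValueError("request body must not be empty")
    merged = dict(current)  # copy, so the caller's dict is untouched on failure
    merged.update(provided)
    return merged


stored = {"chinese_prompt": "你是AI助手。"}
merged = merge_prompt_update(stored, {"english_prompt": "You are an AI assistant."})
print(merged)
```

Copying before merging means a failed save would leave the stored dict as it was. Whether that matters depends on how the caller treats the cached dict.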
src/voice_dialogue/config/llm_config.py
CHANGED
|
@@ -1,9 +1,23 @@
|
|
| 1 |
"""LLM model configuration management"""
|
| 2 |
|
| 3 |
from typing import Dict, Any
|
|
|
|
| 4 |
from voice_dialogue.utils.apple_silicon import get_optimal_llama_cpp_config, get_apple_silicon_info
|
| 5 |
|
| 6 |
-
__all__ = ('get_llm_model_params', 'get_apple_silicon_summary')
|
| 7 |
|
| 8 |
|
| 9 |
def get_llm_model_params() -> Dict[str, Any]:
|
|
|
|
| 1 |
"""LLM model configuration management"""
|
| 2 |
|
| 3 |
from typing import Dict, Any
|
| 4 |
+
|
| 5 |
from voice_dialogue.utils.apple_silicon import get_optimal_llama_cpp_config, get_apple_silicon_info
|
| 6 |
|
| 7 |
+
__all__ = ('get_llm_model_params', 'get_apple_silicon_summary', 'CHINESE_SYSTEM_PROMPT', 'ENGLISH_SYSTEM_PROMPT')
|
| 8 |
+
|
| 9 |
+
CHINESE_SYSTEM_PROMPT = (
|
| 10 |
+
"你是AI助手。请以自然流畅的中文口语化表达直接回答问题,避免冗余的思考过程。"
|
| 11 |
+
"你的回答第一句话必须少于十个字。每段回答控制在二到三句话,既不要过短也不要过长,以适应对话语境。"
|
| 12 |
+
"回答应准确、精炼且有依据。"
|
| 13 |
+
)
|
| 14 |
+
|
| 15 |
+
ENGLISH_SYSTEM_PROMPT = (
|
| 16 |
+
"You are an AI assistant. "
|
| 17 |
+
"Please answer directly and naturally, using conversational English, without showing your thinking process. "
|
| 18 |
+
"Your first sentence must be less than 10 words. "
|
| 19 |
+
"Your responses should be accurate, concise, and well-supported, ideally around 2-3 sentences long to ensure a good conversational flow."
|
| 20 |
+
)
|
| 21 |
|
| 22 |
|
| 23 |
def get_llm_model_params() -> Dict[str, Any]:
|
src/voice_dialogue/config/paths.py
CHANGED
|
@@ -1,3 +1,4 @@
|
|
|
|
|
| 1 |
import sys
|
| 2 |
from pathlib import Path
|
| 3 |
|
|
@@ -26,6 +27,27 @@ AUDIO_RESOURCES_PATH = ASSETS_PATH / "audio"
|
|
| 26 |
FRONTEND_ASSETS_PATH = ASSETS_PATH / "www"
|
| 27 |
|
| 28 |
|
| 29 |
def load_third_party():
|
| 30 |
# Add third-party libraries to the Python path
|
| 31 |
if THIRD_PARTY_PATH.exists() and str(THIRD_PARTY_PATH) not in sys.path:
|
|
|
|
| 1 |
+
import os
|
| 2 |
import sys
|
| 3 |
from pathlib import Path
|
| 4 |
|
|
|
|
| 27 |
FRONTEND_ASSETS_PATH = ASSETS_PATH / "www"
|
| 28 |
|
| 29 |
|
| 30 |
+
# User data path: choose a suitable directory for the operating system
|
| 31 |
+
def get_app_data_path() -> Path:
|
| 32 |
+
"""Get the application data storage path"""
|
| 33 |
+
app_name = "Voice Dialogue"
|
| 34 |
+
|
| 35 |
+
if sys.platform == "darwin": # macOS
|
| 36 |
+
base_path = Path.home() / "Library" / "Application Support"
|
| 37 |
+
elif sys.platform == "win32": # Windows
|
| 38 |
+
base_path = Path(os.environ.get("APPDATA", Path.home() / "AppData" / "Roaming"))
|
| 39 |
+
else: # Linux and others
|
| 40 |
+
base_path = Path.home() / ".config"
|
| 41 |
+
|
| 42 |
+
return base_path / app_name
|
| 43 |
+
|
| 44 |
+
|
| 45 |
+
APP_DATA_PATH = get_app_data_path()
|
| 46 |
+
if not APP_DATA_PATH.exists():
|
| 47 |
+
APP_DATA_PATH.mkdir(parents=True, exist_ok=True)
|
| 48 |
+
USER_PROMPTS_PATH = APP_DATA_PATH / "user_prompts.json"
|
| 49 |
+
|
| 50 |
+
|
| 51 |
def load_third_party():
|
| 52 |
# Add third-party libraries to the Python path
|
| 53 |
if THIRD_PARTY_PATH.exists() and str(THIRD_PARTY_PATH) not in sys.path:
|
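The platform branching in `get_app_data_path` can be exercised without touching the real environment if the platform string and home directory are passed in. The sketch below is one way to do that. `app_data_dir` and its parameters are hypothetical names, not part of this commit, and the directory layout simply follows the diff above.

```python
from pathlib import Path
from typing import Optional

APP_NAME = "Voice Dialogue"


def app_data_dir(platform: str, home: Path, appdata: Optional[str] = None) -> Path:
    """Resolve the per-user data directory for the given platform string."""
    if platform == "darwin":  # macOS
        base = home / "Library" / "Application Support"
    elif platform == "win32":  # Windows: prefer %APPDATA% when it is set
        base = Path(appdata) if appdata else home / "AppData" / "Roaming"
    else:  # Linux and others
        base = home / ".config"
    return base / APP_NAME


print(app_data_dir("darwin", Path("/Users/me")))
print(app_data_dir("linux", Path("/home/me")))
```

In real use the arguments would presumably come from `sys.platform`, `Path.home()`, and `os.environ.get("APPDATA")`.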
src/voice_dialogue/config/user_config.py
ADDED
|
@@ -0,0 +1,128 @@
|
|
| 1 |
+
"""User configuration management module"""
|
| 2 |
+
import json
|
| 3 |
+
from typing import Dict, Optional
|
| 4 |
+
|
| 5 |
+
from .llm_config import CHINESE_SYSTEM_PROMPT, ENGLISH_SYSTEM_PROMPT
|
| 6 |
+
from .paths import USER_PROMPTS_PATH
|
| 7 |
+
from ..utils.logger import logger
|
| 8 |
+
|
| 9 |
+
# In-memory cache to avoid re-reading the file
|
| 10 |
+
_user_prompts_cache: Optional[Dict[str, str]] = None
|
| 11 |
+
|
| 12 |
+
|
| 13 |
+
def get_user_prompts() -> Dict[str, str]:
|
| 14 |
+
"""
|
| 15 |
+
Load the user-defined prompts.
|
| 16 |
+
|
| 17 |
+
Returns:
|
| 18 |
+
Dict[str, str]: the user-defined prompts.
|
| 19 |
+
"""
|
| 20 |
+
global _user_prompts_cache
|
| 21 |
+
if _user_prompts_cache is not None:
|
| 22 |
+
return _user_prompts_cache
|
| 23 |
+
|
| 24 |
+
if not USER_PROMPTS_PATH.exists():
|
| 25 |
+
logger.info(f"用户配置文件不存在,使用空配置: {USER_PROMPTS_PATH}")
|
| 26 |
+
_user_prompts_cache = {}
|
| 27 |
+
return _user_prompts_cache
|
| 28 |
+
|
| 29 |
+
try:
|
| 30 |
+
with open(USER_PROMPTS_PATH, 'r', encoding='utf-8') as f:
|
| 31 |
+
user_prompts = json.load(f)
|
| 32 |
+
logger.info("成功从文件加载用户自定义 prompts 到缓存")
|
| 33 |
+
_user_prompts_cache = user_prompts
|
| 34 |
+
return _user_prompts_cache
|
| 35 |
+
except (json.JSONDecodeError, IOError) as e:
|
| 36 |
+
logger.error(f"无法加载用户 prompt 配置文件,使用空配置: {e}")
|
| 37 |
+
_user_prompts_cache = {}
|
| 38 |
+
return _user_prompts_cache
|
| 39 |
+
|
| 40 |
+
|
| 41 |
+
def save_user_prompts(prompts: Dict[str, str]) -> bool:
|
| 42 |
+
"""
|
| 43 |
+
Save the user-defined prompts to a JSON file and update the cache.
|
| 44 |
+
|
| 45 |
+
Args:
|
| 46 |
+
prompts: the prompts dict to save
|
| 47 |
+
|
| 48 |
+
Returns:
|
| 49 |
+
bool: whether the save succeeded
|
| 50 |
+
"""
|
| 51 |
+
global _user_prompts_cache
|
| 52 |
+
try:
|
| 53 |
+
# Make sure the directory exists
|
| 54 |
+
if not USER_PROMPTS_PATH.parent.exists():
|
| 55 |
+
USER_PROMPTS_PATH.parent.mkdir(parents=True, exist_ok=True)
|
| 56 |
+
|
| 57 |
+
with open(USER_PROMPTS_PATH, 'w', encoding='utf-8') as f:
|
| 58 |
+
json.dump(prompts, f, ensure_ascii=False, indent=4)
|
| 59 |
+
logger.info(f"用户 prompts 已保存到: {USER_PROMPTS_PATH}")
|
| 60 |
+
_user_prompts_cache = prompts # update the cache
|
| 61 |
+
return True
|
| 62 |
+
except IOError as e:
|
| 63 |
+
logger.error(f"无法保存用户 prompt 配置文件: {e}")
|
| 64 |
+
return False
|
| 65 |
+
|
| 66 |
+
|
| 67 |
+
def get_prompt(language: str) -> str:
|
| 68 |
+
"""
|
| 69 |
+
Get the prompt for the given language and automatically append the /no_think directive.
|
| 70 |
+
The user configuration takes priority; if nothing is configured, the default is returned.
|
| 71 |
+
|
| 72 |
+
Args:
|
| 73 |
+
language: language code; "zh" means Chinese, anything else means English
|
| 74 |
+
|
| 75 |
+
Returns:
|
| 76 |
+
str: the system prompt for that language (with /no_think appended)
|
| 77 |
+
"""
|
| 78 |
+
user_prompts = get_user_prompts()
|
| 79 |
+
|
| 80 |
+
if language == "zh":
|
| 81 |
+
base_prompt = user_prompts.get("chinese_prompt", CHINESE_SYSTEM_PROMPT)
|
| 82 |
+
else:
|
| 83 |
+
base_prompt = user_prompts.get("english_prompt", ENGLISH_SYSTEM_PROMPT)
|
| 84 |
+
|
| 85 |
+
# Append the /no_think directive dynamically
|
| 86 |
+
# Check whether /no_think is already present to avoid appending it twice
|
| 87 |
+
if "/no_think" not in base_prompt:
|
| 88 |
+
base_prompt = base_prompt.rstrip() + "\n/no_think"
|
| 89 |
+
|
| 90 |
+
return base_prompt
|
| 91 |
+
|
| 92 |
+
|
| 93 |
+
def get_raw_prompt(language: str) -> str:
|
| 94 |
+
"""
|
| 95 |
+
Get the raw prompt for the given language (without the /no_think directive).
|
| 96 |
+
Used by the API to return prompts for display in the frontend.
|
| 97 |
+
|
| 98 |
+
Args:
|
| 99 |
+
language: language code; "zh" means Chinese, anything else means English
|
| 100 |
+
|
| 101 |
+
Returns:
|
| 102 |
+
str: the raw system prompt for that language
|
| 103 |
+
"""
|
| 104 |
+
user_prompts = get_user_prompts()
|
| 105 |
+
|
| 106 |
+
if language == "zh":
|
| 107 |
+
return user_prompts.get("chinese_prompt", CHINESE_SYSTEM_PROMPT)
|
| 108 |
+
else:
|
| 109 |
+
return user_prompts.get("english_prompt", ENGLISH_SYSTEM_PROMPT)
|
| 110 |
+
|
| 111 |
+
|
| 112 |
+
def reset_prompts_to_default() -> bool:
|
| 113 |
+
"""
|
| 114 |
+
Reset the prompts to their defaults and clear the cache.
|
| 115 |
+
|
| 116 |
+
Returns:
|
| 117 |
+
bool: whether the reset succeeded
|
| 118 |
+
"""
|
| 119 |
+
global _user_prompts_cache
|
| 120 |
+
try:
|
| 121 |
+
if USER_PROMPTS_PATH.exists():
|
| 122 |
+
USER_PROMPTS_PATH.unlink()
|
| 123 |
+
logger.info("用户自定义 prompts 已重置为默认值")
|
| 124 |
+
_user_prompts_cache = {} # reset the cache to an empty dict
|
| 125 |
+
return True
|
| 126 |
+
except IOError as e:
|
| 127 |
+
logger.error(f"重置 prompts 失败: {e}")
|
| 128 |
+
return False
|
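The split between `get_prompt` and `get_raw_prompt` above comes down to one step: the runtime prompt gets `/no_think` appended unless it is already present, while the raw prompt is returned untouched for display. Here is a minimal self-contained sketch of that behavior. The shortened default strings and the names `raw_prompt` and `runtime_prompt` are placeholders for illustration, not the commit's actual values or functions.

```python
from typing import Dict

NO_THINK = "/no_think"

# Shortened placeholder defaults; the real defaults live in llm_config.py.
DEFAULT_ZH = "你是AI助手。"
DEFAULT_EN = "You are an AI assistant."


def raw_prompt(user_prompts: Dict[str, str], language: str) -> str:
    """Return the user's prompt if configured, else the default, unmodified."""
    if language == "zh":
        return user_prompts.get("chinese_prompt", DEFAULT_ZH)
    return user_prompts.get("english_prompt", DEFAULT_EN)


def runtime_prompt(user_prompts: Dict[str, str], language: str) -> str:
    """Return the prompt sent to the model, with /no_think appended once."""
    base = raw_prompt(user_prompts, language)
    if NO_THINK not in base:
        base = base.rstrip() + "\n" + NO_THINK
    return base


print(repr(runtime_prompt({}, "en")))
print(repr(runtime_prompt({"english_prompt": "Be brief. /no_think"}, "en")))
```

Keeping the directive out of the stored text means a user who edits the prompt through the API never sees or has to preserve it.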
src/voice_dialogue/services/audio/generator.py
CHANGED
|
@@ -3,7 +3,7 @@ from multiprocessing import Queue
|
|
| 3 |
from queue import Empty
|
| 4 |
|
| 5 |
from voice_dialogue.core.base import BaseThread
|
| 6 |
-
from voice_dialogue.core.constants import voice_state_manager
|
| 7 |
from voice_dialogue.models.voice_task import VoiceTask
|
| 8 |
from voice_dialogue.services.mixins import TaskStatusMixin
|
| 9 |
from voice_dialogue.services.utils import has_no_words
|
|
@@ -83,14 +83,14 @@ class TTSAudioGenerator(BaseThread, TaskStatusMixin):
|
|
| 83 |
return
|
| 84 |
|
| 85 |
if not self.is_task_valid(voice_task):
|
|
|
|
| 86 |
return
|
| 87 |
|
| 88 |
if has_no_words(voice_task.answer_sentence):
|
| 89 |
logger.info(f"跳过仅包含标点的文本: '{voice_task.answer_sentence}'")
|
| 90 |
return
|
| 91 |
|
| 92 |
-
|
| 93 |
-
logger.info(f"TTS 音频生成: {voice_task.answer_sentence}")
|
| 94 |
|
| 95 |
voice_task.tts_start_time = time.time()
|
| 96 |
try:
|
|
|
|
| 3 |
from queue import Empty
|
| 4 |
|
| 5 |
from voice_dialogue.core.base import BaseThread
|
| 6 |
+
from voice_dialogue.core.constants import voice_state_manager
|
| 7 |
from voice_dialogue.models.voice_task import VoiceTask
|
| 8 |
from voice_dialogue.services.mixins import TaskStatusMixin
|
| 9 |
from voice_dialogue.services.utils import has_no_words
|
|
|
|
| 83 |
return
|
| 84 |
|
| 85 |
if not self.is_task_valid(voice_task):
|
| 86 |
+
logger.info(f"TTS 音频生成: 任务<{voice_task.id}> 无效")
|
| 87 |
return
|
| 88 |
|
| 89 |
if has_no_words(voice_task.answer_sentence):
|
| 90 |
logger.info(f"跳过仅包含标点的文本: '{voice_task.answer_sentence}'")
|
| 91 |
return
|
| 92 |
|
| 93 |
+
logger.info(f"TTS 音频生成: {voice_task.answer_sentence}")
|
|
|
|
| 94 |
|
| 95 |
voice_task.tts_start_time = time.time()
|
| 96 |
try:
|
src/voice_dialogue/services/audio/player.py
CHANGED
|
@@ -9,7 +9,7 @@ from playsound import playsound
|
|
| 9 |
|
| 10 |
from voice_dialogue.core.base import BaseThread
|
| 11 |
from voice_dialogue.core.constants import (
|
| 12 |
-
voice_state_manager, silence_over_threshold_event
|
| 13 |
)
|
| 14 |
from voice_dialogue.models.voice_task import VoiceTask, AnswerDisplayMessage
|
| 15 |
from voice_dialogue.services.mixins import TaskStatusMixin, HistoryMixin, PerformanceLogMixin
|
|
@@ -41,6 +41,7 @@ class AudioStreamPlayer(BaseThread, TaskStatusMixin, HistoryMixin, PerformanceLo
|
|
| 41 |
return # task was interrupted; stop processing
|
| 42 |
|
| 43 |
if not self.is_task_valid(voice_task):
|
|
|
|
| 44 |
return # task is invalid; stop processing
|
| 45 |
|
| 46 |
# Wait for the signal that the user has gone fully silent
|
|
|
|
| 9 |
|
| 10 |
from voice_dialogue.core.base import BaseThread
|
| 11 |
from voice_dialogue.core.constants import (
|
| 12 |
+
voice_state_manager, silence_over_threshold_event
|
| 13 |
)
|
| 14 |
from voice_dialogue.models.voice_task import VoiceTask, AnswerDisplayMessage
|
| 15 |
from voice_dialogue.services.mixins import TaskStatusMixin, HistoryMixin, PerformanceLogMixin
|
|
|
|
| 41 |
return # task was interrupted; stop processing
|
| 42 |
|
| 43 |
if not self.is_task_valid(voice_task):
|
| 44 |
+
logger.info(f"音频播放: 任务<{voice_task.id}> 无效")
|
| 45 |
return # task is invalid; stop processing
|
| 46 |
|
| 47 |
# Wait for the signal that the user has gone fully silent
|
src/voice_dialogue/services/mixins.py
CHANGED
|
@@ -13,16 +13,24 @@ class TaskStatusMixin:
|
|
| 13 |
|
| 14 |
def is_task_interrupted(self, voice_task: VoiceTask) -> bool:
|
| 15 |
"""Check whether the voice task has been interrupted by another task"""
|
| 16 |
-
|
| 17 |
-
|
| 18 |
|
| 19 |
def is_task_valid(self, voice_task: VoiceTask) -> bool:
|
| 20 |
"""Check whether the voice task is valid (session matches, not dropped, etc.)"""
|
| 21 |
if self.is_task_interrupted(voice_task):
|
| 22 |
return False
|
| 23 |
if voice_task.session_id != session_manager.current_id:
|
|
|
|
| 24 |
return False
|
| 25 |
if voice_task.answer_id in dropped_audio_cache:
|
|
|
|
| 26 |
return False
|
| 27 |
return True
|
| 28 |
|
|
|
|
| 13 |
|
| 14 |
def is_task_interrupted(self, voice_task: VoiceTask) -> bool:
|
| 15 |
"""Check whether the voice task has been interrupted by another task"""
|
| 16 |
+
if not voice_state_manager.interrupt_task_id:
|
| 17 |
+
return False
|
| 18 |
+
|
| 19 |
+
if voice_task.id != voice_state_manager.interrupt_task_id:
|
| 20 |
+
logger.info(f"任务<{voice_task.id}> 被任务<{voice_state_manager.interrupt_task_id}> 中断")
|
| 21 |
+
return True
|
| 22 |
+
|
| 23 |
+
return False
|
| 24 |
|
| 25 |
def is_task_valid(self, voice_task: VoiceTask) -> bool:
|
| 26 |
"""Check whether the voice task is valid (session matches, not dropped, etc.)"""
|
| 27 |
if self.is_task_interrupted(voice_task):
|
| 28 |
return False
|
| 29 |
if voice_task.session_id != session_manager.current_id:
|
| 30 |
+
logger.info(f"任务<{voice_task.id}> 会话不匹配: {voice_task.session_id} != {session_manager.current_id}")
|
| 31 |
return False
|
| 32 |
if voice_task.answer_id in dropped_audio_cache:
|
| 33 |
+
logger.info(f"任务<{voice_task.id}> 被丢弃: {voice_task.answer_id}")
|
| 34 |
return False
|
| 35 |
return True
|
| 36 |
|
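The validity checks in `TaskStatusMixin` run in a fixed order: interruption, then session mismatch, then dropped answer. A compact way to see that ordering is a function that returns the first failing reason. This is only a sketch: `Task` and `invalid_reason` are hypothetical stand-ins, and the global managers from the diff are passed in as plain arguments.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Task:
    id: str
    session_id: str
    answer_id: str


def invalid_reason(
    task: Task,
    interrupt_task_id: Optional[str],
    current_session_id: str,
    dropped_answer_ids: Set[str],
) -> Optional[str]:
    """Return why the task should be skipped, or None if it is still valid."""
    # A task is interrupted when an interrupting task is set and it is a different task.
    if interrupt_task_id and task.id != interrupt_task_id:
        return f"interrupted by task <{interrupt_task_id}>"
    if task.session_id != current_session_id:
        return "session mismatch"
    if task.answer_id in dropped_answer_ids:
        return "answer dropped"
    return None


task = Task(id="t1", session_id="s1", answer_id="a1")
print(invalid_reason(task, None, "s1", set()))
print(invalid_reason(task, "t2", "s1", set()))
```

Returning a reason instead of a bare boolean gives callers one place to log from, which is the debugging aid this commit adds through separate `logger.info` calls.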
src/voice_dialogue/services/text/generator.py
CHANGED
|
@@ -7,27 +7,15 @@ from langchain_core.chat_history import InMemoryChatMessageHistory
|
|
| 7 |
|
| 8 |
from voice_dialogue.config import paths
|
| 9 |
from voice_dialogue.config.llm_config import get_llm_model_params, get_apple_silicon_summary
|
|
|
|
| 10 |
from voice_dialogue.core.base import BaseThread
|
| 11 |
from voice_dialogue.core.constants import chat_history_cache
|
| 12 |
from voice_dialogue.models.voice_task import VoiceTask, QuestionDisplayMessage
|
| 13 |
-
from voice_dialogue.services.text.processor import
|
| 14 |
-
|
| 15 |
-
|
| 16 |
-
|
| 17 |
-
CHINESE_SYSTEM_PROMPT = (
|
| 18 |
-
"你是AI助手。请以自然流畅的中文口语化表达直接回答问题,避免冗余的思考过程。"
|
| 19 |
-
"你的回答第一句话必须少于十个字。每段回答控制在二到三句话,既不要过短也不要过长,以适应对话语境。"
|
| 20 |
-
"回答应准确、精炼且有依据。"
|
| 21 |
-
"/no_think"
|
| 22 |
-
)
|
| 23 |
-
|
| 24 |
-
ENGLISH_SYSTEM_PROMPT = (
|
| 25 |
-
"You are an AI assistant. "
|
| 26 |
-
"Please answer directly and naturally, using conversational English, without showing your thinking process. "
|
| 27 |
-
"Your first sentence must be less than 10 words. "
|
| 28 |
-
"Your responses should be accurate, concise, and well-supported, ideally around 2-3 sentences long to ensure a good conversational flow."
|
| 29 |
-
"/no_think"
|
| 30 |
)
|
|
|
|
| 31 |
|
| 32 |
|
| 33 |
class LLMResponseGenerator(BaseThread):
|
|
@@ -51,10 +39,7 @@ class LLMResponseGenerator(BaseThread):
|
|
| 51 |
|
| 52 |
def _get_prompt_by_language(self, language: str) -> str:
|
| 53 |
"""Get the prompt for the given language"""
|
| 54 |
-
|
| 55 |
-
return CHINESE_SYSTEM_PROMPT
|
| 56 |
-
else:
|
| 57 |
-
return ENGLISH_SYSTEM_PROMPT
|
| 58 |
|
| 59 |
def get_session_history(self, session_id: str) -> InMemoryChatMessageHistory:
|
| 60 |
message_history = InMemoryChatMessageHistory()
|
|
@@ -206,7 +191,9 @@ class LLMResponseGenerator(BaseThread):
|
|
| 206 |
self.model_instance = create_langchain_chat_llamacpp_instance(
|
| 207 |
local_model_path=model_path, model_params=model_params
|
| 208 |
)
|
| 209 |
-
|
|
|
|
|
|
|
| 210 |
warmup_langchain_pipeline(pipeline)
|
| 211 |
|
| 212 |
self.is_ready = True
|
|
|
|
| 7 |
|
| 8 |
from voice_dialogue.config import paths
|
| 9 |
from voice_dialogue.config.llm_config import get_llm_model_params, get_apple_silicon_summary
|
| 10 |
+
from voice_dialogue.config.user_config import get_prompt
|
| 11 |
from voice_dialogue.core.base import BaseThread
|
| 12 |
from voice_dialogue.core.constants import chat_history_cache
|
| 13 |
from voice_dialogue.models.voice_task import VoiceTask, QuestionDisplayMessage
|
| 14 |
+
from voice_dialogue.services.text.processor import (
|
| 15 |
+
preprocess_sentence_text, create_langchain_chat_llamacpp_instance,
|
| 16 |
+
create_langchain_pipeline, warmup_langchain_pipeline
|
| 17 |
)
|
| 18 |
+
from voice_dialogue.utils.logger import logger
|
| 19 |
|
| 20 |
|
| 21 |
class LLMResponseGenerator(BaseThread):
|
|
|
|
| 39 |
|
| 40 |
def _get_prompt_by_language(self, language: str) -> str:
|
| 41 |
"""Get the prompt for the given language"""
|
| 42 |
+
return get_prompt(language)
|
|
|
|
|
|
|
|
|
|
| 43 |
|
| 44 |
def get_session_history(self, session_id: str) -> InMemoryChatMessageHistory:
|
| 45 |
message_history = InMemoryChatMessageHistory()
|
|
|
|
| 191 |
self.model_instance = create_langchain_chat_llamacpp_instance(
|
| 192 |
local_model_path=model_path, model_params=model_params
|
| 193 |
)
|
| 194 |
+
# Warm up using the default Chinese prompt
|
| 195 |
+
prompt = get_prompt("zh")
|
| 196 |
+
pipeline = create_langchain_pipeline(self.model_instance, prompt, self.get_session_history)
|
| 197 |
warmup_langchain_pipeline(pipeline)
|
| 198 |
|
| 199 |
self.is_ready = True
|
tests/test_llm_dialogue.py
CHANGED
|
@@ -31,6 +31,12 @@ ENGLISH_SYSTEM_PROMPT = (
|
|
| 31 |
"/no_think"
|
| 32 |
)
|
| 33 |
|
| 34 |
|
| 35 |
class TestLLMDialogue(unittest.TestCase):
|
| 36 |
|
|
|
|
| 31 |
"/no_think"
|
| 32 |
)
|
| 33 |
|
| 34 |
+
if not CHINESE_SYSTEM_PROMPT:
|
| 35 |
+
from voice_dialogue.config.llm_config import CHINESE_SYSTEM_PROMPT
|
| 36 |
+
|
| 37 |
+
if not ENGLISH_SYSTEM_PROMPT:
|
| 38 |
+
from voice_dialogue.config.llm_config import ENGLISH_SYSTEM_PROMPT
|
| 39 |
+
|
| 40 |
|
| 41 |
class TestLLMDialogue(unittest.TestCase):
|
| 42 |
|