VoiceDialogue / src

Commit History

Update TTS speaker configuration: replace static mapping with dynamic retrieval, add available speaker listing, and update CLI argument parsing for improved flexibility and maintainability.

cf355e6

liumaolin committed on Jun 5, 2025

Refactor sentence processing in `text_generator.py`: centralize sentence end mark sets, streamline `_should_end_sentence` logic, and eliminate redundant parameter passing for improved clarity and maintainability.

1ae18a4

liumaolin committed on Jun 5, 2025

Refactor imports in `main.py`: reorder modules for better readability and remove redundant `load_third_party` call.

8d91cc1

liumaolin committed on Jun 5, 2025

Add Kokoro TTS support: integrate new TTS model, configuration, and runtime components for enhanced multilingual voice synthesis.

1cbd55c

liumaolin committed on Jun 5, 2025

Refactor ASR routes: replace `fastapi_request.state` with `fastapi_request.app.state` for consistent application-level state management.

d231de5

liumaolin committed on Jun 5, 2025

Enhance TTS model handling: add dynamic status tracking, model downloading, and default system configuration initialization with API updates to manage active and default TTS models effectively.

fb6d02a

liumaolin committed on Jun 5, 2025

Comment out logging statements in `audio_player.py` to disable performance logs and streamline runtime output.

87a7384

liumaolin committed on Jun 5, 2025

Refactor ASR routes and services: implement instance creation tracking with background task support, enhance `get_supported_languages` with current ASR language, and clean up unused schemas and routes for simplified management.

757f3be

liumaolin committed on Jun 5, 2025

Remove `SystemConfig` and `SystemStartRequest` imports and clean up `__all__` in `schemas/__init__.py` for simplified schema management.

51a672c

liumaolin committed on Jun 4, 2025

Remove `SystemConfig` and `SystemStartRequest` models and clean up related API routes and background tasks for simplified system startup and management.

e7ebdb0

liumaolin committed on Jun 4, 2025

Enhance system management and audio capture services: implement `SystemStatusResponse` updates with detailed state tracking, add `audio_capture` service creation and lifecycle management, and refactor API `/system` routes for improved status and control handling.

94c7b78

liumaolin committed on Jun 4, 2025

Extend TTS registry functionality and integrate default system configurations: implement prioritization logic, language preference handling, and fallback mechanisms in `TTSConfigRegistry`; refactor service factory and lifespan management to support dynamic TTS selection and initialization.

a28f7e3

liumaolin committed on Jun 4, 2025

Introduce core module for API lifecycle management: add configuration, service factories, service manager, and lifespan handlers to streamline application startup, shutdown, and service orchestration.

a16e0e5

liumaolin committed on Jun 4, 2025

Add system utilities and initialize core modules: implement `get_system_language` and `get_system_info`, update API startup with system defaults, and integrate ASR, LLM, and speech modules for enhanced functionality.

5c0e715

liumaolin committed on Jun 4, 2025

Refactor core queue initialization: move queue definitions to `constants.py` and clean up redundant imports in `main.py` for better modularity.

bfefeb3

liumaolin committed on Jun 4, 2025

Introduce initial API structure for VoiceDialogue: add dependencies, middleware, and routes for ASR, TTS, system, and voice modules.

8f823b0

liumaolin committed on Jun 4, 2025

Refactor ASR module: introduce modular structure with ASR interface, implement FunASR and Whisper clients, add registry, and consolidate utility functions for enhanced maintainability and extensibility.

59603db

liumaolin committed on Jun 3, 2025

Refactor TTS module: rename `tts_manager` to `manager` for consistency across imports and structure.

89f7f05

liumaolin committed on Jun 3, 2025

Refactor speech processing: add type hint for `_process_active_voice_frame` and replace `max()` with `np.max()` for consistency.

5284873

liumaolin committed on Jun 3, 2025

Refactor TTS audio generation: rename queues for clarity, update `TTSAudioGenerator` initialization, and enhance docstrings for better maintainability.

bba0d84

liumaolin committed on Jun 1, 2025

Refactor TTS architecture: implement runtime interface, TTS manager, universal registry, and factory pattern to support multiple engines.

ef0d09e

liumaolin committed on Jun 1, 2025

Refactor voice model structure: extract MoYoYo-specific configurations and introduce universal TTS registry.

025ca3f

liumaolin committed on Jun 1, 2025

Remove unused `prompt_semantic` and `reference_spec` configuration parameters from voice model definitions.

2a5dcf2

liumaolin committed on May 30, 2025

Add thread readiness checks and is_ready property across services

e80f558

liumaolin committed on May 30, 2025

Using FunASR quantized model.

ac62229

liumaolin committed on May 30, 2025

Add multilingual support and optimize LLM pipeline configuration.

2988b10

liumaolin committed on May 29, 2025

Add descriptions for Chinese voice models.

4643bb2

liumaolin committed on May 29, 2025

Remove unused configuration parameters and conversation templates.

3d953ae

liumaolin committed on May 29, 2025

Integrate FunASR service.

516d7b8

liumaolin committed on May 28, 2025

First commit.

7b64dcd

liumaolin committed on May 28, 2025
