Spaces: Akjava / AIGamingVoice-Japanese

	---
	title: AIGamingVoice Japanese
	emoji: 🐠
	colorFrom: gray
	colorTo: purple
	sdk: gradio
	sdk_version: 6.2.0
	app_file: app.py
	pinned: false
	license: mit
	short_description: TTS voice for AI (Currently Matcha-TTS)
	---

	# AIGamingVoice - Japanese

	High-quality, lightweight Japanese text-to-speech (TTS) tuned for AI gaming characters.
	It runs on ONNX Runtime for fast inference.

	## 🌟 Features

	- ⚡ Fast & Lightweight: Pure ONNX Runtime implementation.
	- 🖼️ Visual Speaker Selection: Select speakers intuitively from an image gallery.
	- 🇯🇵 Japanese Optimization: Uses `pyopenjtalk` for accurate Japanese phoneme generation.
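
To make the `pyopenjtalk` step concrete, here is a minimal sketch of a Japanese text frontend. `pyopenjtalk.g2p` is the library's grapheme-to-phoneme call, which returns a space-separated phoneme string. Everything else is an assumption for illustration: the `SYMBOLS` table, the helper names, and the blank-token interleaving (borrowed from the upstream Matcha-TTS recipe) may not match what `app.py` in this Space actually does.

```python
# Hypothetical symbol table: the real one belongs to the trained model
# and is not shown in this README.
SYMBOLS = ["_", "a", "i", "u", "e", "o", "k", "n", "N", "ch", "w", "pau"]
SYMBOL_TO_ID = {symbol: index for index, symbol in enumerate(SYMBOLS)}


def text_to_phonemes(text: str) -> list[str]:
    """Convert Japanese text to a list of phoneme symbols."""
    import pyopenjtalk  # third-party; imported lazily so the helpers below work without it

    return pyopenjtalk.g2p(text).split(" ")


def intersperse(ids: list[int], blank_id: int = 0) -> list[int]:
    """Place a blank token before, between, and after every ID."""
    out = [blank_id] * (2 * len(ids) + 1)
    out[1::2] = ids
    return out


def phonemes_to_ids(phonemes: list[str]) -> list[int]:
    """Map phonemes to integer IDs (assumption: unknown symbols are skipped)."""
    ids = [SYMBOL_TO_ID[p] for p in phonemes if p in SYMBOL_TO_ID]
    return intersperse(ids)
```

With `pyopenjtalk` installed, `phonemes_to_ids(text_to_phonemes("こんにちは"))` would yield the token sequence that is fed to the model under these assumptions.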

	## 🛠️ Installation & Local Usage

	1. Clone the repository
	```bash
	git clone https://huggingface.co/spaces/YOUR_USERNAME/AIGamingVoice-Japanese
	cd AIGamingVoice-Japanese
	```

	2. Install dependencies
	```bash
	pip install -r requirements.txt
	```
	Note: Building `pyopenjtalk` requires `cmake` to be installed.

	3. Prepare models
	Place your `.onnx` models in the `models/` directory.

	4. Prepare speaker images (optional)
	Place images (`0.jpg`, `1.jpg`, ...) in the `imgs/` directory to enable the visual speaker selector.

	5. Run the application
	```bash
	python app.py
	```
	Open http://localhost:7860 in your browser.
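
The directory conventions in steps 3 and 4 can be sketched with the standard library alone. The helper names `list_models` and `map_speaker_images` are hypothetical, and the rule that an image's numeric file name is its speaker ID is an assumption inferred from the `0.jpg`, `1.jpg` naming; the actual lookup in `app.py` may differ.

```python
from pathlib import Path


def list_models(models_dir: str = "models") -> list[Path]:
    """Return every .onnx file in the models directory, sorted by name."""
    root = Path(models_dir)
    if not root.is_dir():
        return []
    return sorted(root.glob("*.onnx"))


def map_speaker_images(imgs_dir: str = "imgs") -> dict[int, Path]:
    """Map speaker IDs to image paths, assuming '<speaker_id>.jpg' file names."""
    root = Path(imgs_dir)
    if not root.is_dir():
        return {}  # no images: the visual selector would simply stay disabled
    images = {}
    for path in root.glob("*.jpg"):
        if path.stem.isdecimal():  # ignore files such as 'cover.jpg'
            images[int(path.stem)] = path
    return dict(sorted(images.items()))
```

Running both helpers from the repository root before starting the app is a quick way to confirm that the files are where the app expects them.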

	## 🎮 How to Use

	1. Select Model: Choose a voice model from the dropdown.
	2. Select Speaker: Click on a character image or enter the Speaker ID.
	3. Input Text: Enter the Japanese text to synthesize.
	4. Adjust Settings: Tweak Temperature (randomness) and Speaking Rate (speed).
	5. Synthesize: Click the button to generate audio.
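
The speaker, text, and settings chosen in the steps above map naturally onto the inputs of an exported model. The sketch below assumes the input names used by the upstream Matcha-TTS ONNX export (`x`, `x_lengths`, `scales`, `spks`), with `scales` holding the temperature followed by the length scale. The models in this Space may use different names or ordering, so check `session.get_inputs()` against your own `.onnx` file. `build_feed` and `synthesize` are hypothetical helper names.

```python
def build_feed(token_ids: list[int], speaker_id: int,
               temperature: float, speaking_rate: float) -> dict[str, list]:
    """Collect the UI settings into a plain-Python input feed."""
    if not token_ids:
        raise ValueError("token_ids must not be empty")
    if temperature < 0 or speaking_rate <= 0:
        raise ValueError("temperature must be >= 0 and speaking_rate > 0")
    return {
        "x": [token_ids],                        # shape (1, T): one utterance
        "x_lengths": [len(token_ids)],           # shape (1,)
        "scales": [temperature, speaking_rate],  # shape (2,), assumed order
        "spks": [speaker_id],                    # shape (1,), multi-speaker models only
    }


def synthesize(model_path: str, feed: dict[str, list]) -> list:
    """Run the model on CPU and return its raw outputs."""
    import numpy as np          # third-party, imported lazily
    import onnxruntime as ort   # third-party, imported lazily

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    dtypes = {"x": np.int64, "x_lengths": np.int64,
              "scales": np.float32, "spks": np.int64}
    # Only pass the inputs this particular model declares.
    wanted = {node.name for node in session.get_inputs()}
    inputs = {name: np.asarray(value, dtype=dtypes[name])
              for name, value in feed.items() if name in wanted}
    return session.run(None, inputs)
```

In upstream Matcha-TTS the second `scales` value stretches the predicted durations, so larger values give slower speech; the README does not say whether the Speaking Rate control in this Space follows the same direction.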

	## 🤝 Credits

	- Matcha-TTS: Architecture based on Matcha-TTS.
	- ONNX Runtime: Inference engine.
	- pyopenjtalk: Japanese text processing frontend.

	---
	Created for the AI Gaming Voice Project.