Gemma Turkish Speech Head (E2B + Mimi)
A Turkish text-to-speech (TTS) adapter for google/gemma-4-E2B-it, paired with the kyutai/mimi neural codec.
Trained on Synthetic_Turkish_TTS_Data (CC BY 4.0).
See: https://github.com/g-hano/gemma-voice
Architecture
- Frozen backbone: Gemma 4 E2B-it (text conditioning)
- Frozen codec: Kyutai Mimi (8 codebooks at a 12.5 Hz frame rate, 24 kHz audio)
- Trainable: learned layer-mix (last 6 Gemma layers) + autoregressive cross-attention speech decoder
- Training steps: 12000
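The codec figures above imply a fixed token budget per second of audio. The following is a minimal sketch of that arithmetic, not code from the repository: `codec_budget` is a hypothetical helper, and it assumes the decoder emits one token per codebook per frame.

```python
import math

# Figures quoted in the Architecture list above.
NUM_CODEBOOKS = 8
FRAME_RATE_HZ = 12.5
SAMPLE_RATE_HZ = 24_000


def codec_budget(duration_s: float) -> dict:
    """Return frame, token, and sample counts for a clip of the given length.

    Assumes one token per codebook per frame (an assumption, not confirmed
    by the model card).
    """
    frames = math.ceil(duration_s * FRAME_RATE_HZ)
    samples_per_frame = int(SAMPLE_RATE_HZ / FRAME_RATE_HZ)  # 1920 samples
    return {
        "frames": frames,
        "tokens": frames * NUM_CODEBOOKS,
        "samples": frames * samples_per_frame,
    }


print(codec_budget(4.0))
# {'frames': 50, 'tokens': 400, 'samples': 96000}
```

Under these assumptions, four seconds of speech corresponds to 50 codec frames, or 400 codebook tokens for the speech decoder to predict.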
Quick start
pip install torch transformers accelerate soundfile huggingface_hub
huggingface-cli login # Gemma 4 is gated — accept license on HF first
git clone https://github.com/g-hano/gemma-voice.git
cd gemma-voice
pip install -e .
cd src
import json
import sys
from pathlib import Path
import soundfile as sf
import torch
from huggingface_hub import snapshot_download
# Download the model repo (weights, config, and the gemma_turkish package under src/).
repo = Path(snapshot_download("Chan-Y/gemma4-turkish-speech-e2b-mimi"))
# Make gemma_turkish importable from the snapshot
# (alternatively, clone the GitHub repo or copy the package next to your script).
sys.path.insert(0, str(repo / "src"))
from gemma_turkish.speech.config import SpeechTrainConfig
from gemma_turkish.speech.model import GemmaSpeechModel
# Build the model from the saved config, then load the trainable weights.
cfg = SpeechTrainConfig.from_dict(json.loads((repo / "config.json").read_text(encoding="utf-8")))
model = GemmaSpeechModel(cfg)
GemmaSpeechModel.load_trainable_checkpoint(model, repo / "speech_head.pt")
model = model.cuda().eval()
text = "Merhaba, bu bir Türkçe ses sentezi denemesidir."
with torch.no_grad():
    wave = model.synthesize(text)
# Move the waveform to the CPU before converting it to NumPy.
sf.write("out.wav", wave.squeeze().detach().cpu().numpy(), cfg.mimi_sample_rate)
Or use the bundled script after downloading the repo:
python inference.py -t "Merhaba dünya."
Files
| File | Description |
|---|---|
| speech_head.pt | Merged trainable weights (layer_mix + speech_head) + embedded config |
| config.json | Full training/inference hyperparameters |
| src/gemma_turkish/ | Model loading & synthesis code |
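Before loading the model, it can help to confirm that a downloaded snapshot actually contains the entries listed in the table above. This is a minimal sketch and not part of the repository: `missing_files` is a hypothetical helper, and the paths are taken directly from the table.

```python
from pathlib import Path

# Entries taken from the Files table above.
EXPECTED = ("speech_head.pt", "config.json", "src/gemma_turkish")


def missing_files(repo_dir) -> list:
    """Return the expected entries that are absent from repo_dir."""
    root = Path(repo_dir)
    return [name for name in EXPECTED if not (root / name).exists()]
```

Calling `missing_files(repo)` on the path returned by `snapshot_download` should give an empty list; any names it returns point to an incomplete download.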
License
- Speech-head weights: same terms as the base Gemma model (see the Google Gemma license).
- Training data: CC BY 4.0 (Synthetic Turkish TTS dataset).
- Mimi codec: Kyutai license.