g2p-multilingual-byT5-tiny-mlx
MLX-compatible weights for charsiu/g2p_multilingual_byT5_tiny_16_layers_100 — a ByT5-based multilingual grapheme-to-phoneme converter supporting 100 languages.
Converted from PyTorch to SafeTensors format for use with mlx-swift and MLX.
Model Details
| Property | Value |
|---|---|
| Architecture | T5ForConditionalGeneration (ByT5) |
| Parameters | 20.8M |
| Encoder layers | 12 |
| Decoder layers | 4 |
| d_model | 256 |
| d_ff | 1024 |
| Vocab size | 384 (byte-level) |
| Format | SafeTensors (float32) |
| Size | ~83 MB |
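The size figure follows from the parameter count, since each float32 parameter takes 4 bytes. A quick check using only the values in the table above, assuming decimal megabytes and ignoring the small SafeTensors header:

```python
# Rough size check from the table values.
params = 20.8e6          # parameter count from the table
bytes_per_param = 4      # float32 is 4 bytes per value
size_mb = params * bytes_per_param / 1e6  # decimal megabytes
print(f"{size_mb:.1f} MB")
```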
Performance
| Metric | Score |
|---|---|
| PER (Phoneme Error Rate) | 0.096 |
| WER (Word Error Rate) | 0.281 |
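For orientation, phoneme error rate is conventionally the total edit distance between predicted and reference phoneme sequences, divided by the total reference length. The sketch below shows that conventional definition only. It is not taken from the original evaluation code, and `edit_distance` and `phoneme_error_rate` are hypothetical helper names.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of phoneme symbols."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs, hyps):
    """Total edit distance divided by total reference length."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return errors / total


# One substituted phoneme out of four reference phonemes.
refs = [["h", "ɛ", "l", "oʊ"]]
hyps = [["h", "e", "l", "oʊ"]]
per = phoneme_error_rate(refs, hyps)
```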
Usage with CharsiuG2PKit (Swift)
import CharsiuG2PKit
let g2p = try G2P(modelDirectory: modelURL)
let ipa = g2p.convert("hello", language: "eng-us")
// "ˈhɛɫoʊ"
See CharsiuG2PKit for full documentation.
Usage with Python MLX
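import mlx.core as mx
from mlx.utils import tree_unflatten
from safetensors import safe_open

# Read every tensor as a NumPy array and wrap it in an MLX array.
tensors = {}
with safe_open("model.safetensors", framework="numpy") as f:
    for key in f.keys():
        tensors[key] = mx.array(f.get_tensor(key))

# Rebuild the nested parameter tree from the flat, dot-separated keys.
params = tree_unflatten(list(tensors.items()))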
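The loaded weights expect byte-level token ids as input. As a sketch of what that means, the standard ByT5 scheme maps each UTF-8 byte to its value plus 3 (ids 0, 1 and 2 are reserved for padding, end of sequence and unknown) and appends the end-of-sequence id. The `<eng-us>: ` language prefix is assumed from the upstream Charsiu model's input convention and should be verified against that model card. `encode_bytes` and `decode_bytes` are hypothetical helpers, not part of MLX.

```python
PAD_ID, EOS_ID, UNK_ID = 0, 1, 2   # reserved ByT5 special ids
BYTE_OFFSET = 3                    # byte value b maps to token id b + 3


def encode_bytes(word, language):
    """Turn a word into ByT5-style token ids (assumed prefix format)."""
    text = f"<{language}>: {word}"  # assumption: upstream prompt format
    ids = [b + BYTE_OFFSET for b in text.encode("utf-8")]
    return ids + [EOS_ID]


def decode_bytes(ids):
    """Map token ids back to text, skipping special and sentinel ids."""
    raw = bytes(i - BYTE_OFFSET for i in ids
                if BYTE_OFFSET <= i < BYTE_OFFSET + 256)
    return raw.decode("utf-8", errors="ignore")


ids = encode_bytes("hello", "eng-us")
```

The resulting list can be wrapped with `mx.array([ids])` to form a batch of one for the encoder.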
Conversion
Converted from pytorch_model.bin using:
uv run scripts/convert_weights.py --model charsiu/g2p_multilingual_byT5_tiny_16_layers_100
Shared tensor aliases (encoder.embed_tokens.weight, decoder.embed_tokens.weight) are deduplicated — only shared.weight is kept.
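The deduplication step can be pictured as a filter over the flat state dict. This is a minimal sketch of the idea, not the contents of `scripts/convert_weights.py`. The helper name `drop_shared_aliases` is hypothetical, and plain Python values stand in for real tensors.

```python
# Keys that alias shared.weight in the original checkpoint.
ALIASES = ("encoder.embed_tokens.weight", "decoder.embed_tokens.weight")


def drop_shared_aliases(state):
    """Return a copy of `state` without the embedding aliases.

    The aliases are only dropped when shared.weight is present, so the
    embedding matrix is never lost.
    """
    if "shared.weight" not in state:
        return dict(state)
    return {k: v for k, v in state.items() if k not in ALIASES}


# Toy state dict: strings stand in for tensors.
state = {
    "shared.weight": "embedding",
    "encoder.embed_tokens.weight": "embedding",
    "decoder.embed_tokens.weight": "embedding",
    "encoder.final_layer_norm.weight": "norm",
}
deduped = drop_shared_aliases(state)
```

A loader that consumes the deduplicated file has to point both the encoder and decoder embeddings back at `shared.weight`.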
Citation
@inproceedings{zhu2022byt5,
  title={ByT5 model for massively multilingual grapheme-to-phoneme conversion},
  author={Zhu, Jian and Zhang, Cong and Jurgens, David},
  booktitle={Interspeech},
  year={2022}
}
License
MIT