baicai1145 committed on
Commit b4f54a9 (verified)
1 Parent(s): 88c0745

Create README.md

Files changed (1)
  1. README.md +106 -3
README.md CHANGED
@@ -1,3 +1,106 @@
1
- ---
2
- license: mit
3
- ---
1
+ ---
2
+ license: mit
3
+ base_model:
4
+ - hfl/chinese-bert-wwm
5
+ ---
6
+ # G2PWModel_zh-Hans
7
+
8
+ [中文](#中文) | [English](#english)
9
+
10
+ ---
11
+
12
+ ## 中文
13
+
14
+ ### 简介
15
+
16
+ 这是 G2PW (Grapheme-to-Phoneme for Word) 模型的重新训练版本,专门针对简体中文进行了优化。
17
+
18
+ ### 主要优化
19
+
20
+ - **纯简体中文数据集**:此版本仅使用简体中文数据进行训练,提供更准确的简体中文发音预测
21
+ - **更快的推理速度**:推荐使用 [@baicai-1145/g2pw-torch](https://github.com/baicai-1145/g2pw-torch) 进行推理,速度比原版 ONNX 实现更快
22
+
23
+ ### 模型信息
24
+
25
+ - **模型名称**: G2PWModel_zh-Hans
26
+ - **训练数据**: 简体中文语料
27
+ - **优化目标**: 提高简体中文字音转换准确率
28
+
29
+ ### 使用方法
30
+
31
+ 推荐使用 [g2pw-torch](https://github.com/baicai-1145/g2pw-torch) 进行推理:
32
+
33
+ ```bash
34
+ pip install g2pw-torch
35
+ ```
36
+
37
+ ```python
38
+ from g2pw import G2PWConverter
39
+
40
+ converter = G2PWConverter(model_dir='baicai1145/G2PWModel_zh-Hans')
41
+ result = converter('我今天很开心')
42
+ print(result)
43
+ ```
44
+
45
+ ### 致谢
46
+
47
+ - **数据收集与整理**: [@TheSmallHanCat](https://huggingface.co/TheSmallHanCat)
48
+ - **模型训练**: [@baicai1145](https://huggingface.co/baicai1145)
49
+
50
+ ### 相关项目
51
+
52
+ - [baicai-1145/g2pW](https://github.com/baicai-1145/g2pW) - 原始 G2PW 项目
53
+ - [baicai-1145/g2pw-torch](https://github.com/baicai-1145/g2pw-torch) - PyTorch 推理版本(推荐使用)
54
+
55
+ ### 许可证
56
+
57
+ 本项目遵循原项目的许可证协议。
58
+
59
+ ---
60
+
61
+ ## English
62
+
63
+ ### Introduction
64
+
65
+ This is a retrained version of the G2PW (Grapheme-to-Phoneme for Word) model, specifically optimized for Simplified Chinese.
66
+
67
+ ### Key Improvements
68
+
69
+ - **Simplified Chinese-only dataset**: This version is trained exclusively on Simplified Chinese data, which is intended to give more accurate pronunciation predictions for Simplified Chinese text than the original model
70
+ - **Faster inference**: We recommend [@baicai-1145/g2pw-torch](https://github.com/baicai-1145/g2pw-torch) for inference; it runs faster than the original ONNX implementation
71
+
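Inference speed depends on hardware and input, so it may be worth measuring locally. The following is a minimal sketch of a timing helper, not part of g2pw-torch: `time_converter` and `dummy_converter` are hypothetical names used only for illustration. Any callable that accepts a sentence, such as a `G2PWConverter` instance, could be passed in place of the stand-in.

```python
import time
from typing import Callable, Sequence


def time_converter(
    convert: Callable[[str], object],
    sentences: Sequence[str],
    repeats: int = 3,
) -> float:
    """Return the best wall-clock time (seconds) to convert all sentences once."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for sentence in sentences:
            convert(sentence)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical stand-in so the sketch runs without any model installed.
def dummy_converter(sentence: str) -> list:
    return [[None] * len(sentence)]


elapsed = time_converter(dummy_converter, ["我今天很开心"] * 100)
print(f"best of 3 runs: {elapsed:.6f} s")
```

Taking the best of several runs reduces the effect of one-off costs such as model loading or cache warm-up. Running the same sentences through each backend gives a like-for-like comparison.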
72
+ ### Model Information
73
+
74
+ - **Model Name**: G2PWModel_zh-Hans
75
+ - **Training Data**: Simplified Chinese corpus
76
+ - **Optimization Goal**: Improve accuracy of Simplified Chinese grapheme-to-phoneme conversion
77
+
78
+ ### Usage
79
+
80
+ We recommend using [g2pw-torch](https://github.com/baicai-1145/g2pw-torch) for inference:
81
+
82
+ ```bash
83
+ pip install g2pw-torch
84
+ ```
85
+
86
+ ```python
87
+ from g2pw import G2PWConverter
88
+
89
+ converter = G2PWConverter(model_dir='baicai1145/G2PWModel_zh-Hans')
90
+ result = converter('我今天很开心')
91
+ print(result)
92
+ ```
93
+
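The snippet above prints the raw converter output. As a minimal sketch of how that output might be post-processed, the helper below pairs each character with its predicted reading. It assumes g2pw-torch keeps the return shape of the upstream g2pW converter: one list per input sentence, one entry per character, and `None` for characters without a prediction. `pair_readings` is a hypothetical helper, not part of the package, and the sample readings are hand-written placeholders rather than real model output.

```python
from typing import List, Optional, Tuple


def pair_readings(
    sentence: str, readings: List[Optional[str]]
) -> List[Tuple[str, Optional[str]]]:
    """Pair each character of `sentence` with its predicted reading."""
    if len(sentence) != len(readings):
        raise ValueError(
            f"expected {len(sentence)} readings, got {len(readings)}"
        )
    return list(zip(sentence, readings))


# Hand-written placeholder in the assumed output shape (not real model output).
sentence = "开心AB"
fake_result = [["r1", "r2", None, None]]

pairs = pair_readings(sentence, fake_result[0])
for char, reading in pairs:
    print(char, reading if reading is not None else "(no prediction)")
```

The length check makes a mismatch between the input and the output visible immediately, which helps if the actual return shape differs from the one assumed here.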
94
+ ### Credits
95
+
96
+ - **Data Collection and Organization**: [@TheSmallHanCat](https://huggingface.co/TheSmallHanCat)
97
+ - **Model Training**: [@baicai1145](https://huggingface.co/baicai1145)
98
+
99
+ ### Related Projects
100
+
101
+ - [baicai-1145/g2pW](https://github.com/baicai-1145/g2pW) - Original G2PW project
102
+ - [baicai-1145/g2pw-torch](https://github.com/baicai-1145/g2pw-torch) - PyTorch inference version (recommended)
103
+
104
+ ### License
105
+
106
+ This project follows the same license as the original project.