qqc1989 committed on
Commit
fa18cac
·
verified ·
1 Parent(s): 6157858

Update README.md

  1. README.md +167 -3
README.md CHANGED
@@ -1,3 +1,167 @@
1
- ---
2
- license: bsd-3-clause-clear
3
- ---
1
+ ---
2
+ license: bsd-3-clause-clear
3
+ language:
4
+ - en
5
+ - zh
6
+ base_model:
7
+ - myshell-ai/MeloTTS-Chinese
8
+ pipeline_tag: text-to-speech
9
+ ---
10
+
11
+ # melotts.axera
12
+
13
+ - MeloTTS demo on Axera AX650 and AX630C
14
+ - The model is currently split into two parts, an encoder and a decoder. The encoder has not yet been converted to axmodel and currently runs through onnxruntime.
15
+ - GitHub: https://github.com/ml-inory/melotts.axera
16
+
17
+
18
+ ## Supported Platforms
19
+
20
+ - AX650
21
+   - [M4N-Dock (AXera-Pi Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
22
+   - [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)
23
+ - AX630C
24
+   - [AXera-Pi 2](https://axera-pi-2-docs-cn.readthedocs.io/zh-cn/latest/index.html)
25
+   - [Module-LLM](https://docs.m5stack.com/zh_CN/module/Module-LLM)
26
+   - [LLM630 Compute Kit](https://docs.m5stack.com/zh_CN/core/LLM630%20Compute%20Kit)
27
+
28
+ | Chip | Output WAV length | Time cost | RTF |
29
+ |--|--|--|--|
30
+ |AX650| 12s | 1.5s | 0.125 |
31
+ |AX630C| 12s | | |
32
+
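The RTF column is consistent with the usual definition of real-time factor: processing time divided by the duration of the generated audio, so values below 1 mean faster than real time. A minimal sketch of that arithmetic, where `real_time_factor` is a hypothetical helper and not part of the repository:

```python
def real_time_factor(inference_seconds, audio_seconds):
    """Return processing time divided by the duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return inference_seconds / audio_seconds


# AX650 row from the table: 12 s of audio generated in 1.5 s.
print(real_time_factor(1.5, 12.0))  # 0.125
```
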
33
+ ## Requirements
34
+
35
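+ ### Add Chinese input support
36
+
37
+ Run the following commands. Once Chinese input support is set up correctly, log out of the terminal and log back in.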
38
+
39
+ ```
40
+ locale-gen C.utf8
41
+ update-locale LANG=C.utf8
42
+ ```
43
+
44
+ ### Python Requirements
45
+
46
+ #### Requirements
47
+
48
+ ```
49
+ cp -rf nltk_data ~/
50
+ apt-get install libsndfile1-dev libmecab-dev
51
+ mkdir /opt/site-packages
52
+ cd python
53
+ pip3 install -r requirements.txt
54
+ ```
55
+
56
+ #### pyaxengine
57
+
58
+ pyaxengine is the Python API for the NPU. For detailed installation instructions, see:
59
+
60
+ - https://github.com/AXERA-TECH/pyaxengine
61
+
62
+ ## How to use
63
+
64
+ ```
65
+ root@ax650:/mnt/qtang/melotts.axera/python# python3 melotts.py --help
66
+ [INFO] Available providers: ['AxEngineExecutionProvider']
67
+ usage: melotts [-h] [--sentence SENTENCE] [--wav WAV] [--encoder ENCODER] [--decoder DECODER] [--dec_len DEC_LEN] [--sample_rate SAMPLE_RATE] [--speed SPEED]
68
+ [--language {ZH,ZH_MIX_EN,JP,EN,KR,ES,SP,FR}]
69
+
70
+ Run TTS on input sentence
71
+
72
+ options:
73
+ -h, --help show this help message and exit
74
+ --sentence SENTENCE, -s SENTENCE
75
+ --wav WAV, -w WAV
76
+ --encoder ENCODER, -e ENCODER
77
+ --decoder DECODER, -d DECODER
78
+ --dec_len DEC_LEN
79
+ --sample_rate SAMPLE_RATE, -sr SAMPLE_RATE
80
+ --speed SPEED
81
+ --language {ZH,ZH_MIX_EN,JP,EN,KR,ES,SP,FR}, -l {ZH,ZH_MIX_EN,JP,EN,KR,ES,SP,FR}
82
+
83
+ ```
84
+
85
+ Run the command:
86
+
87
+ ```
88
+ python3 melotts.py -s 爱芯元智半导体股份有限公司,致力于打造世界领先的人工智能感知与边缘计算芯片。服务智慧城市、智能驾驶、机器人的海量普惠的应用 \
89
+ -e encoder-onnx/encoder-zh.onnx \
90
+ -d decoder-ax650/decoder-zh.axmodel
91
+ ```
92
+
93
+ ```
94
+ root@ax650:/mnt/qtang/melotts.axera/python# python3 melotts.py \
95
+ --wav output.wav \
96
+ --encoder ../models/encoder-onnx/encoder-zh.onnx \
97
+ --decoder ../models/ax650/decoder-zh.axmodel \
98
+ --language ZH \
99
+ --speed 0.9
100
+
101
+ [INFO] Available providers: ['AxEngineExecutionProvider']
102
+ sentence: 爱芯元智半导体股份有限公司,致力于打造世界领先的人工智能感知与边缘计算芯片。服务智慧城市、智能驾驶、机器人的海量普惠的应用
103
+ sample_rate: 44100
104
+ encoder: ../models/encoder-onnx/encoder-zh.onnx
105
+ decoder: ../models/ax650/decoder-zh.axmodel
106
+ language: ZH_MIX_EN
107
+ > Text split to sentences.
108
+ 爱芯元智半导体股份有限公司,
109
+ 致力于打造世界领先的人工智能感知与边缘计算芯片.
110
+ 服务智慧城市、智能驾驶、机器人的海量普惠的应用
111
+ > ===========================
112
+ split_sentences_into_pieces take 3.1397342681884766ms
113
+ [INFO] Using provider: AxEngineExecutionProvider
114
+ [INFO] Chip type: ChipType.MC50
115
+ [INFO] VNPU type: VNPUType.DISABLED
116
+ [INFO] Engine version: 2.10.1s
117
+ [INFO] Model type: 0 (single core)
118
+ [INFO] Compiler version: 3.3 3251425d
119
+ load models take 7986.6042137146ms
120
+
121
+ Sentence[0]: 爱芯元智半导体股份有限公司,
122
+ Load language module take 33348.33884239197ms
123
+ Building prefix dict from the default dictionary ...
124
+ Loading model from cache /tmp/jieba.cache
125
+ Loading model cost 3.227 seconds.
126
+ Prefix dict has been built successfully.
127
+ encoder run take 89.70ms
128
+ Decode slice[0]: decoder run take 108.08ms
129
+ Decode slice[1]: decoder run take 92.15ms
130
+ Decode slice[2]: decoder run take 92.17ms
131
+
132
+ Sentence[1]: 致力于打造世界领先的人工智能感知与边缘计算芯片.
133
+ Load language module take 0.042438507080078125ms
134
+ encoder run take 122.83ms
135
+ Decode slice[0]: decoder run take 92.24ms
136
+ Decode slice[1]: decoder run take 92.34ms
137
+ Decode slice[2]: decoder run take 92.16ms
138
+ Decode slice[3]: decoder run take 92.16ms
139
+ Decode slice[4]: decoder run take 92.22ms
140
+
141
+ Sentence[2]: 服务智慧城市、智能驾驶、机器人的海量普惠的应用
142
+ Load language module take 0.046253204345703125ms
143
+ encoder run take 112.59ms
144
+ Decode slice[0]: decoder run take 92.26ms
145
+ Decode slice[1]: decoder run take 92.16ms
146
+ Decode slice[2]: decoder run take 92.13ms
147
+ Decode slice[3]: decoder run take 92.13ms
148
+ Decode slice[4]: decoder run take 92.10ms
149
+ Save to output.wav
150
+ root@ax650:/mnt/qtang/melotts.axera/python#
151
+ ```
152
+
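As a rough cross-check of the log above, the per-stage timings can be summed with a short script. This is only a sketch: it assumes that the relevant cost is the sum of the `encoder run take` and `decoder run take` lines (model loading and language-module loading are excluded), and `sum_run_times` is a hypothetical helper, not part of the repository.

```python
import re

# Timing lines copied from the AX650 log above; load and split lines are omitted.
LOG = """\
encoder run take 89.70ms
Decode slice[0]: decoder run take 108.08ms
Decode slice[1]: decoder run take 92.15ms
Decode slice[2]: decoder run take 92.17ms
encoder run take 122.83ms
Decode slice[0]: decoder run take 92.24ms
Decode slice[1]: decoder run take 92.34ms
Decode slice[2]: decoder run take 92.16ms
Decode slice[3]: decoder run take 92.16ms
Decode slice[4]: decoder run take 92.22ms
encoder run take 112.59ms
Decode slice[0]: decoder run take 92.26ms
Decode slice[1]: decoder run take 92.16ms
Decode slice[2]: decoder run take 92.13ms
Decode slice[3]: decoder run take 92.13ms
Decode slice[4]: decoder run take 92.10ms
"""

PATTERN = re.compile(r"(encoder|decoder) run take ([0-9.]+)ms")


def sum_run_times(log_text):
    """Sum encoder and decoder run times (in milliseconds) found in the log text."""
    totals = {"encoder": 0.0, "decoder": 0.0}
    for stage, ms in PATTERN.findall(log_text):
        totals[stage] += float(ms)
    return totals


totals = sum_run_times(LOG)
total_seconds = (totals["encoder"] + totals["decoder"]) / 1000.0
print(f"encoder: {totals['encoder']:.2f} ms")
print(f"decoder: {totals['decoder']:.2f} ms")
print(f"total:   {total_seconds:.2f} s")
```

Under that assumption the total comes to roughly 1.54 s, in line with the 1.5 s listed for the AX650 in the table near the top of the README.
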
153
+ Output audio:
154
+
155
+ https://github.com/user-attachments/assets/eda5c10c-7d30-46e5-a56a-f6edcf7813af
156
+
157
+
158
+ Detailed runtime parameters:
159
+ | Parameter | Description | Default |
160
+ | --- | --- | --- |
161
+ | -s/--sentence | Input sentence | |
162
+ | -w/--wav | Output audio path (WAV format) | output.wav |
163
+ | -e/--encoder | Encoder model path | ../models/encoder.onnx |
164
+ | -d/--decoder | Decoder model path | ../models/decoder.axmodel |
165
+ | -sr/--sample_rate | Sample rate | 44100 |
166
+ | --speed | Speech speed; larger values are faster | 0.8 |
167
+ | -l/--language | One of "ZH", "ZH_MIX_EN", "JP", "EN", "KR", "SP", "FR": Chinese, mixed Chinese and English, Japanese, English, Korean, Spanish, and French, respectively | ZH_MIX_EN |
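
The flags and defaults listed above can be mirrored in a small `argparse` sketch. This is a hypothetical reconstruction from the help output and the parameter table, not the actual `melotts.py`. `build_parser` is a name invented here, and the defaults for `--sentence` and `--dec_len` are not documented, so they are left as `None`.

```python
import argparse

# Language choices as printed by `python3 melotts.py --help`.
LANGUAGES = ["ZH", "ZH_MIX_EN", "JP", "EN", "KR", "ES", "SP", "FR"]


def build_parser():
    """Build a parser that mirrors the documented command-line interface."""
    parser = argparse.ArgumentParser(
        prog="melotts", description="Run TTS on input sentence"
    )
    # Default not documented in the README; None is a placeholder.
    parser.add_argument("--sentence", "-s", type=str, default=None)
    parser.add_argument("--wav", "-w", type=str, default="output.wav")
    parser.add_argument("--encoder", "-e", type=str, default="../models/encoder.onnx")
    parser.add_argument("--decoder", "-d", type=str, default="../models/decoder.axmodel")
    # Default not documented in the README; None is a placeholder.
    parser.add_argument("--dec_len", type=int, default=None)
    parser.add_argument("--sample_rate", "-sr", type=int, default=44100)
    parser.add_argument("--speed", type=float, default=0.8)
    parser.add_argument("--language", "-l", choices=LANGUAGES, default="ZH_MIX_EN")
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["-s", "hello", "--speed", "0.9", "-l", "EN"])
print(args.sentence, args.speed, args.language, args.wav)
```

Passing an explicit list to `parse_args` keeps the example self-contained; a real script would call `parse_args()` with no arguments to read the command line.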