Update README.md
README.md
CHANGED
@@ -0,0 +1,65 @@
---
license: apache-2.0
---
Hok2Han Seq2Seq Transformer Model

This is a PyTorch-based Seq2Seq Transformer model that converts Taiwanese (Tâi-gí) romanization into Taiwanese Han characters.
The model weights and configuration are available on the Hugging Face Hub.

---

Repo contents
- best_model.pth: trained model weights
- config.json: model configuration file
- hok2han_model.py: model architecture code (must be kept locally to use the model)
- input__tokenizer/: input tokenizer folder
- output_tokenizer/: output tokenizer folder
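
To work with these files locally, one option is to download a snapshot of the repo and check that every listed entry is present. The sketch below is an illustration, not part of the repo: `missing_entries` and `download_repo` are hypothetical helper names, the entry names are copied verbatim from the list above, and `snapshot_download` is the `huggingface_hub` function for fetching a whole repo.

```python
from pathlib import Path

# Entry names copied from the list above; a trailing "/" marks a folder.
EXPECTED_ENTRIES = [
    "best_model.pth",
    "config.json",
    "hok2han_model.py",
    "input__tokenizer/",
    "output_tokenizer/",
]


def missing_entries(local_dir, expected=EXPECTED_ENTRIES):
    """Return the expected entries that are absent from local_dir."""
    root = Path(local_dir)
    missing = []
    for name in expected:
        path = root / name.rstrip("/")
        found = path.is_dir() if name.endswith("/") else path.is_file()
        if not found:
            missing.append(name)
    return missing


def download_repo(repo_id="KikKoh/Hok2Han"):
    """Download a snapshot of the repo and return its local directory."""
    # Imported lazily so missing_entries works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)
```

Calling `missing_entries(download_repo())` should return an empty list if the snapshot matches the list above.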

---

Requirements
pip install torch transformers huggingface_hub
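
After installing, a quick way to confirm the three packages are importable is to look them up without actually importing them. This is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper name, not something shipped with the repo.

```python
import importlib.util

# Package names taken from the pip command above.
REQUIRED = ["torch", "transformers", "huggingface_hub"]


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty list from `missing_packages()` suggests the environment is ready for the steps that follow.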

---

Usage

1. Place hok2han_model.py in your working directory.

2. Load the model:
from hok2han_model import Seq2SeqTransformer
model = Seq2SeqTransformer.from_pretrained("KikKoh/Hok2Han")
model.eval()  # switch to inference mode

3. Load the tokenizers:
from transformers import Wav2Vec2Processor, BertTokenizer
input_processor = Wav2Vec2Processor.from_pretrained("path or repo of your input tokenizer")
output_tokenizer = BertTokenizer.from_pretrained("path or repo of your output tokenizer")
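
4. Inference example (simplified):
import torch

# input_ids and tgt_ids are token-ID tensors prepared beforehand with the tokenizers from step 3
with torch.no_grad():  # gradients are not needed for inference
    output = model(src=input_ids, tgt=tgt_ids,
                   src_pad_idx=input_processor.tokenizer.pad_token_id,  # tokenizer wrapped by the processor from step 3
                   tgt_pad_idx=output_tokenizer.pad_token_id)

pred_ids = output.argmax(dim=-1)  # highest-scoring token at each target position
pred_text = output_tokenizer.decode(pred_ids[0], skip_special_tokens=True)
print(pred_text)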
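
The example in step 4 passes `tgt_ids` to the model, which presumes the target IDs are already known. When only the romanized input is available, a common approach is greedy decoding: start from a begin-of-sequence token and append the top-scoring token one step at a time. The sketch below shows that loop in plain Python. `greedy_decode`, `next_token_scores`, and `toy_scores` are hypothetical names, and the README does not state which special tokens the model expects, so `bos_id` and `eos_id` are assumptions to be checked against the output tokenizer.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    next_token_scores: Callable[[Sequence[int], Sequence[int]], Sequence[float]],
    src_ids: Sequence[int],
    bos_id: int,
    eos_id: int,
    max_len: int = 64,
) -> List[int]:
    """Build the target sequence token by token, always taking the top-scoring token."""
    tgt_ids = [bos_id]
    for _ in range(max_len):
        # Scores over the vocabulary for the next target position.
        scores = next_token_scores(src_ids, tgt_ids)
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tgt_ids.append(next_id)
        if next_id == eos_id:
            break
    return tgt_ids


def toy_scores(src_ids, tgt_ids):
    """Stand-in scorer (vocabulary of 5 IDs) that copies src_ids, then emits ID 1 as end-of-sequence."""
    step = len(tgt_ids) - 1
    target = src_ids[step] if step < len(src_ids) else 1
    return [1.0 if token_id == target else 0.0 for token_id in range(5)]


print(greedy_decode(toy_scores, [3, 4, 2], bos_id=0, eos_id=1))  # [0, 3, 4, 2, 1]
```

To use this with the model above, `next_token_scores` would wrap the `model(src=..., tgt=..., src_pad_idx=..., tgt_pad_idx=...)` call and return the scores at the last target position; the exact output shape is not documented here, so that wrapper is left to the reader.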

---

Notes
- This model uses a custom architecture; load it with the from_pretrained method defined in hok2han_model.py.
- Choose the inference device (CPU or GPU) to match your hardware.
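
For the device note, one possible pattern is to prefer a CUDA GPU, fall back to Apple's MPS backend, and otherwise use the CPU. The sketch below separates the decision from the PyTorch queries so the logic is easy to follow. `choose_device` and `detect_device` are hypothetical helper names, and it assumes `Seq2SeqTransformer` is a regular `torch.nn.Module`, which the README does not state explicitly.

```python
def choose_device(cuda_available: bool, mps_available: bool = False) -> str:
    """Pick a PyTorch device string, preferring CUDA, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch which backends are usable and pick a device string."""
    import torch  # imported lazily so choose_device can be used on its own

    # Older PyTorch builds have no torch.backends.mps, hence the getattr guard.
    mps = getattr(torch.backends, "mps", None)
    mps_available = bool(mps is not None and mps.is_available())
    return choose_device(torch.cuda.is_available(), mps_available)
```

With this in place, `device = detect_device()` followed by `model.to(device)` moves the model, and the input tensors need the same `.to(device)` call before inference.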

---

Contact
<strong>KikKoh</strong><br>
<a href="https://www.facebook.com/kikkoh.2024" target="_blank" style="text-decoration:none; color:#1877F2;">
<img src="https://upload.wikimedia.org/wikipedia/commons/5/51/Facebook_f_logo_%282019%29.svg" alt="Facebook" style="width:16px; vertical-align:middle; margin-right:6px;">Facebook: KikKoh
</a>

---

License
apache-2.0