Pretraining Thai Language

The essential pretraining backbone for Thai language understanding. Covers general knowledge (Wiki), formal/legal registers, and idiomatic expressions.

pythainlp/thai-idioms-instruction • Viewer • Updated Oct 9, 2025 • 1.15k • 40 • 1
pythainlp/thai-wiki-dataset-v3 • Viewer • Updated Jan 20, 2024 • 197k • 313 • 11
airesearch/WangchanX-Legal-ThaiCCL-Retriever • Sentence Similarity • 0.6B • Updated Oct 18, 2024 • 157 • 3
pythainlp/thai-open-data-go-th • Viewer • Updated Mar 13, 2024 • 2.35k • 43

Datasets for Pretrained Thai LLM ALL

List of datasets for pretrained Thai LLMs, by PyThaiNLP.

pythainlp/thai_food_v1.0 • Viewer • Updated Feb 18, 2024 • 159 • 47 • 9
pythainlp/thailaw-v1.0 • Viewer • Updated Feb 25, 2024 • 52.6k • 51 • 11
pythainlp/thai-wiki-dataset-v3 • Viewer • Updated Jan 20, 2024 • 197k • 313 • 11
pythainlp/thaisum • Viewer • Updated Oct 8, 2023 • 392k • 372 • 4