Fine-tuning NLLB-200 for a New Low-Resource Language in 2026
FormosanBank is a machine-readable corpus and tooling ecosystem for Taiwan’s Indigenous Formosan languages. This Hugging Face organization hosts datasets and related resources for research, education, language revitalization, and speech/language technology.
Licensing may vary by corpus. Please check each dataset card and the project documentation before reuse.
Formosan-English/Chinese NLLB-200 translation demo
Transcribe Amis audio and export ELAN annotations
Transcribe and edit Paiwan audio into searchable text files
Transcribe spoken audio into text for multiple Formosan languages