KrorngAI/TrorYongASR-tiny

@@ -108,6 +108,7 @@ The evaluation assesses two capabilities — language detection and transcriptio
 **Task:** Given audio input, detect the language.
 <div align="center">
 | Metric | Description |
 |--------|-------------|
 | **Precision** | Proportion of predicted languages that are correct |
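
As an illustration of how the language-detection metrics could be computed, here is a minimal sketch for a single target language. The function name `detection_metrics`, the `km`/`en` labels, and the sample predictions are hypothetical, and the README does not say how scores are aggregated across languages, so this is one plausible reading rather than the evaluation script itself.

```python
from typing import Dict, Sequence


def detection_metrics(y_true: Sequence[str], y_pred: Sequence[str], target: str) -> Dict[str, float]:
    """Precision, recall, F1 for one target language, plus overall accuracy.

    Hypothetical helper: the aggregation used in the actual evaluation is not stated.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)  # correctly detected
    fp = sum(1 for t, p in pairs if t != target and p == target)  # wrongly predicted as target
    fn = sum(1 for t, p in pairs if t == target and p != target)  # target that was missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Made-up labels for illustration only (not results from the model).
true_langs = ["km", "km", "en", "en", "km"]
pred_langs = ["km", "en", "en", "en", "km"]
metrics = detection_metrics(true_langs, pred_langs, target="km")
```

In this toy input every clip predicted as `km` really is Khmer, so precision is perfect, while one Khmer clip is missed, which lowers recall.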
@@ -121,6 +122,7 @@ The evaluation assesses two capabilities — language detection and transcriptio
 **Task:** Convert audio to text (transcription).
 <div align="center">
 | Metric | Description |
 |--------|-------------|
 | **Token Error Rate** | Proportion of incorrectly transcribed tokens |
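
The README defines Token Error Rate only as the proportion of incorrectly transcribed tokens. A common way to make that concrete, by analogy with word error rate, is to divide the token-level edit distance by the number of reference tokens. The sketch below assumes that definition and uses whitespace splitting as a stand-in tokenizer; both are assumptions, and Khmer text in particular would need the model's own tokenizer because it is not whitespace-delimited.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from reference prefix to empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def token_error_rate(ref_tokens: Sequence[str], hyp_tokens: Sequence[str]) -> float:
    """Edit distance normalized by reference length (assumed TER definition)."""
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


# Made-up example: one substitution (sat -> sit) and one deletion (the).
reference = "the cat sat on the mat".split()
hypothesis = "the cat sit on mat".split()
ter = token_error_rate(reference, hypothesis)
```

Under this definition the rate can exceed 100% when the hypothesis contains many inserted tokens, which is worth keeping in mind when comparing scores across languages.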
@@ -138,6 +140,7 @@ The evaluation assesses two capabilities — language detection and transcriptio
 #### Language Detection Results
 <div align="center">
 | Dataset | Precision | Recall | Accuracy | F1-score |
 |---------|-----------|--------|----------|----------|
 | google/fleurs (Khmer) | 100% | 100% | 100% | 100% |
@@ -152,6 +155,7 @@ The evaluation assesses two capabilities — language detection and transcriptio
 #### Transcription Results
 <div align="center">
 | Metric | Combined (Khmer + English) | Khmer | English |
 |--------|---------------------------|-------|---------|
 | Token Error Rate | 29% | 56% | 19% |
@@ -254,6 +258,7 @@ Khmer datasets include [`DDD-Cambodia/khm-asr-cultural`](https://huggingface.co/
 Split `clean.100` of [`openslr/librispeech_asr`](https://huggingface.co/datasets/openslr/librispeech_asr) was used as the English dataset.
 <div align="center">
 | Dataset                            | Language   | Training examples | Validation examples | Description                                       |
 | ---------                          | ---------- | ----------------- | ------------------- | ------------------------------------------------- |
 | **openslr/openslr**                | Khmer      | 2906              | 0                   | Multi-speaker TTS data for Khmer language (split `SLR42`) |
