Update README.md

README.md CHANGED:

````diff
@@ -54,7 +54,7 @@ The model is trained **only on a large phonetic transliteration dataset**, learn
 
 ---
 
-##
+## Datasets Used
 
 - **Phonetic dataset**
   - Large-scale Singlish ↔ Sinhala phonetic pairs
@@ -65,7 +65,7 @@ All datasets were cleaned, normalized, and deduplicated before training.
 
 ---
 
-##
+## Evaluation Metrics
 
 The model was evaluated using standard sequence-to-sequence metrics:
 
@@ -84,7 +84,7 @@ The two-phase model consistently outperformed the one-phase model, especially on
 ```python
 from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
 
-repo = "Pudamya/mt5-singlish2sinhala"
+repo = "Pudamya/mt5-singlish2sinhala"
 
 tokenizer = AutoTokenizer.from_pretrained(repo, use_fast=False)
 model = AutoModelForSeq2SeqLM.from_pretrained(repo)
````