---
title: README
emoji: ⚡
colorFrom: green
colorTo: green
sdk: static
pinned: false
---

## Dataset Source:

- Original Source: The English sentences were sourced from public domain sources such as Project Gutenberg (https://www.gutenberg.org/).
- Translation Tool: Google Translate was used to translate the sentences from English to Yoruba.

## Dataset Format:

- english: The original English sentence.
- yoruba: The Yoruba translation of the sentence.
- source: The source of the English sentence, given as a URL.

**Example**:

| en | yo | source |
|----|----|--------|
| The subconscious offensiveness of their attitude has constituted old Jolyon's 'home' the psychological moment of the family history, made it the prelude of their drama. | Iwa ibinu èrońgbà ti iṣesi wọn ti jẹ “ile” atijọ ti Jolyon ni akoko imọ-jinlẹ ti itan-akọọlẹ ẹbi, jẹ ki o jẹ iṣaaju ti eré wọn. | https://www.gutenberg.org/ebooks/2559.txt.utf-8 |
| The Forsytes were resentful of something, not individually, but as a family; this resentment expressed itself in an added perfection of raiment, an exuberance of family cordiality, an exaggeration of family importance, and--the sniff. | Awọn Forsytes binu si nkan kan, kii ṣe olukuluku, ṣugbọn gẹgẹbi idile; ibinu yii ṣe afihan ararẹ ni pipe ti aṣọ ti a fi kun, igbadun ti ifarabalẹ idile, iṣaju ti pataki idile, ati --ifun. | https://www.gutenberg.org/ebooks/2559.txt.utf-8 |
| Danger--so indispensable in bringing out the fundamental quality of any society, group, or individual--was what the Forsytes scented; the premonition of danger put a burnish on their armour. | Ewu - nitorinaa ko ṣe pataki lati mu didara ipilẹ ti awujọ, ẹgbẹ, tabi ẹni kọọkan jade - jẹ ohun ti awọn Forsytes rùn; premonition ti ewu fi kan iná lori wọn ihamọra. | https://www.gutenberg.org/ebooks/2559.txt.utf-8 |

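The three fields can be read into typed records with the standard library alone. The sketch below is only illustrative: it assumes the data is stored as a CSV file whose header row uses the names `english`, `yoruba`, and `source`, which this README does not confirm, and the sample rows are placeholders rather than real dataset entries.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class SentencePair:
    """One dataset entry: an English sentence, its Yoruba translation, and its source."""
    english: str
    yoruba: str
    source: str


def read_pairs(handle):
    """Yield SentencePair records from a CSV file object that has a header row.

    Assumption: the header names are english, yoruba, source.
    """
    for row in csv.DictReader(handle):
        yield SentencePair(
            english=row["english"].strip(),
            yoruba=row["yoruba"].strip(),
            source=row["source"].strip(),
        )


# Placeholder rows standing in for the real file (hypothetical content).
SAMPLE_CSV = (
    "english,yoruba,source\n"
    "english sentence 1,yoruba sentence 1,https://www.gutenberg.org/\n"
    '"english sentence 2, with a comma",yoruba sentence 2,https://www.gutenberg.org/\n'
)

pairs = list(read_pairs(io.StringIO(SAMPLE_CSV)))
for pair in pairs:
    print(pair.english, "->", pair.yoruba)
```

When reading a real file, open it with `encoding="utf-8"` and `newline=""` so that Yoruba diacritics and quoted fields are handled correctly.
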
## Dataset Size:

- Number of Entries: 520,000

## Usage:

This dataset can be used for:

- Training machine translation models for Yoruba.
- Analyzing translation quality and limitations in automated tools.
- Supporting linguistic research and NLP projects for low-resource languages.

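For the machine translation use case, a common first step is to hold out part of the data for validation. The following is a minimal sketch of one way to do that, not a prescribed pipeline: the function names, the 10% validation fraction, and the nested `{"translation": {"en": ..., "yo": ...}}` layout are all assumptions chosen for illustration.

```python
import random


def split_pairs(pairs, validation_fraction=0.1, seed=0):
    """Shuffle (english, yoruba) pairs deterministically and split them.

    Returns a (train, validation) tuple of lists.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(pairs)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


def to_translation_example(english, yoruba):
    """Wrap one pair in a nested layout (an assumed convention, not a requirement)."""
    return {"translation": {"en": english, "yo": yoruba}}


# Placeholder pairs standing in for real dataset entries.
demo_pairs = [(f"english sentence {i}", f"yoruba sentence {i}") for i in range(20)]
train, validation = split_pairs(demo_pairs, validation_fraction=0.1, seed=0)
train_examples = [to_translation_example(en, yo) for en, yo in train]
print(len(train), len(validation))
```

Because all translations come from a single automated tool, a held-out split of this kind measures agreement with that tool and not with human-quality Yoruba.
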
## Limitations and Considerations:

- **Quality of Translations**: Because the translations were generated with Google Translate, some of them may be inaccurate. Manual validation is recommended for critical applications.
- **Cultural and Contextual Nuances**: Machine translations might miss idiomatic expressions or cultural nuances present in the source language.
- **Biases**: Any biases inherent in Google Translate's model may propagate into this dataset.

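Manual validation is easier to target if obviously suspect pairs are flagged first. The sketch below shows a few cheap heuristics; the function name, the length-ratio threshold, and the list of encoding-damage markers are illustrative assumptions, and passing these checks says nothing about whether a translation is actually correct.

```python
# Character sequences that typically appear when UTF-8 text has been decoded
# with the wrong encoding (illustrative, not exhaustive).
MOJIBAKE_MARKERS = ("Ã", "á»", "á¹", "â€")


def suspicion_reasons(english, yoruba, max_ratio=3.0):
    """Return a list of reasons why a pair deserves manual review (empty if none)."""
    en, yo = english.strip(), yoruba.strip()
    if not yo:
        return ["empty translation"]
    reasons = []
    if en.lower() == yo.lower():
        # The tool may have copied the source text instead of translating it.
        reasons.append("identical to source")
    shorter = max(1, min(len(en), len(yo)))
    if max(len(en), len(yo)) / shorter > max_ratio:
        reasons.append("length ratio out of range")
    if any(marker in yo for marker in MOJIBAKE_MARKERS):
        reasons.append("possible encoding damage")
    return reasons


print(suspicion_reasons("their drama", "eré wọn"))
print(suspicion_reasons("Danger", "danger"))
```

Pairs that return a non-empty list can be routed to a human reviewer or dropped, depending on how strict the application needs to be.
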
## Licensing:

Source Material License: Public Domain

## Tags:

- machine-translation
- speech-to-text
- yoruba-language
- african-languages

## Task_categories:

- text-classification
- machine-translation

---