## Rigel Pretrained Model
Base and fine-tuned models.
### Dataset
* **Size:** 1,921 hours of speech and vocals in total.
* **Languages:**
    * Arabic: ~70 hours
    * Chinese (Mandarin): ~70 hours
    * English: ~800 hours
    * French: ~42 hours
    * German: ~35 hours
    * Hindi: ~30 hours
    * Indonesian: ~53 hours
    * Japanese: ~140 hours
    * Korean: ~80 hours
    * Portuguese: ~40 hours
    * Russian: ~188 hours
    * Singing (all languages): ~190 hours
    * Spanish: ~200 hours
    * Tagalog: ~30 hours
    * Common language: Unknown amount
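
The per-category figures above are approximate, so a quick tally can serve as a sanity check against the stated total. The sketch below simply sums the listed values; the "Common language" entry is left out because its amount is unknown, and the variable names are illustrative rather than part of any Rigel tooling.

```python
# Approximate hours per category, copied from the list above.
# "Common language" is omitted because its amount is unknown.
hours = {
    "Arabic": 70,
    "Chinese (Mandarin)": 70,
    "English": 800,
    "French": 42,
    "German": 35,
    "Hindi": 30,
    "Indonesian": 53,
    "Japanese": 140,
    "Korean": 80,
    "Portuguese": 40,
    "Russian": 188,
    "Singing (all languages)": 190,
    "Spanish": 200,
    "Tagalog": 30,
}

stated_total = 1921  # total given under "Size"
listed_total = sum(hours.values())

print(f"Sum of listed categories: {listed_total} hours")
print(f"Stated total:             {stated_total} hours")
print(f"Difference:               {listed_total - stated_total} hours")

# Share of each category relative to the listed sum, largest first.
for name, h in sorted(hours.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<26} {h:>4} h  {100 * h / listed_total:5.1f}%")
```

The listed categories add up to roughly 1,968 hours, slightly more than the stated 1,921. Since every per-category value is marked approximate, a small gap of this kind is not surprising.
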
### Sampling Frequency
* **32 kHz** (Done)
* **40 kHz** (Retraining)
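
To make the difference between the two sampling frequencies concrete, the sketch below computes the Nyquist limit (half the sampling rate, the highest frequency a sampled signal can represent) and the number of raw samples in one hour of mono audio. The helper names are hypothetical and only serve this illustration.

```python
def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest representable frequency for a given sampling rate."""
    return sample_rate_hz / 2


def samples_per_hour(sample_rate_hz: int) -> int:
    """Number of samples in one hour of single-channel audio."""
    return sample_rate_hz * 3600


for rate in (32_000, 40_000):
    print(
        f"{rate / 1000:.0f} kHz: Nyquist limit {nyquist_hz(rate) / 1000:.0f} kHz, "
        f"{samples_per_hour(rate):,} samples per hour"
    )
```
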
### Models
#### **Base Model**
* **Data:** 1,921 hours of low- to mid-quality data.
* **Steps:** 3,890,220
* **Batch size:** 40
* **Precision:** FP32
* **Sampling rate:** 32 kHz
#### **Fine-Tuned Model**
* **Data:** 102 hours of high-quality data.
* **Steps:** 2,854,856
* **Batch size:** 20
* **Precision:** FP32
* **Sampling rate:** 32 kHz
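
The step counts and batch sizes above give a rough idea of how many training samples each model processed. The sketch below multiplies the two, under the assumption that one step consumes exactly one batch (no gradient accumulation and no per-GPU multiplier); the document does not confirm this, so treat the results as an estimate. The `TrainingRun` class is a hypothetical container for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRun:
    name: str
    steps: int
    batch_size: int

    @property
    def samples_seen(self) -> int:
        # Assumption: one step processes exactly one batch of `batch_size` samples.
        return self.steps * self.batch_size


base = TrainingRun("Base Model", steps=3_890_220, batch_size=40)
fine_tuned = TrainingRun("Fine-Tuned Model", steps=2_854_856, batch_size=20)

for run in (base, fine_tuned):
    print(f"{run.name}: {run.samples_seen:,} samples processed")
```

Under that assumption, the base run processed about 155.6 million samples and the fine-tuning run about 57.1 million.
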
### Hardware Used
* **CPU:** AMD EPYC 9754
* **RAM:** 256GB
* **GPUs:**
    * 1 x H100
    * 4 x L40s
    * 1 x RTX 4080
    * 1 x RTX 4070 Ti
### Expected Release Date
* July 22nd