Update README.md
README.md
CHANGED
@@ -10,6 +10,20 @@ language: multilingual
 [Huggingface](https://huggingface.co/DataoceanAI)
 [Modelscope](https://www.modelscope.cn/organization/DataoceanAI)
 
+# Repository Notice
+
+This model is officially maintained by **Dataocean AI**.
+
+To ensure compatibility with existing user code and download links, we keep two official repositories for the same model:
+
+- Original / legacy repository: DataoceanAI
+- Organization / enterprise repository: DataoceanAI1
+
+Both repositories are maintained by the same team and contain the same model files.
+DataoceanAI1 is the newly created enterprise organization account, while DataoceanAI is kept to avoid breaking existing user download scripts and links.
+
+Please do not regard either repository as an unofficial copy or unauthorized redistribution.
+
 Dolphin is a multilingual, multitask ASR model developed through a collaboration between Dataocean AI and Tsinghua University. It supports 40 Eastern languages across East Asia, South Asia, Southeast Asia, and the Middle East, while also supporting 22 Chinese dialects. It is trained on over 210,000 hours of data, which includes both DataoceanAI's proprietary datasets and open-source datasets. The model can perform speech recognition, voice activity detection (VAD), segmentation, and language identification (LID).
 
 ## Approach