Instructions to use zeroMN/SHMT with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
  - Transformers
    How to use zeroMN/SHMT with Transformers:

        # Load model directly
        from transformers import AutoModel
        model = AutoModel.from_pretrained("zeroMN/SHMT", dtype="auto")

- Notebooks
  - Google Colab
  - Kaggle
Update README.md
README.md
CHANGED
@@ -76,8 +76,7 @@ The model can be fine-tuned for specific tasks such as visual question answering
 
 ### Out-of-Scope Use
 
-The
-
+The Evolved Multimodal Model is not suitable for tasks that require high expertise or domain-specific expertise beyond its current capabilities. The number of speech frames still needs to be fine-tuned by yourself.
 ## Bias, Risks, and Limitations
 
 ### Recommendations