Audio-Text-to-Text
Transformers
English
Chinese
transformer
multimodal
vqa
text
audio
Eval Results (legacy)
Instructions to use zeroMN/SHMT with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use zeroMN/SHMT with Transformers:
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("zeroMN/SHMT", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
Update README.md
README.md
@@ -321,22 +321,11 @@ model-index:
 value: 85
 pipeline_tag: audio-text-to-text
 widget:
-
-
-
-
-
-    Barack Obama nominated Hilary Clinton as his secretary of state on Monday.
-    He chose her because she had ...
-  example_title: Coreference resolution
-- text: >-
-    On a shelf, there are five books: a gray book, a red book, a purple book, a
-    blue book, and a black book ...
-  example_title: Logic puzzles
-- text: >-
-    The two men running to become New York City's next mayor will face off in
-    their first debate Wednesday night ...
-  example_title: Reading comprehension
+- src: sample1.flac
+  output:
+    text: "Hello my name is Julien"
+- src: https://huggingface.co/username/model_repo/resolve/main/sample1.flac
+  example_title: Custom Speech Sample 1
 ---
 ### Model Sources
 You need to use separate code, audio, text, and natural language together with the model. Because the model will use separate word segmenters and vocabularies to achieve the best results when dealing with special cases.