Commit 8f0a25f · Parent(s): 1f187f7
Update README.md
README.md CHANGED
@@ -58,4 +58,8 @@ prompts = [
 ]
 audios = tango.generate_for_batch(prompts, samples=2)
 ```
-This will generate two samples for each of the three text prompts.
+This will generate two samples for each of the three text prompts.
+
+# Limitations
+
+TANGO is not always able to finely control its generations over textual control prompts as it is trained only on the small AudioCaps dataset. For example, the generations from TANGO for prompts Chopping tomatoes on a wooden table and Chopping potatoes on a metal table are very similar. Chopping vegetables on a table also produces similar audio samples. Training text-to-audio generation models on larger datasets is thus required for the model to learn the composition of textual concepts and varied text-audio mappings. In the future, we plan to improve TANGO by training it on larger datasets and enhancing its compositional and controllable generation ability.