| Returning existing local_dir `/home/barberb/IndexTTS-2-Demo/checkpoints` as remote repo cannot be accessed in `snapshot_download` (Cannot reach https://huggingface.co/api/models/IndexTeam/IndexTTS-2/revision/main: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.). | |
| GPT2InferenceModel has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions. | |
| - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes | |
| - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception). | |
| - If you are not the owner of the model architecture class, please contact the model code owner to update it. | |
| /home/barberb/IndexTTS-2-Demo/indextts/utils/front.py:113: RuntimeWarning: Text normalization dependencies unavailable; using passthrough normalization. Original error: No module named 'tn' | |
| warnings.warn( | |
| Passing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.53.0. You should pass an instance of `Cache` instead, e.g. `past_key_values=DynamicCache.from_legacy_cache(past_key_values)`. | |
| [HuggingFace] Downloading models to /home/barberb/IndexTTS-2-Demo/checkpoints,model cache dir=/home/barberb/IndexTTS-2-Demo/checkpoints/hf_cache | |
| >> Be patient, it may take a while to run in CPU mode. | |
| >> GPT weights restored from: /home/barberb/IndexTTS-2-Demo/checkpoints/gpt.pth | |
| >> semantic_codec weights restored from: ./checkpoints/hf_cache/models--amphion--MaskGCT/snapshots/265c6cef07625665d0c28d2faafb1415562379dc/semantic_codec/model.safetensors | |
| cfm loaded | |
| length_regulator loaded | |
| gpt_layer loaded | |
| >> s2mel weights restored from: /home/barberb/IndexTTS-2-Demo/checkpoints/s2mel.pth | |
| >> campplus_model weights restored from: ./checkpoints/hf_cache/models--funasr--campplus/snapshots/81a8afba4ca420cf6f845f157d5fc1d365286821/campplus_cn_common.bin | |
| Loading config.json from local directory | |
| Loading weights from local directory | |
| Removing weight norm... | |
| >> bigvgan weights restored from: ./checkpoints/hf_cache/models--nvidia--bigvgan_v2_22khz_80band_256x/snapshots/633ff708ed5b74903e86ff1298cf4a98e921c513 | |
| >> TextNormalizer loaded | |
| >> bpe model loaded from: /home/barberb/IndexTTS-2-Demo/checkpoints/bpe.model | |
| * Running on local URL: http://127.0.0.1:7861 | |
| * To create a public link, set `share=True` in `launch()`. | |
| Emo control mode:0,weight:0.8,vec:None | |
| >> starting inference... | |
| Use the specified emotion vector | |
| 100%|██████████| 25/25 [00:16<00:00, 1.47it/s] | |
| >> gpt_gen_time: 83.86 seconds | |
| >> gpt_forward_time: 3.20 seconds | |
| >> s2mel_time: 17.02 seconds | |
| >> bigvgan_time: 2.91 seconds | |
| >> Total inference time: 110.27 seconds | |
| >> Generated audio length: 2.79 seconds | |
| >> [batch] batch_num: 1 bucket_max_size: 1 bucket_count: 1 | |
| >> RTF: 39.5747 | |
| >> wav file saved to: outputs/spk_1779909641-item-1.wav | |
| >> starting inference... | |
| Use the specified emotion vector | |
| 100%|██████████| 25/25 [00:22<00:00, 1.13it/s] | |
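The timing lines of the first inference run can be cross-checked with a few lines of Python. This is a minimal sketch, not code from the IndexTTS-2 repository: the helper names `real_time_factor` and `stage_shares` are hypothetical, and it assumes the logged RTF is total inference time divided by generated audio length. Because the log prints the audio length rounded to two decimals, the recomputed value only approximates the logged `RTF: 39.5747`.

```python
# Timings copied from the first inference run in the log (seconds).
stage_seconds = {
    "gpt_gen": 83.86,
    "gpt_forward": 3.20,
    "s2mel": 17.02,
    "bigvgan": 2.91,
}
total_seconds = 110.27   # ">> Total inference time"
audio_seconds = 2.79     # ">> Generated audio length" (rounded in the log)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Hypothetical helper: seconds of compute per second of audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def stage_shares(stages: dict, total: float) -> dict:
    """Hypothetical helper: fraction of the total time spent in each stage."""
    return {name: seconds / total for name, seconds in stages.items()}


rtf = real_time_factor(total_seconds, audio_seconds)
shares = stage_shares(stage_seconds, total_seconds)

print(f"RTF (recomputed): {rtf:.2f}")
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:12s} {share:6.1%}")
```

Under these assumptions, autoregressive GPT generation accounts for roughly three quarters of the total time in CPU mode, so it is the first stage to look at when trying to reduce the RTF.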