Publicus/abby-voice / tasks /gen_batch_e2e_server.log
Returning existing local_dir `/home/barberb/IndexTTS-2-Demo/checkpoints` as remote repo cannot be accessed in `snapshot_download` (Cannot reach https://huggingface.co/api/models/IndexTeam/IndexTTS-2/revision/main: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.).
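The message above attributes the fallback to the local checkpoint directory to `HF_HUB_OFFLINE` being set. A minimal sketch of checking that variable from Python before starting the server follows. The helper name `hub_offline_enabled` is hypothetical, and the set of values treated as "enabled" is an assumption about how the variable is interpreted, not something this log confirms.

```python
import os

# Assumed spellings that count as "enabled"; this set is an assumption,
# not a documented contract taken from the log.
_TRUTHY = {"1", "ON", "YES", "TRUE"}


def hub_offline_enabled(env=None):
    """Return True if HF_HUB_OFFLINE looks enabled in the given mapping.

    `env` defaults to the real process environment; passing a plain dict
    makes the check easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    return env.get("HF_HUB_OFFLINE", "").strip().upper() in _TRUTHY


# Report the state of the current process environment.
print("offline mode enabled:", hub_offline_enabled())
```

If the check reports offline mode and a download is actually wanted, the log's own advice applies: unset `HF_HUB_OFFLINE` in the shell before launching the process, since changing it after the libraries are imported may have no effect.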
GPT2InferenceModel has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
- If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes
- If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
- If you are not the owner of the model architecture class, please contact the model code owner to update it.
/home/barberb/IndexTTS-2-Demo/indextts/utils/front.py:113: RuntimeWarning: Text normalization dependencies unavailable; using passthrough normalization. Original error: No module named 'tn'
warnings.warn(
Passing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.53.0. You should pass an instance of `Cache` instead, e.g. `past_key_values=DynamicCache.from_legacy_cache(past_key_values)`.
[HuggingFace] Downloading models to /home/barberb/IndexTTS-2-Demo/checkpoints,model cache dir=/home/barberb/IndexTTS-2-Demo/checkpoints/hf_cache
>> Be patient, it may take a while to run in CPU mode.
>> GPT weights restored from: /home/barberb/IndexTTS-2-Demo/checkpoints/gpt.pth
>> semantic_codec weights restored from: ./checkpoints/hf_cache/models--amphion--MaskGCT/snapshots/265c6cef07625665d0c28d2faafb1415562379dc/semantic_codec/model.safetensors
cfm loaded
length_regulator loaded
gpt_layer loaded
>> s2mel weights restored from: /home/barberb/IndexTTS-2-Demo/checkpoints/s2mel.pth
>> campplus_model weights restored from: ./checkpoints/hf_cache/models--funasr--campplus/snapshots/81a8afba4ca420cf6f845f157d5fc1d365286821/campplus_cn_common.bin
Loading config.json from local directory
Loading weights from local directory
Removing weight norm...
>> bigvgan weights restored from: ./checkpoints/hf_cache/models--nvidia--bigvgan_v2_22khz_80band_256x/snapshots/633ff708ed5b74903e86ff1298cf4a98e921c513
>> TextNormalizer loaded
>> bpe model loaded from: /home/barberb/IndexTTS-2-Demo/checkpoints/bpe.model
* Running on local URL: http://127.0.0.1:7861
* To create a public link, set `share=True` in `launch()`.
Emo control mode:0,weight:0.8,vec:None
>> starting inference...
Use the specified emotion vector
100%|██████████| 25/25 [00:16<00:00,  1.47it/s]
>> gpt_gen_time: 83.86 seconds
>> gpt_forward_time: 3.20 seconds
>> s2mel_time: 17.02 seconds
>> bigvgan_time: 2.91 seconds
>> Total inference time: 110.27 seconds
>> Generated audio length: 2.79 seconds
>> [batch] batch_num: 1 bucket_max_size: 1 bucket_count: 1
>> RTF: 39.5747
>> wav file saved to: outputs/spk_1779909641-item-1.wav
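The timing lines above can be cross-checked with a short script. This is a sketch, not part of the server: it assumes RTF means total inference time divided by generated audio length (which is consistent with the logged numbers), and the names `parse_timings`, `LINE`, and `LOG` are illustrative. Because the log rounds the audio length to two decimals, the recomputed ratio comes out near 39.52 rather than exactly the logged 39.5747.

```python
import re

# Timing lines copied from the first inference run in this log.
LOG = """\
>> gpt_gen_time: 83.86 seconds
>> gpt_forward_time: 3.20 seconds
>> s2mel_time: 17.02 seconds
>> bigvgan_time: 2.91 seconds
>> Total inference time: 110.27 seconds
>> Generated audio length: 2.79 seconds
>> RTF: 39.5747
"""

# Matches lines such as ">> s2mel_time: 17.02 seconds" or ">> RTF: 39.5747".
LINE = re.compile(r"^>> (?P<key>[^:]+): (?P<value>[0-9.]+)(?: seconds)?$")


def parse_timings(text):
    """Return a dict mapping each metric name to its float value."""
    out = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            out[m.group("key")] = float(m.group("value"))
    return out


timings = parse_timings(LOG)
total = timings["Total inference time"]
audio = timings["Generated audio length"]
stages = ["gpt_gen_time", "gpt_forward_time", "s2mel_time", "bigvgan_time"]
staged = sum(timings[name] for name in stages)

print(f"recomputed RTF (total / audio): {total / audio:.2f}")
print(f"logged RTF: {timings['RTF']}")
print(f"gpt_gen_time share of total: {timings['gpt_gen_time'] / total:.1%}")
print(f"time outside the four logged stages: {total - staged:.2f} s")
```

On these numbers the autoregressive GPT generation stage accounts for roughly three quarters of the wall time, and about 3.3 seconds of the total are not attributed to any of the four logged stages.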
>> starting inference...
Use the specified emotion vector
100%|██████████| 25/25 [00:22<00:00,  1.13it/s]
