rahul7star committed on
Commit
8223e9d
·
verified ·
1 Parent(s): 2ee3ffb

Chatterbox fine-tuned model + logs

runs/Feb07_05-31-06_r-rahul7star-chatterbox-train-gfkn44wm-f343a-8jf6z/events.out.tfevents.1770438666.r-rahul7star-chatterbox-train-gfkn44wm-f343a-8jf6z.44.0 ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5faffe1cb2925e0fe23f69c644aa52f0c754d272a5c03b06545c8651e3e66090
3
+ size 4094
training.log CHANGED
@@ -1,7 +1,8 @@
 
1
 
2
  /usr/local/lib/python3.13/site-packages/perth/perth_net/__init__.py:1: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
3
  from pkg_resources import resource_filename
4
- 02/07/2026 05:21:50 - INFO - __main__ - Training/evaluation parameters CustomTrainingArguments(
5
  accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False},
6
  adam_beta1=0.9,
7
  adam_beta2=0.999,
@@ -113,123 +114,721 @@ warmup_ratio=None,
113
  warmup_steps=1.0,
114
  weight_decay=0.0,
115
  )
116
- 02/07/2026 05:21:50 - INFO - __main__ - Model parameters ModelArguments(model_name_or_path='ResembleAI/chatterbox', local_model_dir=None, cache_dir=None, freeze_voice_encoder=True, freeze_s3gen=True)
117
- 02/07/2026 05:21:50 - INFO - __main__ - Data parameters DataArguments(language='hi', dataset_dir=None, metadata_file=None, dataset_name='rahul7star/hindi-speech-dataset', dataset_config_name=None, train_split_name='train', eval_split_name='validation', text_column_name='text_scribe', audio_column_name='audio', max_text_len=256, max_speech_len=800, audio_prompt_duration_s=3.0, eval_split_size=0.0002, preprocessing_num_workers=None, ignore_verifications=False)
118
- 02/07/2026 05:21:50 - INFO - __main__ - Loading ChatterboxTTS model...
119
- 02/07/2026 05:21:50 - INFO - __main__ - Loading model from Hugging Face Hub: ResembleAI/chatterbox
120
  /usr/local/lib/python3.13/site-packages/huggingface_hub/utils/_validators.py:202: UserWarning: The `local_dir_use_symlinks` argument is deprecated and ignored in `hf_hub_download`. Downloading to a local directory does not use symlinks anymore.
121
  warnings.warn(
122
- 02/07/2026 05:21:50 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/ve.safetensors "HTTP/1.1 302 Found"
123
- 02/07/2026 05:21:50 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/models/ResembleAI/chatterbox/xet-read-token/05e904af2b5c7f8e482687a9d7336c5c824467d9 "HTTP/1.1 200 OK"
124
 
125
 
126
  ve.safetensors: 0%| | 0.00/5.70M [00:00<?, ?B/s]
127
- ve.safetensors: 100%|██████████| 5.70M/5.70M [00:00<00:00, 21.5MB/s]
128
- 02/07/2026 05:21:51 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/t3_mtl23ls_v2.safetensors "HTTP/1.1 302 Found"
129
 
130
 
131
  t3_mtl23ls_v2.safetensors: 0%| | 0.00/2.14G [00:00<?, ?B/s]
132
 
133
- t3_mtl23ls_v2.safetensors: 0%| | 7.60M/2.14G [00:01<05:35, 6.37MB/s]
134
 
135
- t3_mtl23ls_v2.safetensors: 4%|▎ | 78.6M/2.14G [00:04<01:49, 18.9MB/s]
136
 
137
- t3_mtl23ls_v2.safetensors: 27%|██▋ | 576M/2.14G [00:06<00:15, 102MB/s]
138
 
139
- t3_mtl23ls_v2.safetensors: 34%|███▎ | 719M/2.14G [00:08<00:14, 97.7MB/s]
140
-
141
- t3_mtl23ls_v2.safetensors: 67%|██████▋ | 1.43G/2.14G [00:09<00:03, 224MB/s]
142
- t3_mtl23ls_v2.safetensors: 100%|██████████| 2.14G/2.14G [00:10<00:00, 207MB/s]
143
- 02/07/2026 05:22:01 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/s3gen.safetensors "HTTP/1.1 302 Found"
144
 
145
 
146
  s3gen.safetensors: 0%| | 0.00/1.06G [00:00<?, ?B/s]
147
 
148
- s3gen.safetensors: 5%|▍ | 50.7M/1.06G [00:01<00:34, 29.2MB/s]
149
 
150
- s3gen.safetensors: 30%|███ | 319M/1.06G [00:02<00:05, 137MB/s]
151
- s3gen.safetensors: 100%|██████████| 1.06G/1.06G [00:03<00:00, 335MB/s]
152
- 02/07/2026 05:22:04 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/mtl_tokenizer.json "HTTP/1.1 307 Temporary Redirect"
153
- 02/07/2026 05:22:04 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/mtl_tokenizer.json "HTTP/1.1 200 OK"
154
- 02/07/2026 05:22:04 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/mtl_tokenizer.json "HTTP/1.1 200 OK"
155
 
156
 
157
  mtl_tokenizer.json: 0%| | 0.00/68.1k [00:00<?, ?B/s]
158
- mtl_tokenizer.json: 100%|██████████| 68.1k/68.1k [00:00<00:00, 86.5MB/s]
159
- 02/07/2026 05:22:04 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/conds.pt "HTTP/1.1 302 Found"
160
 
161
 
162
  conds.pt: 0%| | 0.00/107k [00:00<?, ?B/s]
163
- conds.pt: 100%|██████████| 107k/107k [00:00<00:00, 1.14MB/s]
164
- 02/07/2026 05:22:04 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/models/ResembleAI/chatterbox/revision/main "HTTP/1.1 200 OK"
165
 
166
 
167
  Downloading (incomplete total...): 0.00B [00:00, ?B/s]
168
 
169
- Fetching 5 files: 0%| | 0/5 [00:00<?, ?it/s]02/07/2026 05:22:04 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/conds.pt "HTTP/1.1 302 Found"
170
-
171
 
172
- Downloading (incomplete total...): 0%| | 0.00/107k [00:00<?, ?B/s]02/07/2026 05:22:04 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/grapheme_mtl_merged_expanded_v1.json "HTTP/1.1 307 Temporary Redirect"
173
- 02/07/2026 05:22:04 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/s3gen.pt "HTTP/1.1 302 Found"
174
- 02/07/2026 05:22:04 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/t3_mtl23ls_v2.safetensors "HTTP/1.1 302 Found"
175
 
 
 
176
 
177
- Downloading (incomplete total...): 0%| | 0.00/2.14G [00:00<?, ?B/s]02/07/2026 05:22:04 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/ve.pt "HTTP/1.1 302 Found"
178
 
 
179
 
180
- Downloading (incomplete total...): 0%| | 0.00/3.20G [00:00<?, ?B/s]
181
 
182
- Downloading (incomplete total...): 0%| | 0.00/3.21G [00:00<?, ?B/s]02/07/2026 05:22:04 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/grapheme_mtl_merged_expanded_v1.json "HTTP/1.1 200 OK"
183
- 02/07/2026 05:22:04 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/grapheme_mtl_merged_expanded_v1.json "HTTP/1.1 200 OK"
 
184
 
185
 
186
  Downloading (incomplete total...): 0%| | 0.00/3.21G [00:00<?, ?B/s]
187
 
188
- Downloading (incomplete total...): 0%| | 13.5M/3.21G [00:02<10:16, 5.18MB/s]
 
 
189
 
190
- Downloading (incomplete total...): 4%|▍ | 136M/3.21G [00:04<01:31, 33.7MB/s] 
191
 
192
- Downloading (incomplete total...): 13%|█▎ | 412M/3.21G [00:05<00:28, 97.6MB/s]
193
 
194
- Downloading (incomplete total...): 28%|██▊ | 894M/3.21G [00:08<00:17, 135MB/s]
195
 
196
- Downloading (incomplete total...): 34%|███▍ | 1.10G/3.21G [00:11<00:20, 103MB/s]
197
 
198
- Downloading (incomplete total...): 67%|██████▋ | 2.14G/3.21G [00:12<00:04, 259MB/s]
199
 
200
- Fetching 5 files: 60%|██████ | 3/5 [00:13<00:08, 4.46s/it]
201
- Fetching 5 files: 100%|██████████| 5/5 [00:13<00:00, 2.68s/it]
202
 
203
 
204
- Download complete: 100%|██████████| 3.21G/3.21G [00:13<00:00, 259MB/s]
205
- Download complete: 100%|██████████| 3.21G/3.21G [00:22<00:00, 140MB/s]
206
  /usr/local/lib/python3.13/site-packages/diffusers/models/lora.py:393: FutureWarning: `LoRACompatibleLinear` is deprecated and will be removed in version 1.0.0. Use of `LoRACompatibleLinear` is deprecated. Please switch to PEFT backend by installing PEFT: `pip install peft`.
207
  deprecate("LoRACompatibleLinear", "1.0.0", deprecation_message)
208
- 02/07/2026 05:22:29 - INFO - root - input frame rate=25
209
- 02/07/2026 05:22:34 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/Cangjie5_TC.json "HTTP/1.1 307 Temporary Redirect"
210
- 02/07/2026 05:22:34 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/Cangjie5_TC.json "HTTP/1.1 200 OK"
211
- 02/07/2026 05:22:34 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/Cangjie5_TC.json "HTTP/1.1 200 OK"
212
 
213
 
214
  Cangjie5_TC.json: 0%| | 0.00/1.92M [00:00<?, ?B/s]
215
- Cangjie5_TC.json: 100%|██████████| 1.92M/1.92M [00:00<00:00, 23.0MB/s]
216
  Downloading: "https://github.com/explosion/spacy-pkuseg/releases/download/v0.0.26/spacy_ontonotes.zip" to /root/.pkuseg/spacy_ontonotes.zip
217
 
218
 
219
  0%| | 0/34567143 [00:00<?, ?it/s]
220
- 100%|██████████| 34567143/34567143 [00:00<00:00, 159733455.87it/s]
221
- Traceback (most recent call last):
222
  File "/app/chatterbox-multilingual-finetuning/src/finetune_t3.py", line 849, in <module>
223
  main()
224
  ~~~~^^
225
- File "/app/chatterbox-multilingual-finetuning/src/finetune_t3.py", line 616, in main
226
- chatterbox_model = ChatterboxMultilingualTTS.from_pretrained(device="cpu")
227
- File "/app/chatterbox-multilingual-finetuning/src/chatterbox/mtl_tts.py", line 201, in from_pretrained
228
- return cls.from_local(ckpt_dir, device)
229
- ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
230
- File "/app/chatterbox-multilingual-finetuning/src/chatterbox/mtl_tts.py", line 180, in from_local
231
- conds = Conditionals.load(builtin_voice).to(device)
232
- File "/app/chatterbox-multilingual-finetuning/src/chatterbox/mtl_tts.py", line 106, in to
233
- self.t3 = self.t3.to(device)
234
- ~~~~~~~~~~^^^^^^^^
235
- TypeError: T3Cond.to() takes 1 positional argument but 2 were given
1
+ loaded PerthNet (Implicit) at step 250,000
2
 
3
  /usr/local/lib/python3.13/site-packages/perth/perth_net/__init__.py:1: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
4
  from pkg_resources import resource_filename
5
+ 02/07/2026 05:27:34 - INFO - __main__ - Training/evaluation parameters CustomTrainingArguments(
6
  accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False},
7
  adam_beta1=0.9,
8
  adam_beta2=0.999,
 
114
  warmup_steps=1.0,
115
  weight_decay=0.0,
116
  )
117
+ 02/07/2026 05:27:34 - INFO - __main__ - Model parameters ModelArguments(model_name_or_path='ResembleAI/chatterbox', local_model_dir=None, cache_dir=None, freeze_voice_encoder=True, freeze_s3gen=True)
118
+ 02/07/2026 05:27:34 - INFO - __main__ - Data parameters DataArguments(language='hi', dataset_dir=None, metadata_file=None, dataset_name='rahul7star/hindi-speech-dataset', dataset_config_name=None, train_split_name='train', eval_split_name='validation', text_column_name='text_scribe', audio_column_name='audio', max_text_len=256, max_speech_len=800, audio_prompt_duration_s=3.0, eval_split_size=0.0002, preprocessing_num_workers=None, ignore_verifications=False)
119
+ 02/07/2026 05:27:34 - INFO - __main__ - Loading ChatterboxTTS model...
120
+ 02/07/2026 05:27:34 - INFO - __main__ - Loading model from Hugging Face Hub: ResembleAI/chatterbox
121
  /usr/local/lib/python3.13/site-packages/huggingface_hub/utils/_validators.py:202: UserWarning: The `local_dir_use_symlinks` argument is deprecated and ignored in `hf_hub_download`. Downloading to a local directory does not use symlinks anymore.
122
  warnings.warn(
123
+ 02/07/2026 05:27:34 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/ve.safetensors "HTTP/1.1 302 Found"
124
+ 02/07/2026 05:27:34 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/models/ResembleAI/chatterbox/xet-read-token/05e904af2b5c7f8e482687a9d7336c5c824467d9 "HTTP/1.1 200 OK"
125
 
126
 
127
  ve.safetensors: 0%| | 0.00/5.70M [00:00<?, ?B/s]
128
+ ve.safetensors: 100%|██████████| 5.70M/5.70M [00:00<00:00, 25.1MB/s]
129
+ 02/07/2026 05:27:34 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/t3_mtl23ls_v2.safetensors "HTTP/1.1 302 Found"
130
 
131
 
132
  t3_mtl23ls_v2.safetensors: 0%| | 0.00/2.14G [00:00<?, ?B/s]
133
 
134
+ t3_mtl23ls_v2.safetensors: 0%| | 7.60M/2.14G [00:01<07:01, 5.06MB/s]
135
 
136
+ t3_mtl23ls_v2.safetensors: 4%|▎ | 78.8M/2.14G [00:08<03:32, 9.73MB/s]
137
 
138
+ t3_mtl23ls_v2.safetensors: 43%|████▎ | 932M/2.14G [00:09<00:08, 139MB/s]
139
 
140
+ t3_mtl23ls_v2.safetensors: 80%|████████ | 1.72G/2.14G [00:10<00:01, 254MB/s]
141
+ t3_mtl23ls_v2.safetensors: 100%|██████████| 2.14G/2.14G [00:10<00:00, 202MB/s]
142
+ 02/07/2026 05:27:45 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/s3gen.safetensors "HTTP/1.1 302 Found"
 
 
143
 
144
 
145
  s3gen.safetensors: 0%| | 0.00/1.06G [00:00<?, ?B/s]
146
 
147
+ s3gen.safetensors: 6%|▋ | 67.0M/1.06G [00:01<00:24, 40.1MB/s]
148
 
149
+ s3gen.safetensors: 38%|███▊ | 402M/1.06G [00:02<00:03, 172MB/s]
150
+ s3gen.safetensors: 100%|██████████| 1.06G/1.06G [00:03<00:00, 316MB/s]
151
+ 02/07/2026 05:27:48 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/mtl_tokenizer.json "HTTP/1.1 307 Temporary Redirect"
152
+ 02/07/2026 05:27:48 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/mtl_tokenizer.json "HTTP/1.1 200 OK"
153
+ 02/07/2026 05:27:48 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/mtl_tokenizer.json "HTTP/1.1 200 OK"
154
 
155
 
156
  mtl_tokenizer.json: 0%| | 0.00/68.1k [00:00<?, ?B/s]
157
+ mtl_tokenizer.json: 100%|██████████| 68.1k/68.1k [00:00<00:00, 102MB/s]
158
+ 02/07/2026 05:27:48 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/conds.pt "HTTP/1.1 302 Found"
159
 
160
 
161
  conds.pt: 0%| | 0.00/107k [00:00<?, ?B/s]
162
+ conds.pt: 100%|██████████| 107k/107k [00:00<00:00, 1.28MB/s]
163
+ 02/07/2026 05:27:48 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/models/ResembleAI/chatterbox/revision/main "HTTP/1.1 200 OK"
164
 
165
 
166
  Downloading (incomplete total...): 0.00B [00:00, ?B/s]
167
 
168
+ Fetching 5 files: 0%| | 0/5 [00:00<?, ?it/s]02/07/2026 05:27:48 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/conds.pt "HTTP/1.1 302 Found"
 
169
 
 
 
 
170
 
171
+ Downloading (incomplete total...): 0%| | 0.00/107k [00:00<?, ?B/s]02/07/2026 05:27:48 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/t3_mtl23ls_v2.safetensors "HTTP/1.1 302 Found"
172
+ 02/07/2026 05:27:48 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/ve.pt "HTTP/1.1 302 Found"
173
 
174
+ 02/07/2026 05:27:48 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/s3gen.pt "HTTP/1.1 302 Found"
175
 
176
+ Downloading (incomplete total...): 0%| | 0.00/2.15G [00:00<?, ?B/s]
177
 
178
+ Downloading (incomplete total...): 0%| | 0.00/2.15G [00:00<?, ?B/s]
179
 
180
+ Downloading (incomplete total...): 0%| | 0.00/3.21G [00:00<?, ?B/s]02/07/2026 05:27:48 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/grapheme_mtl_merged_expanded_v1.json "HTTP/1.1 307 Temporary Redirect"
181
+ 02/07/2026 05:27:48 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/grapheme_mtl_merged_expanded_v1.json "HTTP/1.1 200 OK"
182
+ 02/07/2026 05:27:48 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/grapheme_mtl_merged_expanded_v1.json "HTTP/1.1 200 OK"
183
 
184
 
185
  Downloading (incomplete total...): 0%| | 0.00/3.21G [00:00<?, ?B/s]
186
 
187
+ Downloading (incomplete total...): 0%| | 13.5M/3.21G [00:03<12:42, 4.19MB/s]
188
+
189
+ Downloading (incomplete total...): 5%|▍ | 148M/3.21G [00:05<01:38, 31.0MB/s] 
190
 
191
+ Downloading (incomplete total...): 24%|██▍ | 768M/3.21G [00:06<00:14, 168MB/s]
192
 
193
+ Downloading (incomplete total...): 34%|███▍ | 1.10G/3.21G [00:11<00:19, 106MB/s]
194
 
195
+ Fetching 5 files: 60%|██████ | 3/5 [00:11<00:07, 3.98s/it]
196
 
197
+ Downloading (incomplete total...): 49%|████▉ | 1.57G/3.21G [00:12<00:10, 159MB/s]
198
 
199
+ Downloading (incomplete total...): 100%|██████████| 3.21G/3.21G [00:13<00:00, 420MB/s]
200
 
201
+ Fetching 5 files: 80%|████████ | 4/5 [00:13<00:03, 3.25s/it]
202
+ Fetching 5 files: 100%|██████████| 5/5 [00:13<00:00, 2.73s/it]
203
 
204
 
205
+ Download complete: 100%|██████████| 3.21G/3.21G [00:13<00:00, 420MB/s]
206
+ Download complete: 100%|██████████| 3.21G/3.21G [00:21<00:00, 146MB/s]
207
  /usr/local/lib/python3.13/site-packages/diffusers/models/lora.py:393: FutureWarning: `LoRACompatibleLinear` is deprecated and will be removed in version 1.0.0. Use of `LoRACompatibleLinear` is deprecated. Please switch to PEFT backend by installing PEFT: `pip install peft`.
208
  deprecate("LoRACompatibleLinear", "1.0.0", deprecation_message)
209
+ 02/07/2026 05:28:12 - INFO - root - input frame rate=25
210
+ 02/07/2026 05:28:14 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/Cangjie5_TC.json "HTTP/1.1 307 Temporary Redirect"
211
+ 02/07/2026 05:28:14 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/Cangjie5_TC.json "HTTP/1.1 200 OK"
212
+ 02/07/2026 05:28:14 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/Cangjie5_TC.json "HTTP/1.1 200 OK"
213
 
214
 
215
  Cangjie5_TC.json: 0%| | 0.00/1.92M [00:00<?, ?B/s]
216
+ Cangjie5_TC.json: 100%|██████████| 1.92M/1.92M [00:00<00:00, 36.6MB/s]
217
  Downloading: "https://github.com/explosion/spacy-pkuseg/releases/download/v0.0.26/spacy_ontonotes.zip" to /root/.pkuseg/spacy_ontonotes.zip
218
 
219
 
220
  0%| | 0/34567143 [00:00<?, ?it/s]
221
+ 100%|██████████| 34567143/34567143 [00:00<00:00, 176772301.39it/s]
222
+ 02/07/2026 05:28:16 - INFO - __main__ - Voice Encoder frozen.
223
+ 02/07/2026 05:28:16 - INFO - __main__ - S3Gen model frozen.
224
+ 02/07/2026 05:28:16 - INFO - __main__ - T3 model set to trainable.
225
+ 02/07/2026 05:28:16 - INFO - __main__ - Loading and processing dataset...
226
+ 02/07/2026 05:28:16 - INFO - __main__ - Loading dataset 'rahul7star/hindi-speech-dataset' from Hugging Face Hub.
227
+ 02/07/2026 05:28:16 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/main/README.md "HTTP/1.1 307 Temporary Redirect"
228
+ 02/07/2026 05:28:16 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/datasets/rahul7star/hindi-speech-dataset/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/README.md "HTTP/1.1 200 OK"
229
+ 02/07/2026 05:28:16 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/resolve-cache/datasets/rahul7star/hindi-speech-dataset/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/README.md "HTTP/1.1 200 OK"
230
+
231
+
232
+ README.md: 0%| | 0.00/591 [00:00<?, ?B/s]
233
+ README.md: 100%|██████████| 591/591 [00:00<00:00, 2.06MB/s]
234
+ 02/07/2026 05:28:16 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/hindi-speech-dataset.py "HTTP/1.1 404 Not Found"
235
+ 02/07/2026 05:28:16 - INFO - httpx - HTTP Request: HEAD https://s3.amazonaws.com/datasets.huggingface.co/datasets/datasets/rahul7star/hindi-speech-dataset/rahul7star/hindi-speech-dataset.py "HTTP/1.1 404 Not Found"
236
+ 02/07/2026 05:28:17 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/datasets/rahul7star/hindi-speech-dataset/revision/0bfd5e2e4555ec80d7dd74b10442836d2e169be6 "HTTP/1.1 200 OK"
237
+ 02/07/2026 05:28:17 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/.huggingface.yaml "HTTP/1.1 404 Not Found"
238
+ 02/07/2026 05:28:17 - INFO - httpx - HTTP Request: GET https://datasets-server.huggingface.co/info?dataset=rahul7star/hindi-speech-dataset "HTTP/1.1 200 OK"
239
+ 02/07/2026 05:28:17 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/datasets/rahul7star/hindi-speech-dataset/tree/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data?recursive=true&expand=false "HTTP/1.1 200 OK"
240
+ 02/07/2026 05:28:17 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/datasets/rahul7star/hindi-speech-dataset/tree/0bfd5e2e4555ec80d7dd74b10442836d2e169be6?recursive=false&expand=false "HTTP/1.1 200 OK"
241
+ 02/07/2026 05:28:17 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/dataset_infos.json "HTTP/1.1 404 Not Found"
242
+
243
+ Downloading data: 0%| | 0/36 [00:00<?, ?files/s]02/07/2026 05:28:17 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00000-of-00036.parquet "HTTP/1.1 302 Found"
244
+ 02/07/2026 05:28:17 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/datasets/rahul7star/hindi-speech-dataset/xet-read-token/0bfd5e2e4555ec80d7dd74b10442836d2e169be6 "HTTP/1.1 200 OK"
245
+
246
+
247
+ data/train-00000-of-00036.parquet: 0%| | 0.00/1.77G [00:00<?, ?B/s]
248
+
249
+ data/train-00000-of-00036.parquet: 0%| | 65.0k/1.77G [00:01<7:37:16, 64.3kB/s]
250
+
251
+ data/train-00000-of-00036.parquet: 7%|▋ | 126M/1.77G [00:02<00:22, 73.9MB/s]
252
+
253
+ data/train-00000-of-00036.parquet: 25%|██▌ | 449M/1.77G [00:03<00:07, 187MB/s]
254
+
255
+ data/train-00000-of-00036.parquet: 43%|████▎ | 762M/1.77G [00:04<00:04, 235MB/s]
256
+
257
+ data/train-00000-of-00036.parquet: 62%|██████▏ | 1.10G/1.77G [00:05<00:02, 272MB/s]
258
+
259
+ data/train-00000-of-00036.parquet: 89%|████████▊ | 1.56G/1.77G [00:06<00:00, 331MB/s]
260
+ data/train-00000-of-00036.parquet: 100%|██████████| 1.77G/1.77G [00:06<00:00, 275MB/s]
261
+
262
+ Downloading data: 3%|▎ | 1/36 [00:06<03:46, 6.47s/files]02/07/2026 05:28:23 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00001-of-00036.parquet "HTTP/1.1 302 Found"
263
+
264
+
265
+ data/train-00001-of-00036.parquet: 0%| | 0.00/496M [00:00<?, ?B/s]
266
+
267
+ data/train-00001-of-00036.parquet: 1%| | 4.05M/496M [00:01<02:01, 4.05MB/s]
268
+
269
+ data/train-00001-of-00036.parquet: 49%|████▊ | 241M/496M [00:02<00:01, 141MB/s]
270
+ data/train-00001-of-00036.parquet: 100%|██████████| 496M/496M [00:02<00:00, 189MB/s]
271
+
272
+ Downloading data: 6%|▌ | 2/36 [00:09<02:23, 4.23s/files]02/07/2026 05:28:26 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00002-of-00036.parquet "HTTP/1.1 302 Found"
273
+
274
+
275
+ data/train-00002-of-00036.parquet: 0%| | 0.00/461M [00:00<?, ?B/s]
276
+
277
+ data/train-00002-of-00036.parquet: 3%|β–Ž | 15.8M/461M [00:01<00:28, 15.7MB/s]
278
+
279
+ data/train-00002-of-00036.parquet: 100%|██████████| 461M/461M [00:02<00:00, 202MB/s]
280
+ data/train-00002-of-00036.parquet: 100%|██████████| 461M/461M [00:02<00:00, 180MB/s]
281
+
282
+ Downloading data: 8%|▊ | 3/36 [00:11<01:54, 3.48s/files]02/07/2026 05:28:29 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00003-of-00036.parquet "HTTP/1.1 302 Found"
283
+
284
+
285
+ data/train-00003-of-00036.parquet: 0%| | 0.00/372M [00:00<?, ?B/s]
286
+
287
+ data/train-00003-of-00036.parquet: 0%| | 1.68M/372M [00:01<03:43, 1.66MB/s]
288
+
289
+ data/train-00003-of-00036.parquet: 100%|██████████| 372M/372M [00:02<00:00, 178MB/s]
290
+ data/train-00003-of-00036.parquet: 100%|██████████| 372M/372M [00:02<00:00, 155MB/s]
291
+
292
+ Downloading data: 11%|█ | 4/36 [00:14<01:38, 3.07s/files]02/07/2026 05:28:31 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00004-of-00036.parquet "HTTP/1.1 302 Found"
293
+
294
+
295
+ data/train-00004-of-00036.parquet: 0%| | 0.00/373M [00:00<?, ?B/s]
296
+
297
+ data/train-00004-of-00036.parquet: 2%|▏ | 8.18M/373M [00:01<00:44, 8.16MB/s]
298
+ data/train-00004-of-00036.parquet: 100%|██████████| 373M/373M [00:01<00:00, 196MB/s]
299
+
300
+ Downloading data: 14%|█▍ | 5/36 [00:16<01:22, 2.67s/files]02/07/2026 05:28:33 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00005-of-00036.parquet "HTTP/1.1 302 Found"
301
+
302
+
303
+ data/train-00005-of-00036.parquet: 0%| | 0.00/841M [00:00<?, ?B/s]
304
+
305
+ data/train-00005-of-00036.parquet: 1%| | 8.37M/841M [00:01<01:40, 8.30MB/s]
306
+
307
+ data/train-00005-of-00036.parquet: 30%|██▉ | 249M/841M [00:02<00:04, 144MB/s]
308
+
309
+ data/train-00005-of-00036.parquet: 84%|████████▍ | 710M/841M [00:03<00:00, 288MB/s]
310
+ data/train-00005-of-00036.parquet: 100%|██████████| 841M/841M [00:03<00:00, 219MB/s]
311
+
312
+ Downloading data: 17%|█▋ | 6/36 [00:19<01:32, 3.08s/files]02/07/2026 05:28:37 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00006-of-00036.parquet "HTTP/1.1 302 Found"
313
+
314
+
315
+ data/train-00006-of-00036.parquet: 0%| | 0.00/837M [00:00<?, ?B/s]
316
+
317
+ data/train-00006-of-00036.parquet: 3%|▎ | 27.2M/837M [00:01<00:29, 27.2MB/s]
318
+
319
+ data/train-00006-of-00036.parquet: 34%|███▍ | 283M/837M [00:02<00:03, 161MB/s]
320
+
321
+ data/train-00006-of-00036.parquet: 100%|██████████| 837M/837M [00:03<00:00, 250MB/s]
322
+ data/train-00006-of-00036.parquet: 100%|██████████| 837M/837M [00:03<00:00, 221MB/s]
323
+
324
+ Downloading data: 19%|█▉ | 7/36 [00:23<01:36, 3.34s/files]02/07/2026 05:28:41 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00007-of-00036.parquet "HTTP/1.1 302 Found"
325
+
326
+
327
+ data/train-00007-of-00036.parquet: 0%| | 0.00/618M [00:00<?, ?B/s]
328
+
329
+ data/train-00007-of-00036.parquet: 0%| | 2.20M/618M [00:01<04:40, 2.20MB/s]
330
+
331
+ data/train-00007-of-00036.parquet: 29%|██▉ | 178M/618M [00:02<00:04, 104MB/s]
332
+
333
+ data/train-00007-of-00036.parquet: 100%|██████████| 618M/618M [00:03<00:00, 224MB/s]
334
+ data/train-00007-of-00036.parquet: 100%|██████████| 618M/618M [00:03<00:00, 186MB/s]
335
+
336
+ Downloading data: 22%|██▏ | 8/36 [00:27<01:33, 3.34s/files]02/07/2026 05:28:44 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00008-of-00036.parquet "HTTP/1.1 302 Found"
337
+
338
+
339
+ data/train-00008-of-00036.parquet: 0%| | 0.00/415M [00:00<?, ?B/s]
340
+
341
+ data/train-00008-of-00036.parquet: 5%|▌ | 21.6M/415M [00:01<00:18, 21.2MB/s]
342
+ data/train-00008-of-00036.parquet: 100%|██████████| 415M/415M [00:01<00:00, 214MB/s]
343
+
344
+ Downloading data: 25%|██▌ | 9/36 [00:29<01:18, 2.91s/files]02/07/2026 05:28:46 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00009-of-00036.parquet "HTTP/1.1 302 Found"
345
+
346
+
347
+ data/train-00009-of-00036.parquet: 0%| | 0.00/303M [00:00<?, ?B/s]
348
+
349
+ data/train-00009-of-00036.parquet: 5%|▍ | 14.1M/303M [00:01<00:20, 14.0MB/s]
350
+ data/train-00009-of-00036.parquet: 100%|██████████| 303M/303M [00:01<00:00, 193MB/s]
351
+
352
+ Downloading data: 28%|██▊ | 10/36 [00:30<01:05, 2.50s/files]02/07/2026 05:28:48 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00010-of-00036.parquet "HTTP/1.1 302 Found"
353
+
354
+
355
+ data/train-00010-of-00036.parquet: 0%| | 0.00/341M [00:00<?, ?B/s]
356
+
357
+ data/train-00010-of-00036.parquet: 12%|β–ˆβ– | 40.2M/341M [00:01<00:07, 40.1MB/s]
358
+ data/train-00010-of-00036.parquet: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 341M/341M [00:01<00:00, 182MB/s]
359
+
360
+ Downloading data: 31%|β–ˆβ–ˆβ–ˆ | 11/36 [00:32<00:58, 2.34s/files]02/07/2026 05:28:50 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00011-of-00036.parquet "HTTP/1.1 302 Found"
361
+
362
+
363
+ data/train-00011-of-00036.parquet: 0%| | 0.00/440M [00:00<?, ?B/s]
364
+
365
+ data/train-00011-of-00036.parquet: 2%|▏ | 10.8M/440M [00:01<00:39, 10.8MB/s]
366
+
367
+ data/train-00011-of-00036.parquet: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 440M/440M [00:02<00:00, 244MB/s] 
368
+ data/train-00011-of-00036.parquet: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 440M/440M [00:02<00:00, 211MB/s]
369
+
370
+ Downloading data: 33%|███▎ | 12/36 [00:34<00:54, 2.27s/files]02/07/2026 05:28:52 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00012-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00012-of-00036.parquet: 0%| | 0.00/412M [00:00<?, ?B/s]
+
+ data/train-00012-of-00036.parquet: 9%|▊ | 35.9M/412M [00:01<00:10, 35.9MB/s]
+ data/train-00012-of-00036.parquet: 100%|██████████| 412M/412M [00:01<00:00, 240MB/s]
+
+ Downloading data: 36%|███▌ | 13/36 [00:36<00:48, 2.11s/files]02/07/2026 05:28:53 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00013-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00013-of-00036.parquet: 0%| | 0.00/397M [00:00<?, ?B/s]
+
+ data/train-00013-of-00036.parquet: 6%|▌ | 24.1M/397M [00:01<00:15, 24.0MB/s]
+ data/train-00013-of-00036.parquet: 100%|██████████| 397M/397M [00:01<00:00, 221MB/s]
+
+ Downloading data: 39%|███▉ | 14/36 [00:38<00:44, 2.02s/files]02/07/2026 05:28:55 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00014-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00014-of-00036.parquet: 0%| | 0.00/397M [00:00<?, ?B/s]
+
+ data/train-00014-of-00036.parquet: 2%|▏ | 8.19M/397M [00:01<00:47, 8.12MB/s]
+
+ data/train-00014-of-00036.parquet: 35%|███▌ | 139M/397M [00:02<00:03, 80.0MB/s]
+ data/train-00014-of-00036.parquet: 100%|██████████| 397M/397M [00:02<00:00, 136MB/s]
+
+ Downloading data: 42%|████▏ | 15/36 [00:41<00:48, 2.31s/files]02/07/2026 05:28:58 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00015-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00015-of-00036.parquet: 0%| | 0.00/389M [00:00<?, ?B/s]
+
+ data/train-00015-of-00036.parquet: 1%| | 2.39M/389M [00:01<02:42, 2.38MB/s]
+
+ data/train-00015-of-00036.parquet: 83%|████████▎ | 322M/389M [00:02<00:00, 186MB/s]
+ data/train-00015-of-00036.parquet: 100%|██████████| 389M/389M [00:02<00:00, 181MB/s]
+
+ Downloading data: 44%|████▍ | 16/36 [00:43<00:45, 2.27s/files]02/07/2026 05:29:00 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00016-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00016-of-00036.parquet: 0%| | 0.00/384M [00:00<?, ?B/s]
+
+ data/train-00016-of-00036.parquet: 0%| | 14.6k/384M [00:01<9:36:49, 11.1kB/s]
+
+ data/train-00016-of-00036.parquet: 100%|██████████| 384M/384M [00:02<00:00, 178MB/s]
+ data/train-00016-of-00036.parquet: 100%|██████████| 384M/384M [00:02<00:00, 151MB/s]
+
+ Downloading data: 47%|████▋ | 17/36 [00:46<00:44, 2.36s/files]02/07/2026 05:29:03 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00017-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00017-of-00036.parquet: 0%| | 0.00/441M [00:00<?, ?B/s]
+
+ data/train-00017-of-00036.parquet: 0%| | 8.34k/441M [00:01<14:43:47, 8.32kB/s]
+
+ data/train-00017-of-00036.parquet: 69%|██████▉ | 304M/441M [00:02<00:00, 178MB/s]
+ data/train-00017-of-00036.parquet: 100%|██████████| 441M/441M [00:02<00:00, 161MB/s]
+
+ Downloading data: 50%|█████ | 18/36 [00:48<00:44, 2.48s/files]02/07/2026 05:29:06 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00018-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00018-of-00036.parquet: 0%| | 0.00/571M [00:00<?, ?B/s]
+
+ data/train-00018-of-00036.parquet: 4%|▍ | 25.1M/571M [00:01<00:21, 25.0MB/s]
+
+ data/train-00018-of-00036.parquet: 67%|██████▋ | 382M/571M [00:02<00:00, 220MB/s]
+ data/train-00018-of-00036.parquet: 100%|██████████| 571M/571M [00:03<00:00, 168MB/s]
+
+ Downloading data: 53%|█████▎ | 19/36 [00:52<00:47, 2.77s/files]02/07/2026 05:29:09 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00019-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00019-of-00036.parquet: 0%| | 0.00/413M [00:00<?, ?B/s]
+
+ data/train-00019-of-00036.parquet: 10%|▉ | 40.5M/413M [00:01<00:09, 40.5MB/s]
+ data/train-00019-of-00036.parquet: 100%|██████████| 413M/413M [00:01<00:00, 207MB/s]
+
+ Downloading data: 56%|█████▌ | 20/36 [00:54<00:40, 2.55s/files]02/07/2026 05:29:11 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00020-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00020-of-00036.parquet: 0%| | 0.00/417M [00:00<?, ?B/s]
+
+ data/train-00020-of-00036.parquet: 7%|▋ | 29.3M/417M [00:01<00:13, 29.2MB/s]
+
+ data/train-00020-of-00036.parquet: 100%|██████████| 417M/417M [00:02<00:00, 221MB/s]
+ data/train-00020-of-00036.parquet: 100%|██████████| 417M/417M [00:02<00:00, 194MB/s]
+
+ Downloading data: 58%|█████▊ | 21/36 [00:56<00:36, 2.43s/files]02/07/2026 05:29:13 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00021-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00021-of-00036.parquet: 0%| | 0.00/401M [00:00<?, ?B/s]
+
+ data/train-00021-of-00036.parquet: 5%|▍ | 18.6M/401M [00:01<00:20, 18.5MB/s]
+
+ data/train-00021-of-00036.parquet: 100%|██████████| 401M/401M [00:02<00:00, 210MB/s]
+ data/train-00021-of-00036.parquet: 100%|██████████| 401M/401M [00:02<00:00, 184MB/s]
+
+ Downloading data: 61%|██████ | 22/36 [00:58<00:33, 2.36s/files]02/07/2026 05:29:16 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00022-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00022-of-00036.parquet: 0%| | 0.00/373M [00:00<?, ?B/s]
+
+ data/train-00022-of-00036.parquet: 4%|▍ | 16.3M/373M [00:01<00:22, 16.2MB/s]
+ data/train-00022-of-00036.parquet: 100%|██████████| 373M/373M [00:01<00:00, 194MB/s]
+
+ Downloading data: 64%|██████▍ | 23/36 [01:00<00:29, 2.24s/files]02/07/2026 05:29:18 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00023-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00023-of-00036.parquet: 0%| | 0.00/374M [00:00<?, ?B/s]
+
+ data/train-00023-of-00036.parquet: 4%|▍ | 14.6M/374M [00:01<00:24, 14.5MB/s]
+
+ data/train-00023-of-00036.parquet: 100%|██████████| 374M/374M [00:02<00:00, 186MB/s]
+ data/train-00023-of-00036.parquet: 100%|██████████| 374M/374M [00:02<00:00, 164MB/s]
+
+ Downloading data: 67%|██████▋ | 24/36 [01:02<00:27, 2.26s/files]02/07/2026 05:29:20 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00024-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00024-of-00036.parquet: 0%| | 0.00/531M [00:00<?, ?B/s]
+
+ data/train-00024-of-00036.parquet: 2%|▏ | 9.64M/531M [00:01<00:54, 9.54MB/s]
+
+ data/train-00024-of-00036.parquet: 61%|██████▏ | 326M/531M [00:02<00:01, 189MB/s]
+ data/train-00024-of-00036.parquet: 100%|██████████| 531M/531M [00:02<00:00, 192MB/s]
+
+ Downloading data: 69%|██████▉ | 25/36 [01:05<00:26, 2.43s/files]02/07/2026 05:29:23 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00025-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00025-of-00036.parquet: 0%| | 0.00/542M [00:00<?, ?B/s]
+
+ data/train-00025-of-00036.parquet: 2%|▏ | 10.4M/542M [00:01<00:51, 10.4MB/s]
+
+ data/train-00025-of-00036.parquet: 55%|█████▌ | 299M/542M [00:02<00:01, 174MB/s]
+
+ data/train-00025-of-00036.parquet: 100%|██████████| 542M/542M [00:03<00:00, 196MB/s]
+ data/train-00025-of-00036.parquet: 100%|██████████| 542M/542M [00:03<00:00, 175MB/s]
+
+ Downloading data: 72%|███████▏ | 26/36 [01:08<00:26, 2.64s/files]02/07/2026 05:29:26 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00026-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00026-of-00036.parquet: 0%| | 0.00/434M [00:00<?, ?B/s]
+
+ data/train-00026-of-00036.parquet: 3%|▎ | 13.9M/434M [00:01<00:30, 13.9MB/s]
+
+ data/train-00026-of-00036.parquet: 100%|██████████| 434M/434M [00:02<00:00, 236MB/s]
+ data/train-00026-of-00036.parquet: 100%|██████████| 434M/434M [00:02<00:00, 204MB/s]
+
+ Downloading data: 75%|███████▌ | 27/36 [01:11<00:22, 2.49s/files]02/07/2026 05:29:28 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00027-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00027-of-00036.parquet: 0%| | 0.00/366M [00:00<?, ?B/s]
+
+ data/train-00027-of-00036.parquet: 7%|▋ | 24.2M/366M [00:01<00:14, 24.1MB/s]
+ data/train-00027-of-00036.parquet: 100%|██████████| 366M/366M [00:01<00:00, 193MB/s]
+
+ Downloading data: 78%|███████▊ | 28/36 [01:12<00:18, 2.32s/files]02/07/2026 05:29:30 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00028-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00028-of-00036.parquet: 0%| | 0.00/374M [00:00<?, ?B/s]
+
+ data/train-00028-of-00036.parquet: 0%| | 73.2k/374M [00:01<1:31:08, 68.4kB/s]
+ data/train-00028-of-00036.parquet: 100%|██████████| 374M/374M [00:02<00:00, 183MB/s]
+
+ Downloading data: 81%|████████ | 29/36 [01:15<00:15, 2.25s/files]02/07/2026 05:29:32 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00029-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00029-of-00036.parquet: 0%| | 0.00/402M [00:00<?, ?B/s]
+
+ data/train-00029-of-00036.parquet: 7%|▋ | 26.9M/402M [00:01<00:13, 26.9MB/s]
+ data/train-00029-of-00036.parquet: 100%|██████████| 402M/402M [00:01<00:00, 223MB/s]
+
+ Downloading data: 83%|████████▎ | 30/36 [01:16<00:12, 2.12s/files]02/07/2026 05:29:34 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00030-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00030-of-00036.parquet: 0%| | 0.00/400M [00:00<?, ?B/s]
+
+ data/train-00030-of-00036.parquet: 3%|▎ | 10.9M/400M [00:01<00:35, 10.9MB/s]
+
+ data/train-00030-of-00036.parquet: 100%|██████████| 400M/400M [00:02<00:00, 171MB/s]
+ data/train-00030-of-00036.parquet: 100%|██████████| 400M/400M [00:02<00:00, 153MB/s]
+
+ Downloading data: 86%|████████▌ | 31/36 [01:19<00:11, 2.28s/files]02/07/2026 05:29:36 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00031-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00031-of-00036.parquet: 0%| | 0.00/382M [00:00<?, ?B/s]
+
+ data/train-00031-of-00036.parquet: 0%| | 359k/382M [00:01<17:52, 356kB/s]
+
+ data/train-00031-of-00036.parquet: 87%|████████▋ | 334M/382M [00:02<00:00, 196MB/s]
+ data/train-00031-of-00036.parquet: 100%|██████████| 382M/382M [00:02<00:00, 170MB/s]
+
+ Downloading data: 89%|████████▉ | 32/36 [01:21<00:09, 2.28s/files]02/07/2026 05:29:39 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00032-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00032-of-00036.parquet: 0%| | 0.00/380M [00:00<?, ?B/s]
+
+ data/train-00032-of-00036.parquet: 0%| | 94.5k/380M [00:01<1:07:00, 94.4kB/s]
+
+ data/train-00032-of-00036.parquet: 100%|██████████| 380M/380M [00:02<00:00, 170MB/s]
+ data/train-00032-of-00036.parquet: 100%|██████████| 380M/380M [00:02<00:00, 150MB/s]
+
+ Downloading data: 92%|█████████▏| 33/36 [01:24<00:07, 2.37s/files]02/07/2026 05:29:41 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00033-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00033-of-00036.parquet: 0%| | 0.00/376M [00:00<?, ?B/s]
+
+ data/train-00033-of-00036.parquet: 1%|▏ | 4.83M/376M [00:01<01:17, 4.81MB/s]
+ data/train-00033-of-00036.parquet: 100%|██████████| 376M/376M [00:01<00:00, 188MB/s]
+
+ Downloading data: 94%|█████████▍| 34/36 [01:26<00:04, 2.27s/files]02/07/2026 05:29:43 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00034-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00034-of-00036.parquet: 0%| | 0.00/372M [00:00<?, ?B/s]
+
+ data/train-00034-of-00036.parquet: 2%|▏ | 6.35M/372M [00:01<00:57, 6.33MB/s]
+ data/train-00034-of-00036.parquet: 100%|██████████| 372M/372M [00:01<00:00, 192MB/s]
+
+ Downloading data: 97%|█████████▋| 35/36 [01:28<00:02, 2.18s/files]02/07/2026 05:29:45 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/train-00035-of-00036.parquet "HTTP/1.1 302 Found"
+
+
+ data/train-00035-of-00036.parquet: 0%| | 0.00/373M [00:00<?, ?B/s]
+
+ data/train-00035-of-00036.parquet: 3%|▎ | 9.34M/373M [00:01<00:39, 9.30MB/s]
+
+ data/train-00035-of-00036.parquet: 100%|██████████| 373M/373M [00:02<00:00, 210MB/s]
+ data/train-00035-of-00036.parquet: 100%|██████████| 373M/373M [00:02<00:00, 181MB/s]
+
+ Downloading data: 100%|██████████| 36/36 [01:30<00:00, 2.15s/files]
+ Downloading data: 100%|██████████| 36/36 [01:30<00:00, 2.51s/files]
+ 02/07/2026 05:29:47 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/validation-00000-of-00001.parquet "HTTP/1.1 302 Found"
+
+
+ data/validation-00000-of-00001.parquet: 0%| | 0.00/162M [00:00<?, ?B/s]
+
+ data/validation-00000-of-00001.parquet: 37%|███▋ | 59.6M/162M [00:01<00:01, 58.8MB/s]
+ data/validation-00000-of-00001.parquet: 100%|██████████| 162M/162M [00:01<00:00, 131MB/s]
+ 02/07/2026 05:29:49 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/datasets/rahul7star/hindi-speech-dataset/resolve/0bfd5e2e4555ec80d7dd74b10442836d2e169be6/data/test-00000-of-00001.parquet "HTTP/1.1 302 Found"
+
+
+ data/test-00000-of-00001.parquet: 0%| | 0.00/306M [00:00<?, ?B/s]
+
+ data/test-00000-of-00001.parquet: 3%|▎ | 9.32M/306M [00:01<00:31, 9.29MB/s]
+
+ data/test-00000-of-00001.parquet: 100%|██████████| 306M/306M [00:02<00:00, 158MB/s]
+ data/test-00000-of-00001.parquet: 100%|██████████| 306M/306M [00:02<00:00, 138MB/s]
+
+
+ Generating train split: 0%| | 0/145152 [00:00<?, ? examples/s]
+
+ Generating train split: 0%| | 200/145152 [00:01<17:39, 136.87 examples/s]
+
+ Generating train split: 0%| | 500/145152 [00:02<12:29, 193.04 examples/s]
+
+ Generating train split: 1%| | 800/145152 [00:03<11:21, 211.79 examples/s]
+
+ Generating train split: 1%| | 1100/145152 [00:05<10:46, 222.75 examples/s]
+
+ Generating train split: 1%| | 1400/145152 [00:06<10:17, 232.66 examples/s]
+
+ Generating train split: 1%| | 1700/145152 [00:07<10:12, 234.05 examples/s]
+
+ Generating train split: 1%|▏ | 2000/145152 [00:09<10:15, 232.74 examples/s]
+
+ Generating train split: 3%|▎ | 3800/145152 [00:10<03:36, 651.42 examples/s]
+
+ Generating train split: 4%|▍ | 5632/145152 [00:11<02:24, 963.12 examples/s]
+
+ Generating train split: 5%|▌ | 7832/145152 [00:12<01:48, 1271.33 examples/s]
+
+ Generating train split: 7%|▋ | 9864/145152 [00:13<01:31, 1473.19 examples/s]
+
+ Generating train split: 8%|▊ | 12196/145152 [00:14<01:18, 1694.99 examples/s]
+
+ Generating train split: 10%|█ | 14796/145152 [00:15<01:07, 1922.04 examples/s]
+
+ Generating train split: 12%|█▏ | 17628/145152 [00:16<00:58, 2168.74 examples/s]
+
+ Generating train split: 14%|█▍ | 20360/145152 [00:17<00:54, 2298.53 examples/s]
+
+ Generating train split: 16%|█▌ | 22760/145152 [00:19<01:10, 1742.80 examples/s]
+
+ Generating train split: 17%|█▋ | 24792/145152 [00:21<01:19, 1518.67 examples/s]
+
+ Generating train split: 18%|█▊ | 26592/145152 [00:22<01:24, 1401.27 examples/s]
+
+ Generating train split: 20%|█▉ | 28324/145152 [00:24<01:29, 1311.22 examples/s]
+
+ Generating train split: 21%|██ | 29824/145152 [00:25<01:25, 1345.45 examples/s]
+
+ Generating train split: 22%|██▏ | 31624/145152 [00:26<01:18, 1447.83 examples/s]
+
+ Generating train split: 23%|██▎ | 33456/145152 [00:27<01:12, 1537.19 examples/s]
+
+ Generating train split: 25%|██▌ | 36688/145152 [00:28<00:55, 1938.33 examples/s]
+
+ Generating train split: 28%|██▊ | 40420/145152 [00:29<00:43, 2382.77 examples/s]
+
+ Generating train split: 30%|███ | 43720/145152 [00:30<00:39, 2554.01 examples/s]
+
+ Generating train split: 32%|███▏ | 46452/145152 [00:31<00:39, 2484.61 examples/s]
+
+ Generating train split: 34%|███▎ | 48984/145152 [00:33<00:39, 2413.22 examples/s]
+
+ Generating train split: 36%|███▌ | 51684/145152 [00:34<00:38, 2440.25 examples/s]
+
+ Generating train split: 38%|███▊ | 54516/145152 [00:35<00:36, 2470.40 examples/s]
+
+ Generating train split: 39%|███▉ | 57048/145152 [00:36<00:35, 2465.42 examples/s]
+
+ Generating train split: 41%|████ | 59748/145152 [00:37<00:34, 2500.85 examples/s]
+
+ Generating train split: 43%|████▎ | 62580/145152 [00:38<00:32, 2561.25 examples/s]
+
+ Generating train split: 45%|████▌ | 65412/145152 [00:39<00:30, 2585.41 examples/s]
+
+ Generating train split: 47%|████▋ | 68112/145152 [00:40<00:29, 2582.27 examples/s]
+
+ Generating train split: 49%|████▉ | 70944/145152 [00:41<00:28, 2611.47 examples/s]
+
+ Generating train split: 51%|█████ | 73576/145152 [00:43<00:30, 2339.59 examples/s]
+
+ Generating train split: 52%|█████▏ | 75976/145152 [00:44<00:32, 2152.89 examples/s]
+
+ Generating train split: 54%|█████▍ | 78408/145152 [00:45<00:30, 2176.73 examples/s]
+
+ Generating train split: 56%|█████▌ | 80940/145152 [00:46<00:28, 2241.15 examples/s]
+
+ Generating train split: 58%|█████▊ | 83640/145152 [00:47<00:26, 2319.10 examples/s]
+
+ Generating train split: 59%|█████▉ | 86172/145152 [00:48<00:24, 2363.50 examples/s]
+
+ Generating train split: 61%|██████▏ | 89004/145152 [00:49<00:23, 2434.11 examples/s]
+
+ Generating train split: 63%|██████▎ | 92004/145152 [00:50<00:21, 2530.53 examples/s]
+
+ Generating train split: 65%|██████▌ | 94836/145152 [00:51<00:19, 2576.81 examples/s]
+
+ Generating train split: 67%|██████▋ | 97468/145152 [00:52<00:18, 2561.68 examples/s]
+
+ Generating train split: 69%|██████▉ | 100068/145152 [00:54<00:19, 2354.85 examples/s]
+
+ Generating train split: 71%|███████ | 102600/145152 [00:55<00:19, 2192.23 examples/s]
+
+ Generating train split: 72%|███████▏ | 104932/145152 [00:56<00:18, 2118.58 examples/s]
+
+ Generating train split: 74%|███████▍ | 107132/145152 [00:57<00:17, 2122.32 examples/s]
+
+ Generating train split: 76%|███████▌ | 110064/145152 [00:58<00:15, 2331.15 examples/s]
+
+ Generating train split: 78%|███████▊ | 112996/145152 [00:59<00:13, 2444.29 examples/s]
+
+ Generating train split: 80%|███████▉ | 115896/145152 [01:00<00:11, 2510.08 examples/s]
+
+ Generating train split: 82%|████████▏ | 118428/145152 [01:02<00:10, 2479.28 examples/s]
+
+ Generating train split: 83%|████████▎ | 121060/145152 [01:03<00:09, 2511.85 examples/s]
+
+ Generating train split: 85%|████████▌ | 123660/145152 [01:04<00:08, 2514.99 examples/s]
+
+ Generating train split: 87%|████████▋ | 126192/145152 [01:05<00:07, 2496.21 examples/s]
+
+ Generating train split: 89%|████████▉ | 128892/145152 [01:06<00:06, 2542.71 examples/s]
+
+ Generating train split: 91%|█████████ | 131724/145152 [01:07<00:05, 2560.40 examples/s]
+
+ Generating train split: 93%|█████████▎| 134556/145152 [01:08<00:04, 2632.84 examples/s]
+
+ Generating train split: 95%|█████████▍| 137388/145152 [01:09<00:02, 2644.04 examples/s]
+
+ Generating train split: 97%|█████████▋| 140088/145152 [01:10<00:01, 2656.71 examples/s]
+
+ Generating train split: 98%|█████████▊| 142920/145152 [01:11<00:00, 2685.45 examples/s]
+ Generating train split: 100%|██████████| 145152/145152 [01:12<00:00, 2013.67 examples/s]
+
+
+ Generating validation split: 0%| | 0/239 [00:00<?, ? examples/s]
+ Generating validation split: 100%|██████████| 239/239 [00:00<00:00, 253.06 examples/s]
+
+
+ Generating test split: 0%| | 0/418 [00:00<?, ? examples/s]
+
+ Generating test split: 48%|████▊ | 200/418 [00:01<00:01, 160.93 examples/s]
+ Generating test split: 100%|██████████| 418/418 [00:01<00:00, 242.41 examples/s]
+ 02/07/2026 05:31:06 - INFO - __main__ - *** Training T3 model ***
+
+
+ 0%| | 0/145152 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/app/chatterbox-multilingual-finetuning/src/finetune_t3.py", line 849, in <module>
  main()
  ~~~~^^
+ File "/app/chatterbox-multilingual-finetuning/src/finetune_t3.py", line 796, in main
+ train_result = trainer_instance.train(
+ resume_from_checkpoint=training_args.resume_from_checkpoint
+ )
+ File "/usr/local/lib/python3.13/site-packages/transformers/trainer.py", line 2170, in train
+ return inner_training_loop(
+ args=args,
+ ...<2 lines>...
+ ignore_keys_for_eval=ignore_keys_for_eval,
+ )
+ File "/usr/local/lib/python3.13/site-packages/transformers/trainer.py", line 2442, in _inner_training_loop
+ self._evaluate(trial, ignore_keys_for_eval, skip_scheduler=True)
+ ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ File "/usr/local/lib/python3.13/site-packages/transformers/trainer.py", line 2970, in _evaluate
+ metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
+ File "/usr/local/lib/python3.13/site-packages/transformers/trainer.py", line 4290, in evaluate
+ output = self.evaluation_loop(
+ eval_dataloader,
+ ...<5 lines>...
+ metric_key_prefix=metric_key_prefix,
+ )
+ File "/usr/local/lib/python3.13/site-packages/transformers/trainer.py", line 4468, in evaluation_loop
+ for step, inputs in enumerate(dataloader):
+ ~~~~~~~~~^^^^^^^^^^^^
+ File "/usr/local/lib/python3.13/site-packages/accelerate/data_loader.py", line 567, in __iter__
+ current_batch = next(dataloader_iter)
+ File "/usr/local/lib/python3.13/site-packages/torch/utils/data/dataloader.py", line 741, in __next__
+ data = self._next_data()
+ File "/usr/local/lib/python3.13/site-packages/torch/utils/data/dataloader.py", line 1548, in _next_data
+ return self._process_data(data, worker_id)
+ ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
+ File "/usr/local/lib/python3.13/site-packages/torch/utils/data/dataloader.py", line 1586, in _process_data
+ data.reraise()
+ ~~~~~~~~~~~~^^
+ File "/usr/local/lib/python3.13/site-packages/torch/_utils.py", line 775, in reraise
+ raise exception
+ ImportError: Caught ImportError in DataLoader worker process 0.
+ Original Traceback (most recent call last):
+ File "/usr/local/lib/python3.13/site-packages/torch/utils/data/_utils/worker.py", line 358, in _worker_loop
+ data = fetcher.fetch(index) # type: ignore[possibly-undefined]
+ File "/usr/local/lib/python3.13/site-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
+ data = [self.dataset[idx] for idx in possibly_batched_index]
+ ~~~~~~~~~~~~^^^^^
+ File "/app/chatterbox-multilingual-finetuning/src/finetune_t3.py", line 239, in __getitem__
+ wav_16k, text = self._load_audio_text_from_item(idx)
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^
+ File "/app/chatterbox-multilingual-finetuning/src/finetune_t3.py", line 187, in _load_audio_text_from_item
+ item = self.dataset_source[idx]
+ ~~~~~~~~~~~~~~~~~~~^^^^^
+ File "/usr/local/lib/python3.13/site-packages/datasets/arrow_dataset.py", line 2878, in __getitem__
+ return self._getitem(key)
+ ~~~~~~~~~~~~~^^^^^
+ File "/usr/local/lib/python3.13/site-packages/datasets/arrow_dataset.py", line 2860, in _getitem
+ formatted_output = format_table(
+ pa_subtable, key, formatter=formatter, format_columns=format_columns, output_all_columns=output_all_columns
+ )
+ File "/usr/local/lib/python3.13/site-packages/datasets/formatting/formatting.py", line 658, in format_table
+ return formatter(pa_table, query_type=query_type)
+ File "/usr/local/lib/python3.13/site-packages/datasets/formatting/formatting.py", line 411, in __call__
+ return self.format_row(pa_table)
+ ~~~~~~~~~~~~~~~^^^^^^^^^^
+ File "/usr/local/lib/python3.13/site-packages/datasets/formatting/formatting.py", line 460, in format_row
+ row = self.python_features_decoder.decode_row(row)
+ File "/usr/local/lib/python3.13/site-packages/datasets/formatting/formatting.py", line 224, in decode_row
+ return self.features.decode_example(row, token_per_repo_id=self.token_per_repo_id) if self.features else row
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ File "/usr/local/lib/python3.13/site-packages/datasets/features/features.py", line 2111, in decode_example
+ column_name: decode_nested_example(feature, value, token_per_repo_id=token_per_repo_id)
+ ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ File "/usr/local/lib/python3.13/site-packages/datasets/features/features.py", line 1419, in decode_nested_example
+ return schema.decode_example(obj, token_per_repo_id=token_per_repo_id) if obj is not None else None
+ ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ File "/usr/local/lib/python3.13/site-packages/datasets/features/audio.py", line 186, in decode_example
+ raise ImportError("To support decoding audio data, please install 'torchcodec'.")
+ ImportError: To support decoding audio data, please install 'torchcodec'.
+
+
+ 0%| | 0/145152 [00:02<?, ?it/s]