jnjj committed on
Commit 9cf0f22 · verified · 1 Parent(s): 782e45f

Periodic upload

Files changed (3)
  1. README.md +4 -4
  2. model.safetensors +1 -1
  3. training.log +50 -0
README.md CHANGED
@@ -9,7 +9,7 @@ library_name: transformers
 
 ## Progreso de Entrenamiento
 
-- **Datasets procesados:** 40.0
-- **Ejemplos de texto procesados:** 120.0
-- **Tokens procesados:** 37085.0
-- **Última subida:** 2025-05-06 14:44:14 UTC
+- **Datasets procesados:** 41.0
+- **Ejemplos de texto procesados:** 123.0
+- **Tokens procesados:** 38134.0
+- **Última subida:** 2025-05-06 14:45:17 UTC
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b766940185bc50163423216b429620b8500277bbfaacddcbf523439a3a434270
+oid sha256:9a6cc39be8760d896c41d4321d1cd81bb976b34928f69a0dfffd8a4b0807f996
 size 51957256
training.log CHANGED
@@ -258,3 +258,53 @@ ValueError: Compression type zstd not supported
 2025-05-06 16:43:33,126 INFO: Finished training and saved model/tokenizer for deepmind/aqua_rat config raw
 2025-05-06 16:43:38,702 INFO: Starting model update for allenai/c4, config: en
 2025-05-06 16:43:40,978 INFO: Finished training and saved model/tokenizer for allenai/c4 config en
+2025-05-06 16:44:17,507 INFO: Upload successful.
+2025-05-06 16:44:20,888 ERROR: Failed to get configs for gaia-benchmark/GAIA: Dataset 'gaia-benchmark/GAIA' is a gated dataset on the Hub. Visit the dataset page at https://huggingface.co/datasets/gaia-benchmark/GAIA to ask for access.
+2025-05-06 16:44:21,431 INFO: Preparing data for HuggingFaceH4/MATH-500, config: default
+2025-05-06 16:44:22,903 INFO: Starting model update for HuggingFaceH4/MATH-500, config: default
+2025-05-06 16:44:24,553 INFO: Finished training and saved model/tokenizer for HuggingFaceH4/MATH-500 config default
+2025-05-06 16:44:24,554 ERROR: Error in background_training_loop task scheduling: local variable 'merged_model' referenced before assignment
+2025-05-06 16:44:24,623 ERROR: Failed to get configs for cais/hle: Dataset 'cais/hle' is a gated dataset on the Hub. Visit the dataset page at https://huggingface.co/datasets/cais/hle to ask for access.
+2025-05-06 16:44:30,479 INFO: Preparing data for MLCommons/unsupervised_peoples_speech, config: default
+2025-05-06 16:44:43,691 ERROR: Error during data preparation for MLCommons/unsupervised_peoples_speech config default: To support encoding audio data, please install 'soundfile'.
+Traceback (most recent call last):
+  File "/usr/local/lib/python3.10/site-packages/datasets/features/audio.py", line 88, in encode_example
+    import soundfile as sf  # soundfile is a dependency of librosa, needed to decode audio files.
+ModuleNotFoundError: No module named 'soundfile'
+
+The above exception was the direct cause of the following exception:
+
+Traceback (most recent call last):
+  File "/home/user/app/app.py", line 233, in process_and_train
+    first_item = await asyncio.to_thread(lambda: next(iter(train_ds_instance), None))
+  File "/usr/local/lib/python3.10/asyncio/threads.py", line 25, in to_thread
+    return await loop.run_in_executor(None, func_call)
+  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
+    result = self.fn(*self.args, **self.kwargs)
+  File "/home/user/app/app.py", line 233, in <lambda>
+    first_item = await asyncio.to_thread(lambda: next(iter(train_ds_instance), None))
+  File "/usr/local/lib/python3.10/site-packages/datasets/iterable_dataset.py", line 2266, in __iter__
+    for key, example in ex_iterable:
+  File "/usr/local/lib/python3.10/site-packages/datasets/iterable_dataset.py", line 222, in __iter__
+    for key_example in islice(self.generate_examples_fn(**gen_kwags), shard_example_idx_start, None):
+  File "/usr/local/lib/python3.10/site-packages/datasets/packaged_modules/generator/generator.py", line 33, in _generate_examples
+    yield from enumerate(self.config.generator(**gen_kwargs))
+  File "/home/user/app/app.py", line 214, in gen_data_for_cfg
+    for ex in dataset_split:
+  File "/usr/local/lib/python3.10/site-packages/datasets/iterable_dataset.py", line 2266, in __iter__
+    for key, example in ex_iterable:
+  File "/usr/local/lib/python3.10/site-packages/datasets/iterable_dataset.py", line 1869, in __iter__
+    example = _apply_feature_types_on_example(
+  File "/usr/local/lib/python3.10/site-packages/datasets/iterable_dataset.py", line 1779, in _apply_feature_types_on_example
+    encoded_example = features.encode_example(example)
+  File "/usr/local/lib/python3.10/site-packages/datasets/features/features.py", line 2049, in encode_example
+    return encode_nested_example(self, example)
+  File "/usr/local/lib/python3.10/site-packages/datasets/features/features.py", line 1292, in encode_nested_example
+    {k: encode_nested_example(schema[k], obj.get(k), level=level + 1) for k in schema}
+  File "/usr/local/lib/python3.10/site-packages/datasets/features/features.py", line 1292, in <dictcomp>
+    {k: encode_nested_example(schema[k], obj.get(k), level=level + 1) for k in schema}
+  File "/usr/local/lib/python3.10/site-packages/datasets/features/features.py", line 1362, in encode_nested_example
+    return schema.encode_example(obj) if obj is not None else None
+  File "/usr/local/lib/python3.10/site-packages/datasets/features/audio.py", line 90, in encode_example
+    raise ImportError("To support encoding audio data, please install 'soundfile'.") from err
+ImportError: To support encoding audio data, please install 'soundfile'.
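The `local variable 'merged_model' referenced before assignment` error logged at 16:44:24 is Python's classic UnboundLocalError pattern: a name bound only on one branch (or inside a `try`) and then read unconditionally. The app's actual code is not shown in this commit; the snippet below is a purely illustrative reproduction and fix, with hypothetical function names:

```python
def finish_update(ok: bool):
    if ok:
        merged_model = "model"  # name is bound only on this branch
    return merged_model  # UnboundLocalError when ok is False


def finish_update_fixed(ok: bool):
    merged_model = None  # default binding before any branching
    if ok:
        merged_model = "model"
    return merged_model  # always bound, caller can check for None
```

Binding a default before the conditional (or raising explicitly when the branch was skipped) keeps the failure out of unrelated scheduling code, where the log shows it surfacing.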
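The final traceback shows `datasets` failing deep inside iteration because the optional `soundfile` package is missing; installing it (`pip install soundfile`) resolves the error. A defensive sketch that checks for the dependency up front instead of letting `encode_example` raise mid-stream; the helper names and the skip-instead-of-crash policy are assumptions, not the app's actual code:

```python
import importlib.util


def audio_deps_available() -> bool:
    """True when the optional audio stack `datasets` needs is importable.

    datasets imports soundfile lazily inside Audio.encode_example, so without
    this check the failure only surfaces once iteration reaches an audio example.
    """
    return importlib.util.find_spec("soundfile") is not None


def prepare_dataset(name: str, has_audio: bool) -> str:
    """Decide up front whether an audio dataset can be processed this run."""
    if has_audio and not audio_deps_available():
        return f"skipped {name}: install 'soundfile' to encode audio data"
    return f"prepared {name}"
```

Checking once at startup would also let the loop log a single actionable warning rather than a 40-frame traceback per audio dataset.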