| ==> Checking internet connectivity... |
| ==> Internet + pip OK |
| ==> Installing dependencies |
| ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. |
| torchaudio 2.5.1+cu124 requires torch==2.5.1, but you have torch 2.6.0+cu124 which is incompatible. |
| WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning. |
| ==> CUDA OK (torch 2.6.0+cu124, CUDA 12.4, NVIDIA GeForce RTX 3090) |
| ==> System info: |
| GPU: NVIDIA GeForce RTX 3090, 24576 MiB, 565.57.01 |
| RAM: 125Gi |
| CPU: 32 cores |
| Disk: 50G total, 46G free |
| ==> Cloning font-model repo |
| Cloning into 'font-model'... |
| ==> Downloading dataset from HuggingFace: dchen0/font_crops_test |
| Fetching 3 files: 100%|██████████| 3/3 [00:01<00:00, 2.56it/s] |
| ==> Extracting data/train.tar... |
| ==> Extracting data/test.tar... |
| ==> Dataset ready: 3 train variants, 3 test variants |
| overlay 50G 4.1G 46G 9% / |
|
|
| ============================================ |
| Training: resnet50 (GPUs: 1) |
| ============================================ |
| 2026-03-31 03:19:53 - INFO - Loading dataset from data - train_model.py:163 |
| 2026-03-31 03:19:53 - INFO - Found 3 labels - train_model.py:167 |
| 2026-03-31 03:19:53 - INFO - Setting up image processor and augmentations - train_model.py:177 |
| 2026-03-31 03:19:53 - INFO - HTTP Request: HEAD https://huggingface.co/facebook/dinov2-base-imagenet1k-1-layer/resolve/main/processor_config.json "HTTP/1.1 404 Not Found" - _client.py:1025 |
| 2026-03-31 03:19:53 - INFO - HTTP Request: HEAD https://huggingface.co/facebook/dinov2-base-imagenet1k-1-layer/resolve/main/preprocessor_config.json "HTTP/1.1 307 Temporary Redirect" - _client.py:1025 |
| 2026-03-31 03:19:53 - INFO - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/facebook/dinov2-base-imagenet1k-1-layer/f9305d2c8048bd65783f64fabfa25429d13cbdbb/preprocessor_config.json "HTTP/1.1 200 OK" - _client.py:1025 |
| 2026-03-31 03:19:53 - INFO - HTTP Request: GET https://huggingface.co/api/resolve-cache/models/facebook/dinov2-base-imagenet1k-1-layer/f9305d2c8048bd65783f64fabfa25429d13cbdbb/preprocessor_config.json "HTTP/1.1 200 OK" - _client.py:1025 |
| 2026-03-31 03:19:53 - INFO - HTTP Request: HEAD https://huggingface.co/facebook/dinov2-base-imagenet1k-1-layer/resolve/main/processor_config.json "HTTP/1.1 404 Not Found" - _client.py:1025 |
| 2026-03-31 03:19:53 - INFO - HTTP Request: HEAD https://huggingface.co/facebook/dinov2-base-imagenet1k-1-layer/resolve/main/preprocessor_config.json "HTTP/1.1 307 Temporary Redirect" - _client.py:1025 |
| 2026-03-31 03:19:53 - INFO - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/facebook/dinov2-base-imagenet1k-1-layer/f9305d2c8048bd65783f64fabfa25429d13cbdbb/preprocessor_config.json "HTTP/1.1 200 OK" - _client.py:1025 |
| 2026-03-31 03:19:54 - INFO - HTTP Request: HEAD https://s3.amazonaws.com/datasets.huggingface.co/datasets/datasets/imagefolder/imagefolder.py "HTTP/1.1 404 Not Found" - _client.py:1025 |
| Downloading data: 100%|██████████| 30/30 [00:00<00:00, 72232.56files/s] |
| Generating train split: 30 examples [00:00, 8651.02 examples/s] |
| Generating test split: 9 examples [00:00, 7425.01 examples/s] |
| 2026-03-31 03:19:54 - INFO - Train size: 30, Validation size: 9 - train_model.py:187 |
| 2026-03-31 03:19:54 - INFO - Applying data transformations - train_model.py:191 |
| Transforming training data: 100%|██████████| 30/30 [00:00<00:00, 179.41 examples/s] |
| Transforming test data: 100%|██████████| 9/9 [00:00<00:00, 268.43 examples/s] |
| 2026-03-31 03:19:54 - INFO - Data preprocessing complete - train_model.py:207 |
| 2026-03-31 03:19:54 - INFO - Loading ResNet-50 (ImageNet-pretrained) as CNN baseline - train_model.py:211 |
| Downloading: "https://download.pytorch.org/models/resnet50-11ad3fa6.pth" to /root/.cache/torch/hub/checkpoints/resnet50-11ad3fa6.pth |
| 100%|██████████| 97.8M/97.8M [00:00<00:00, 106MB/s] |
| 2026-03-31 03:19:56 - INFO - trainable params: 23,514,179 || all params: 23,514,179 || trainable%: 100.0000 - train_model.py:237 |
| 2026-03-31 03:19:56 - INFO - Setting up training arguments - train_model.py:295 |
| 2026-03-31 03:19:56 - INFO - Using device: cuda - train_model.py:298 |
| `logging_dir` is deprecated and will be removed in v5.2. Please set `TENSORBOARD_LOGGING_DIR` instead. |
| 2026-03-31 03:19:56 - INFO - Starting training - train_model.py:337 |
| 100%|██████████| 1/1 [00:00<00:00, 466.86it/s] |
| 100%|██████████| 1/1 [00:02<00:00, 2.28s/it] |
| 2026-03-31 03:19:59 - INFO - Training complete - train_model.py:343 |
| 2026-03-31 03:19:59 - INFO - Saving result model to the output directory - train_model.py:350 |
| {'eval_loss': '1.079', 'eval_accuracy': '0.3333', 'eval_runtime': '0.5487', 'eval_samples_per_second': '16.4', 'eval_steps_per_second': '1.823', 'epoch': '1'} |
| {'train_runtime': '2.249', 'train_samples_per_second': '13.34', 'train_steps_per_second': '0.445', 'train_loss': '1.062', 'epoch': '1'} |
| ==> Finished: resnet50 |
|
|
| ============================================ |
| ALL TRAINING COMPLETE |
| Results in: /workspace/output/ |
| ============================================ |
| ==> Uploading results to HuggingFace: dchen0/font-model-dry-run |
|
Processing Files (0 / 0) : | | 0.00B / 0.00B |
|
New Data Upload : | | 0.00B / 0.00B [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A
Processing Files (2 / 2) : 0%| | 15.2kB / 189MB, 25.4kB/s |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 0%| | 92.1kB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 1%| | 556kB / 94.3MB [A[A[A[A[A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 0%| | 92.1kB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 1%| | 556kB / 94.3MB [A[A[A[A[A
Processing Files (2 / 4) : 0%| | 663kB / 189MB, 664kB/s |
|
New Data Upload : 0%| | 556kB / 134MB, 556kB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 0%| | 92.1kB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 1%| | 556kB / 94.3MB [A[A[A[A[A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 0%| | 184kB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 8%|β | 7.36MB / 94.3MB [A[A[A[A[A
Processing Files (2 / 4) : 4%|β | 7.55MB / 189MB, 5.40MB/s |
|
New Data Upload : 1%| | 1.11MB / 134MB, 795kB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 1%| | 552kB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 10%|β | 9.58MB / 94.3MB [A[A[A[A[A
Processing Files (2 / 4) : 5%|β | 10.1MB / 189MB, 6.34MB/s |
|
New Data Upload : 2%|β | 3.34MB / 134MB, 2.09MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 1%| | 829kB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 12%|ββ | 11.2MB / 94.3MB [A[A[A[A[A
Processing Files (2 / 4) : 6%|β | 12.1MB / 189MB, 6.72MB/s |
|
New Data Upload : 4%|β | 5.00MB / 134MB, 2.78MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 1%|β | 1.38MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 15%|ββ | 14.6MB / 94.3MB [A[A[A[A[A
Processing Files (2 / 4) : 8%|β | 16.0MB / 189MB, 7.99MB/s |
|
New Data Upload : 6%|β | 8.34MB / 134MB, 4.17MB/s [A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 3%|β | 2.39MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 22%|βββ | 20.7MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A
Processing Files (3 / 5) : 12%|ββ | 23.1MB / 189MB, 10.5MB/s |
|
New Data Upload : 11%|β | 14.5MB / 134MB, 6.57MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 3%|β | 3.04MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 26%|βββ | 24.6MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A
Processing Files (3 / 5) : 15%|ββ | 27.6MB / 189MB, 11.5MB/s |
|
New Data Upload : 14%|ββ | 18.3MB / 134MB, 7.65MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 4%|β | 4.23MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 34%|ββββ | 31.8MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A
Processing Files (3 / 5) : 19%|ββ | 36.1MB / 189MB, 13.9MB/s |
|
New Data Upload : 19%|ββ | 25.6MB / 134MB, 9.84MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 5%|β | 4.79MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 37%|ββββ | 35.2MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A
Processing Files (3 / 5) : 21%|ββ | 40.0MB / 189MB, 14.3MB/s |
|
New Data Upload : 22%|βββ | 28.9MB / 134MB, 10.3MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 6%|β | 5.98MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 45%|βββββ | 42.4MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A
Processing Files (3 / 5) : 26%|βββ | 48.4MB / 189MB, 16.1MB/s |
|
New Data Upload : 27%|βββ | 36.1MB / 134MB, 12.0MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 7%|β | 6.63MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 49%|βββββ | 46.3MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A
Processing Files (3 / 5) : 28%|βββ | 52.9MB / 189MB, 16.5MB/s |
|
New Data Upload : 30%|βββ | 40.0MB / 134MB, 12.5MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 8%|β | 7.92MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 57%|ββββββ | 54.1MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A
Processing Files (3 / 5) : 33%|ββββ | 62.0MB / 189MB, 18.2MB/s |
|
New Data Upload : 36%|ββββ | 47.8MB / 134MB, 14.1MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 9%|β | 8.47MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 61%|ββββββ | 57.4MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A
Processing Files (3 / 5) : 35%|ββββ | 65.9MB / 189MB, 18.3MB/s |
|
New Data Upload : 38%|ββββ | 51.2MB / 134MB, 14.2MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 10%|β | 9.57MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 68%|βββββββ | 64.1MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A
Processing Files (3 / 5) : 39%|ββββ | 73.7MB / 189MB, 19.4MB/s |
|
New Data Upload : 43%|βββββ | 57.8MB / 134MB, 15.2MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 11%|β | 10.2MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 72%|ββββββββ | 68.0MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A
Processing Files (3 / 5) : 41%|βββββ | 78.2MB / 189MB, 19.6MB/s |
|
New Data Upload : 46%|βββββ | 61.7MB / 134MB, 15.4MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 12%|ββ | 11.0MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 77%|ββββββββ | 73.0MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A
Processing Files (3 / 5) : 45%|βββββ | 84.0MB / 189MB, 20.0MB/s |
|
New Data Upload : 50%|βββββ | 66.7MB / 134MB, 15.9MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 12%|ββ | 11.0MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 77%|ββββββββ | 73.0MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 12%|ββ | 11.0MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 77%|ββββββββ | 73.0MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 12%|ββ | 11.0MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 77%|ββββββββ | 73.0MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 12%|ββ | 11.0MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 77%|ββββββββ | 73.0MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 27%|βββ | 25.3MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 78%|ββββββββ | 73.3MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A
Processing Files (3 / 5) : 52%|ββββββ | 98.6MB / 189MB, 19.0MB/s |
|
New Data Upload : 44%|βββββ | 67.1MB / 151MB, 12.9MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 31%|βββ | 29.2MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 78%|ββββββββ | 73.7MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A
Processing Files (3 / 5) : 55%|ββββββ | 103MB / 189MB, 19.1MB/s |
|
New Data Upload : 47%|βββββ | 71.0MB / 151MB, 13.1MB/s [A |
|
|
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/optimizer.pt: 3%|β | 53.0B / 1.58kB [A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...point-1/training_args.bin: 3%|β | 164B / 4.86kB [A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...t_model/training_args.bin: 3%|β | 164B / 4.86kB [A[A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...927196.dadb3a5bb633.549.0: 3%|β | 156B / 4.63kB [A[A[A[A[A[A[A[A[A[A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 35%|ββββ | 32.6MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 79%|ββββββββ | 74.5MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/optimizer.pt: 3%|β | 53.0B / 1.58kB [A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...point-1/training_args.bin: 3%|β | 164B / 4.86kB [A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...t_model/training_args.bin: 3%|β | 164B / 4.86kB [A[A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...927196.dadb3a5bb633.549.0: 3%|β | 156B / 4.63kB [A[A[A[A[A[A[A[A[A[A
Processing Files (3 / 9) : 57%|ββββββ | 107MB / 189MB, 19.1MB/s |
|
New Data Upload : 50%|βββββ | 74.9MB / 151MB, 13.4MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 39%|ββββ | 36.5MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 79%|ββββββββ | 74.9MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/optimizer.pt: 3%|β | 53.0B / 1.58kB [A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...point-1/training_args.bin: 3%|β | 164B / 4.86kB [A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...t_model/training_args.bin: 3%|β | 164B / 4.86kB [A[A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...927196.dadb3a5bb633.549.0: 3%|β | 156B / 4.63kB [A[A[A[A[A[A[A[A[A[A
Processing Files (3 / 9) : 59%|ββββββ | 111MB / 189MB, 19.2MB/s |
|
New Data Upload : 52%|ββββββ | 78.8MB / 151MB, 13.6MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 46%|βββββ | 43.8MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 81%|ββββββββ | 76.5MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/optimizer.pt: 10%|β | 159B / 1.58kB [A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...point-1/training_args.bin: 10%|β | 492B / 4.86kB [A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...t_model/training_args.bin: 10%|β | 492B / 4.86kB [A[A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...927196.dadb3a5bb633.549.0: 10%|β | 469B / 4.63kB [A[A[A[A[A[A[A[A[A[A
Processing Files (3 / 9) : 64%|βββββββ | 120MB / 189MB, 20.1MB/s |
|
New Data Upload : 58%|ββββββ | 87.1MB / 151MB, 14.5MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 51%|βββββ | 48.1MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 85%|βββββββββ | 79.9MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/optimizer.pt: 30%|βββ | 479B / 1.58kB [A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...point-1/training_args.bin: 30%|βββ | 1.48kB / 4.86kB [A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...t_model/training_args.bin: 30%|βββ | 1.48kB / 4.86kB [A[A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...927196.dadb3a5bb633.549.0: 30%|βββ | 1.41kB / 4.63kB [A[A[A[A[A[A[A[A[A[A
Processing Files (3 / 9) : 68%|βββββββ | 128MB / 189MB, 20.7MB/s |
|
New Data Upload : 63%|βββββββ | 94.4MB / 151MB, 15.2MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 59%|ββββββ | 55.5MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 87%|βββββββββ | 82.1MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/optimizer.pt: 41%|ββββ | 639B / 1.58kB [A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...point-1/training_args.bin: 41%|ββββ | 1.97kB / 4.86kB [A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...t_model/training_args.bin: 41%|ββββ | 1.97kB / 4.86kB [A[A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...927196.dadb3a5bb633.549.0: 41%|ββββ | 1.88kB / 4.63kB [A[A[A[A[A[A[A[A[A[A
Processing Files (3 / 9) : 73%|ββββββββ | 138MB / 189MB, 21.5MB/s |
|
New Data Upload : 69%|βββββββ | 103MB / 151MB, 16.1MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 64%|βββββββ | 59.9MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 91%|βββββββββ | 85.9MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/optimizer.pt: 64%|βββββββ | 1.01kB / 1.58kB [A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...point-1/training_args.bin: 64%|βββββββ | 3.12kB / 4.86kB [A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...t_model/training_args.bin: 64%|βββββββ | 3.12kB / 4.86kB [A[A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...927196.dadb3a5bb633.549.0: 64%|βββββββ | 2.97kB / 4.63kB [A[A[A[A[A[A[A[A[A[A
Processing Files (3 / 9) : 77%|ββββββββ | 146MB / 189MB, 22.1MB/s |
|
New Data Upload : 74%|ββββββββ | 111MB / 151MB, 16.8MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 72%|ββββββββ | 67.5MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 95%|ββββββββββ| 89.6MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/optimizer.pt: 85%|βββββββββ | 1.33kB / 1.58kB [A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...point-1/training_args.bin: 85%|βββββββββ | 4.11kB / 4.86kB [A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...t_model/training_args.bin: 85%|βββββββββ | 4.11kB / 4.86kB [A[A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...927196.dadb3a5bb633.549.0: 85%|βββββββββ | 3.91kB / 4.63kB [A[A[A[A[A[A[A[A[A[A
Processing Files (3 / 9) : 83%|βββββββββ | 157MB / 189MB, 23.1MB/s |
|
New Data Upload : 81%|ββββββββ | 122MB / 151MB, 17.9MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 76%|ββββββββ | 71.7MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 97%|ββββββββββ| 91.9MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/optimizer.pt: 98%|ββββββββββ| 1.54kB / 1.58kB [A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...point-1/training_args.bin: 98%|ββββββββββ| 4.76kB / 4.86kB [A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...t_model/training_args.bin: 98%|ββββββββββ| 4.76kB / 4.86kB [A[A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...927196.dadb3a5bb633.549.0: 98%|ββββββββββ| 4.54kB / 4.63kB [A[A[A[A[A[A[A[A[A[A
Processing Files (3 / 9) : 87%|βββββββββ | 164MB / 189MB, 23.4MB/s |
|
New Data Upload : 85%|βββββββββ | 128MB / 151MB, 18.3MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 84%|βββββββββ | 78.9MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 98%|ββββββββββ| 92.6MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/optimizer.pt: 98%|ββββββββββ| 1.54kB / 1.58kB [A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...point-1/training_args.bin: 98%|ββββββββββ| 4.76kB / 4.86kB [A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...t_model/training_args.bin: 98%|ββββββββββ| 4.76kB / 4.86kB [A[A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...927196.dadb3a5bb633.549.0: 98%|ββββββββββ| 4.54kB / 4.63kB [A[A[A[A[A[A[A[A[A[A
Processing Files (3 / 9) : 91%|βββββββββ | 172MB / 189MB, 23.8MB/s |
|
New Data Upload : 90%|βββββββββ | 135MB / 151MB, 18.8MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 88%|βββββββββ | 82.8MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 99%|ββββββββββ| 93.0MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/optimizer.pt: 98%|ββββββββββ| 1.54kB / 1.58kB [A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...point-1/training_args.bin: 98%|ββββββββββ| 4.76kB / 4.86kB [A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...t_model/training_args.bin: 98%|ββββββββββ| 4.76kB / 4.86kB [A[A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...927196.dadb3a5bb633.549.0: 98%|ββββββββββ| 4.54kB / 4.63kB [A[A[A[A[A[A[A[A[A[A
Processing Files (3 / 9) : 93%|ββββββββββ| 176MB / 189MB, 23.8MB/s |
|
New Data Upload : 92%|ββββββββββ| 139MB / 151MB, 18.8MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 96%|ββββββββββ| 90.6MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 100%|ββββββββββ| 94.0MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/optimizer.pt: 100%|ββββββββββ| 1.58kB / 1.58kB [A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...point-1/training_args.bin: 100%|ββββββββββ| 4.86kB / 4.86kB [A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...t_model/training_args.bin: 100%|ββββββββββ| 4.86kB / 4.86kB [A[A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...927196.dadb3a5bb633.549.0: 100%|ββββββββββ| 4.63kB / 4.63kB [A[A[A[A[A[A[A[A[A[A
Processing Files (7 / 9) : 98%|ββββββββββ| 185MB / 189MB, 24.3MB/s |
|
New Data Upload : 98%|ββββββββββ| 147MB / 151MB, 19.4MB/s [A |
|
|
|
...50/checkpoint-1/scaler.pt: 100%|ββββββββββ| 988B / 988B [A[A |
|
|
|
|
|
...heckpoint-1/rng_state.pth: 100%|ββββββββββ| 14.2kB / 14.2kB [A[A[A |
|
|
|
|
|
|
|
...point-1/model.safetensors: 100%|ββββββββββ| 93.9MB / 94.3MB [A[A[A[A |
|
|
|
|
|
|
|
|
|
...t_model/model.safetensors: 100%|ββββββββββ| 94.3MB / 94.3MB [A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/scheduler.pt: 100%|ββββββββββ| 1.06kB / 1.06kB [A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
...checkpoint-1/optimizer.pt: 100%|ββββββββββ| 1.58kB / 1.58kB [A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...point-1/training_args.bin: 100%|ββββββββββ| 4.86kB / 4.86kB [A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...t_model/training_args.bin: 100%|ββββββββββ| 4.86kB / 4.86kB [A[A[A[A[A[A[A[A[A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
...927196.dadb3a5bb633.549.0: 100%|ββββββββββ| 4.63kB / 4.63kB [A[A[A[A[A[A[A[A[A[A
Processing Files (9 / 9) : 100%|██████████| 189MB / 189MB, 19.2MB/s |
|
New Data Upload : 100%|██████████| 151MB / 151MB, 15.4MB/s |
|
...50/checkpoint-1/scaler.pt: 100%|██████████| 988B / 988B |
|
...heckpoint-1/rng_state.pth: 100%|██████████| 14.2kB / 14.2kB |
|
...point-1/model.safetensors: 100%|██████████| 94.3MB / 94.3MB |
|
...t_model/model.safetensors: 100%|██████████| 94.3MB / 94.3MB |
|
...checkpoint-1/scheduler.pt: 100%|██████████| 1.06kB / 1.06kB |
|
...checkpoint-1/optimizer.pt: 100%|██████████| 1.58kB / 1.58kB |
|
...point-1/training_args.bin: 100%|██████████| 4.86kB / 4.86kB |
|
...t_model/training_args.bin: 100%|██████████| 4.86kB / 4.86kB |
|
...927196.dadb3a5bb633.549.0: 100%|██████████| 4.63kB / 4.63kB |
| Upload complete. |
| ==> Uploading training log to HuggingFace... |
|
|