ent/ssl_dino/swin_tiny/train.log
[2025-08-15 05:14:14,328][INFO] Args: {
  "accelerator": "auto",
  "batch_size": 32,
  "callbacks": null,
  "checkpoint": null,
  "data": "data/kyucapsule",
  "devices": "auto",
  "embed_dim": null,
  "epochs": 300,
  "loader_args": null,
  "loggers": {
    "wandb": {
      "project": "ent-endoscopy-ssl"
    }
  },
  "method": "dino",
  "method_args": null,
  "model": "SwinTransformer",
  "model_args": null,
  "num_nodes": 1,
  "num_workers": 64,
  "optim": "auto",
  "optim_args": null,
  "out": "outputs/ssl_dino/swin_tiny",
  "overwrite": true,
  "precision": "32-true",
  "resume": true,
  "seed": 0,
  "strategy": "auto",
  "trainer_args": null,
  "transform_args": {
    "image_size": [
      224,
      224
    ],
    "local_view": {
      "num_views": 0
    }
  }
}
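The "Args" JSON above mirrors the keyword arguments of LightlyTrain's train() entry point. A minimal sketch of the call that would produce this run, assuming lightly_train 0.6.1 and that train() accepts these kwargs exactly as logged:
----------------------------------------------------------------------------------------
import lightly_train

# Sketch: reproduce the logged run (kwargs mirror the "Args" JSON above).
lightly_train.train(
    out="outputs/ssl_dino/swin_tiny",       # output directory
    data="data/kyucapsule",                 # folder of unlabeled images
    model="SwinTransformer",                # backbone, as logged
    method="dino",                          # self-supervised method
    epochs=300,
    batch_size=32,
    num_workers=64,
    loggers={"wandb": {"project": "ent-endoscopy-ssl"}},
    transform_args={"image_size": (224, 224), "local_view": {"num_views": 0}},
    overwrite=True,
    resume=True,
    seed=0,
)
----------------------------------------------------------------------------------------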
[2025-08-15 05:14:14,328][INFO] Using output directory '/workspace/ent-labotary/outputs/ssl_dino/swin_tiny'.
[2025-08-15 05:14:14,563][DEBUG] '/usr/local/lib/python3.12/dist-packages/lightly_train' is not a git repository.
[2025-08-15 05:14:14,573][DEBUG] Platform: Linux-6.5.0-18-generic-x86_64-with-glibc2.39
[2025-08-15 05:14:14,574][DEBUG] Python: 3.12.3
[2025-08-15 05:14:14,574][DEBUG] LightlyTrain: 0.6.1
[2025-08-15 05:14:14,574][DEBUG] LightlyTrain Git Information:
[2025-08-15 05:14:14,574][DEBUG] LightlyTrain is not installed from a git repository.
[2025-08-15 05:14:14,574][DEBUG] Run directory Git Information:
[2025-08-15 05:14:14,574][DEBUG] Branch: feat/hfup
[2025-08-15 05:14:14,574][DEBUG] Commit: a47f8b8122fc2efd40b0fdb826c07983a260b074
[2025-08-15 05:14:14,574][DEBUG] Uncommitted changes: ?? src/experiment/.ssl_dino.py.swp
[2025-08-15 05:14:14,574][DEBUG] Dependencies:
[2025-08-15 05:14:14,574][DEBUG] - torch 2.8.0
[2025-08-15 05:14:14,574][DEBUG] - torchvision 0.23.0
[2025-08-15 05:14:14,574][DEBUG] - pytorch-lightning 2.5.3
[2025-08-15 05:14:14,574][DEBUG] - Pillow 11.3.0
[2025-08-15 05:14:14,574][DEBUG] - pillow-simd x
[2025-08-15 05:14:14,574][DEBUG] Optional dependencies:
[2025-08-15 05:14:14,574][DEBUG] - super-gradients x
[2025-08-15 05:14:14,574][DEBUG] - timm 1.0.19
[2025-08-15 05:14:14,574][DEBUG] - ultralytics x
[2025-08-15 05:14:14,574][DEBUG] - wandb 0.21.1
[2025-08-15 05:14:14,574][DEBUG] CPUs: 384
[2025-08-15 05:14:14,574][DEBUG] GPUs: 1
[2025-08-15 05:14:14,574][DEBUG] - NVIDIA GeForce RTX 5090 12.0 (33669251072)
[2025-08-15 05:14:14,574][DEBUG] Environment variables:
[2025-08-15 05:14:14,575][DEBUG] Getting transform args for method 'dino'.
[2025-08-15 05:14:14,575][DEBUG] Using additional transform arguments {'image_size': (224, 224), 'local_view': {'num_views': 0}}.
[2025-08-15 05:14:14,576][DEBUG] Getting transform for method 'dino'.
[2025-08-15 05:14:14,582][DEBUG] Making sure data directory '/workspace/ent-labotary/data/kyucapsule' exists and is not empty.
[2025-08-15 05:14:14,582][INFO] Initializing dataset from '/workspace/ent-labotary/data/kyucapsule'.
[2025-08-15 05:14:14,582][DEBUG] Writing filenames to '/tmp/tmp1kte4dge' (chunk_size=10000)
[2025-08-15 05:14:15,124][DEBUG] Creating memory mapped sequence with 18481 'filenames'.
[2025-08-15 05:14:15,124][DEBUG] Found dataset size 18481.
[2025-08-15 05:14:15,124][DEBUG] Getting embedding model with embedding dimension None.
[2025-08-15 05:14:15,125][DEBUG] Using jsonl logger with args flush_logs_every_n_steps=100
[2025-08-15 05:14:15,126][DEBUG] Using tensorboard logger with args name='' version='' log_graph=False default_hp_metric=True prefix='' sub_dir=None
[2025-08-15 05:14:15,127][DEBUG] Using wandb logger with args name=None version=None offline=False anonymous=None project='ent-endoscopy-ssl' log_model=False prefix='' checkpoint_name=None
[2025-08-15 05:14:15,127][DEBUG] Using loggers ['JSONLLogger', 'TensorBoardLogger', 'WandbLogger'].
[2025-08-15 05:14:15,128][DEBUG] Getting accelerator for 'auto'.
[2025-08-15 05:14:15,128][DEBUG] CUDA is available, defaulting to CUDA.
[2025-08-15 05:14:15,128][DEBUG] Detected 1 devices.
[2025-08-15 05:14:15,128][DEBUG] Using strategy 'auto'.
[2025-08-15 05:14:15,128][DEBUG] Getting trainer.
[2025-08-15 05:14:15,128][DEBUG] Using sync_batchnorm 'True'.
[2025-08-15 05:14:15,140][INFO] GPU available: True (cuda), used: True
[2025-08-15 05:14:15,141][INFO] TPU available: False, using: 0 TPU cores
[2025-08-15 05:14:15,141][INFO] HPU available: False, using: 0 HPUs
[2025-08-15 05:14:15,141][DEBUG] Detected 1 nodes and 1 devices per node.
[2025-08-15 05:14:15,141][DEBUG] Total number of devices: 1.
[2025-08-15 05:14:15,141][DEBUG] Detected dataset size 18481.
[2025-08-15 05:14:15,141][DEBUG] Using batch size per device 32.
[2025-08-15 05:14:15,141][DEBUG] Using optimizer 'OptimizerType.SGD'.
[2025-08-15 05:14:15,141][DEBUG] Getting method args for 'DINO'
[2025-08-15 05:14:15,142][DEBUG] Getting method for 'DINO'
[2025-08-15 05:14:15,194][WARNING] /usr/local/lib/python3.12/dist-packages/torch/nn/utils/weight_norm.py:144: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`.
WeightNorm.apply(module, name, dim)
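The FutureWarning above is triggered by the weight-normalized last layer of the DINO projection head. A minimal sketch of the replacement PyTorch suggests, using the parametrization-based API:
----------------------------------------------------------------------------------------
import torch
from torch.nn.utils.parametrizations import weight_norm

# Deprecated: torch.nn.utils.weight_norm(linear)
# Replacement: the parametrization-based API with the same call shape.
linear = weight_norm(torch.nn.Linear(256, 1024, bias=False))
print(linear.parametrizations.weight.original0.shape)  # magnitude term (g)
----------------------------------------------------------------------------------------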
[2025-08-15 05:14:15,783][INFO] Resolved configuration:
{
  "accelerator": "CUDAAccelerator",
  "batch_size": 32,
  "callbacks": {
    "device_stats_monitor": {},
    "early_stopping": {
      "check_finite": true,
      "monitor": "train_loss",
      "patience": 1000000000000
    },
    "learning_rate_monitor": {},
    "model_checkpoint": {
      "enable_version_counter": false,
      "every_n_epochs": null,
      "save_last": true,
      "save_top_k": 1
    }
  },
  "checkpoint": null,
  "data": "data/kyucapsule",
  "devices": 1,
  "embed_dim": null,
  "epochs": 300,
  "loader_args": null,
  "loggers": {
    "jsonl": {
      "flush_logs_every_n_steps": 100
    },
    "tensorboard": {
      "default_hp_metric": true,
      "log_graph": false,
      "name": "",
      "prefix": "",
      "sub_dir": null,
      "version": ""
    },
    "wandb": {
      "anonymous": null,
      "checkpoint_name": null,
      "log_model": false,
      "name": null,
      "offline": false,
      "prefix": "",
      "project": "ent-endoscopy-ssl",
      "version": null
    }
  },
  "method": "dino",
  "method_args": {
    "batch_norm": false,
    "bottleneck_dim": 256,
    "center_momentum": 0.9,
    "hidden_dim": 2048,
    "momentum_end": 1.0,
    "momentum_start": 0.99,
    "norm_last_layer": true,
    "output_dim": 1024,
    "student_freeze_last_layer_epochs": 1,
    "student_temp": 0.1,
    "teacher_temp": 0.02,
    "warmup_teacher_temp": 0.02,
    "warmup_teacher_temp_epochs": 30,
    "weight_decay_end": 0.0001,
    "weight_decay_start": 0.0001
  },
  "model": "SwinTransformer",
  "model_args": null,
  "num_nodes": 1,
  "num_workers": 64,
  "optim": "sgd",
  "optim_args": {
    "lr": 0.03,
    "momentum": 0.9,
    "weight_decay": 0.0001
  },
  "out": "outputs/ssl_dino/swin_tiny",
  "overwrite": true,
  "precision": "32-true",
  "resume": true,
  "seed": 0,
  "strategy": "SingleDeviceStrategy",
  "trainer_args": null,
  "transform_args": {
    "color_jitter": {
      "brightness": 0.8,
      "contrast": 0.8,
      "hue": 0.2,
      "prob": 0.8,
      "saturation": 0.4,
      "strength": 0.5
    },
    "gaussian_blur": {
      "blur_limit": 0,
      "prob": 1.0,
      "sigmas": [
        0.1,
        2.0
      ]
    },
    "global_view_1": {
      "gaussian_blur": {
        "blur_limit": 0,
        "prob": 0.1,
        "sigmas": [
          0.1,
          2.0
        ]
      },
      "solarize": {
        "prob": 0.2,
        "threshold": 0.5
      }
    },
    "image_size": [
      224,
      224
    ],
    "local_view": {
      "gaussian_blur": {
        "blur_limit": 0,
        "prob": 0.5,
        "sigmas": [
          0.1,
          2.0
        ]
      },
      "num_views": 0,
      "random_resize": {
        "max_scale": 0.14,
        "min_scale": 0.05
      },
      "view_size": [
        96,
        96
      ]
    },
    "normalize": {
      "mean": [
        0.485,
        0.456,
        0.406
      ],
      "std": [
        0.229,
        0.224,
        0.225
      ]
    },
    "random_flip": {
      "horizontal_prob": 0.5,
      "vertical_prob": 0.0
    },
    "random_gray_scale": 0.2,
    "random_resize": {
      "max_scale": 1.0,
      "min_scale": 0.14
    },
    "random_rotation": null,
    "solarize": null
  }
}
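The method_args above encode DINO's three schedules: the teacher is an exponential moving average of the student with momentum ramped from momentum_start (0.99) to momentum_end (1.0), weight decay runs from weight_decay_start to weight_decay_end (constant at 1e-4 here), and the teacher temperature warms up over warmup_teacher_temp_epochs (a no-op here, since warmup and final temperature are both 0.02). A sketch of the ramp, assuming the cosine form from the original DINO recipe rather than lightly's exact implementation:
----------------------------------------------------------------------------------------
import math

def cosine_schedule(step, total_steps, start, end):
    # Cosine ramp from `start` to `end`, as used for DINO's EMA momentum.
    progress = step / max(1, total_steps)
    return end + (start - end) * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 300 * (18481 // 32)                      # epochs * steps per epoch (from this log)
print(cosine_schedule(0, total, 0.99, 1.0))      # 0.99 at the first step
print(cosine_schedule(total, total, 0.99, 1.0))  # 1.0 at the last step

# Teacher EMA update at momentum m (sketch):
# for p_t, p_s in zip(teacher.parameters(), student.parameters()):
#     p_t.data = m * p_t.data + (1 - m) * p_s.data
----------------------------------------------------------------------------------------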
[2025-08-15 05:14:18,140][WARNING] /usr/local/lib/python3.12/dist-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py:190: .fit(ckpt_path="last") is set, but there is no last checkpoint available. No checkpoint will be loaded. HINT: Set `ModelCheckpoint(..., save_last=True)`.
[2025-08-15 05:14:18,141][INFO] You are using a CUDA device ('NVIDIA GeForce RTX 5090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
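To act on the Tensor Core hint above, the matmul precision is set once before training starts; a one-line sketch:
----------------------------------------------------------------------------------------
import torch

# Trade float32 matmul precision for Tensor Core throughput (per the hint above).
torch.set_float32_matmul_precision('high')
----------------------------------------------------------------------------------------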
[2025-08-15 05:14:18,297][INFO] LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[2025-08-15 05:14:18,331][INFO] Loading `train_dataloader` to estimate number of stepping batches.
[2025-08-15 05:14:19,711][INFO]
  | Name                    | Type               | Params | Mode
-----------------------------------------------------------------------
0 | teacher_embedding_model | EmbeddingModel     | 28.3 M | train
1 | teacher_projection_head | DINOProjectionHead | 6.6 M  | train
2 | student_embedding_model | EmbeddingModel     | 28.3 M | train
3 | student_projection_head | DINOProjectionHead | 6.6 M  | train
4 | flatten                 | Flatten            | 0      | train
5 | criterion               | DINOLoss           | 0      | train
-----------------------------------------------------------------------
69.8 M Trainable params
2.0 K Non-trainable params
69.8 M Total params
279.101 Total estimated model params size (MB)
520 Modules in train mode
0 Modules in eval mode
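The criterion row in the summary above is the DINO cross-entropy between sharpened teacher and student distributions, with a running center subtracted from the teacher logits (center_momentum=0.9 in the resolved config). A minimal, self-contained sketch of that loss, following the formulation in the DINO paper rather than lightly's exact DINOLoss:
----------------------------------------------------------------------------------------
import torch
import torch.nn.functional as F

def dino_loss(student_out, teacher_out, center,
              student_temp=0.1, teacher_temp=0.02, center_momentum=0.9):
    # Teacher: center, sharpen with a low temperature, no gradient.
    t = F.softmax((teacher_out - center) / teacher_temp, dim=-1).detach()
    # Student: log-probabilities at a higher temperature.
    s = F.log_softmax(student_out / student_temp, dim=-1)
    loss = -(t * s).sum(dim=-1).mean()
    # EMA update of the center from the teacher batch statistics.
    new_center = center_momentum * center + (1 - center_momentum) * teacher_out.mean(dim=0)
    return loss, new_center

center = torch.zeros(1024)  # output_dim from method_args
loss, center = dino_loss(torch.randn(32, 1024), torch.randn(32, 1024), center)
----------------------------------------------------------------------------------------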
[2025-08-15 11:33:38,060][INFO] `Trainer.fit` stopped: `max_epochs=300` reached.
[2025-08-15 11:33:39,373][INFO] Training completed.
[2025-08-15 11:33:39,374][DEBUG] Exporting model to '/workspace/ent-labotary/outputs/ssl_dino/swin_tiny/exported_models/exported_last.pt' in format 'ModelFormat.PACKAGE_DEFAULT'.
[2025-08-15 11:33:39,460][INFO] Example: How to use the exported model
----------------------------------------------------------------------------------------
import timm

# Load the pretrained model
model = timm.create_model(
    model_name='swin_s3_tiny_224',
    checkpoint_path='/workspace/ent-labotary/outputs/ssl_dino/swin_tiny/exported_models/exported_last.pt',
)

# Finetune or evaluate the model
...
----------------------------------------------------------------------------------------
[2025-08-15 11:33:39,460][INFO] Model exported.
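As a follow-up to the export example above, the same checkpoint can be fine-tuned by attaching a fresh classification head; a hedged sketch (the num_classes value is a placeholder for a downstream task):
----------------------------------------------------------------------------------------
import timm
import torch

# Load the exported backbone exactly as in the example above.
model = timm.create_model(
    model_name='swin_s3_tiny_224',
    checkpoint_path='/workspace/ent-labotary/outputs/ssl_dino/swin_tiny/exported_models/exported_last.pt',
)
model.reset_classifier(num_classes=2)  # placeholder head for a downstream task

# Sanity check: one forward pass at the pretraining resolution.
logits = model(torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 2])
----------------------------------------------------------------------------------------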