ent/ssl_dino/swin_tiny/train.log
[2025-08-15 05:14:14,328][INFO] Args: {
  "accelerator": "auto",
  "batch_size": 32,
  "callbacks": null,
  "checkpoint": null,
  "data": "data/kyucapsule",
  "devices": "auto",
  "embed_dim": null,
  "epochs": 300,
  "loader_args": null,
  "loggers": {
    "wandb": {
      "project": "ent-endoscopy-ssl"
    }
  },
  "method": "dino",
  "method_args": null,
  "model": "SwinTransformer",
  "model_args": null,
  "num_nodes": 1,
  "num_workers": 64,
  "optim": "auto",
  "optim_args": null,
  "out": "outputs/ssl_dino/swin_tiny",
  "overwrite": true,
  "precision": "32-true",
  "resume": true,
  "seed": 0,
  "strategy": "auto",
  "trainer_args": null,
  "transform_args": {
    "image_size": [
      224,
      224
    ],
    "local_view": {
      "num_views": 0
    }
  }
}
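The "Args" JSON above mirrors the keyword arguments of LightlyTrain's train() entry point. A minimal sketch of the call that would produce this run, assuming lightly_train 0.6.1 and that train() accepts these kwargs exactly as logged:
----------------------------------------------------------------------------------------
import lightly_train

# Sketch: reproduce the logged run (kwargs mirror the "Args" JSON above).
lightly_train.train(
    out="outputs/ssl_dino/swin_tiny",       # output directory
    data="data/kyucapsule",                 # folder of unlabeled images
    model="SwinTransformer",                # backbone, as logged
    method="dino",                          # self-supervised method
    epochs=300,
    batch_size=32,
    num_workers=64,
    loggers={"wandb": {"project": "ent-endoscopy-ssl"}},
    transform_args={"image_size": (224, 224), "local_view": {"num_views": 0}},
    overwrite=True,
    resume=True,
    seed=0,
)
----------------------------------------------------------------------------------------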
[2025-08-15 05:14:14,328][INFO] Using output directory '/workspace/ent-labotary/outputs/ssl_dino/swin_tiny'.
[2025-08-15 05:14:14,563][DEBUG] '/usr/local/lib/python3.12/dist-packages/lightly_train' is not a git repository.
[2025-08-15 05:14:14,573][DEBUG] Platform: Linux-6.5.0-18-generic-x86_64-with-glibc2.39
[2025-08-15 05:14:14,574][DEBUG] Python: 3.12.3
[2025-08-15 05:14:14,574][DEBUG] LightlyTrain: 0.6.1
[2025-08-15 05:14:14,574][DEBUG] LightlyTrain Git Information:
[2025-08-15 05:14:14,574][DEBUG] LightlyTrain is not installed from a git repository.
[2025-08-15 05:14:14,574][DEBUG] Run directory Git Information:
[2025-08-15 05:14:14,574][DEBUG] Branch: feat/hfup
[2025-08-15 05:14:14,574][DEBUG] Commit: a47f8b8122fc2efd40b0fdb826c07983a260b074
[2025-08-15 05:14:14,574][DEBUG] Uncommitted changes: ?? src/experiment/.ssl_dino.py.swp
[2025-08-15 05:14:14,574][DEBUG] Dependencies:
[2025-08-15 05:14:14,574][DEBUG] - torch 2.8.0
[2025-08-15 05:14:14,574][DEBUG] - torchvision 0.23.0
[2025-08-15 05:14:14,574][DEBUG] - pytorch-lightning 2.5.3
[2025-08-15 05:14:14,574][DEBUG] - Pillow 11.3.0
[2025-08-15 05:14:14,574][DEBUG] - pillow-simd x
[2025-08-15 05:14:14,574][DEBUG] Optional dependencies:
[2025-08-15 05:14:14,574][DEBUG] - super-gradients x
[2025-08-15 05:14:14,574][DEBUG] - timm 1.0.19
[2025-08-15 05:14:14,574][DEBUG] - ultralytics x
[2025-08-15 05:14:14,574][DEBUG] - wandb 0.21.1
[2025-08-15 05:14:14,574][DEBUG] CPUs: 384
[2025-08-15 05:14:14,574][DEBUG] GPUs: 1
[2025-08-15 05:14:14,574][DEBUG] - NVIDIA GeForce RTX 5090 12.0 (33669251072)
[2025-08-15 05:14:14,574][DEBUG] Environment variables:
[2025-08-15 05:14:14,575][DEBUG] Getting transform args for method 'dino'.
[2025-08-15 05:14:14,575][DEBUG] Using additional transform arguments {'image_size': (224, 224), 'local_view': {'num_views': 0}}.
[2025-08-15 05:14:14,576][DEBUG] Getting transform for method 'dino'.
[2025-08-15 05:14:14,582][DEBUG] Making sure data directory '/workspace/ent-labotary/data/kyucapsule' exists and is not empty.
[2025-08-15 05:14:14,582][INFO] Initializing dataset from '/workspace/ent-labotary/data/kyucapsule'.
[2025-08-15 05:14:14,582][DEBUG] Writing filenames to '/tmp/tmp1kte4dge' (chunk_size=10000)
[2025-08-15 05:14:15,124][DEBUG] Creating memory mapped sequence with 18481 'filenames'.
[2025-08-15 05:14:15,124][DEBUG] Found dataset size 18481.
[2025-08-15 05:14:15,124][DEBUG] Getting embedding model with embedding dimension None.
[2025-08-15 05:14:15,125][DEBUG] Using jsonl logger with args flush_logs_every_n_steps=100
[2025-08-15 05:14:15,126][DEBUG] Using tensorboard logger with args name='' version='' log_graph=False default_hp_metric=True prefix='' sub_dir=None
[2025-08-15 05:14:15,127][DEBUG] Using wandb logger with args name=None version=None offline=False anonymous=None project='ent-endoscopy-ssl' log_model=False prefix='' checkpoint_name=None
[2025-08-15 05:14:15,127][DEBUG] Using loggers ['JSONLLogger', 'TensorBoardLogger', 'WandbLogger'].
[2025-08-15 05:14:15,128][DEBUG] Getting accelerator for 'auto'.
[2025-08-15 05:14:15,128][DEBUG] CUDA is available, defaulting to CUDA.
[2025-08-15 05:14:15,128][DEBUG] Detected 1 devices.
[2025-08-15 05:14:15,128][DEBUG] Using strategy 'auto'.
[2025-08-15 05:14:15,128][DEBUG] Getting trainer.
[2025-08-15 05:14:15,128][DEBUG] Using sync_batchnorm 'True'.
[2025-08-15 05:14:15,140][INFO] GPU available: True (cuda), used: True
[2025-08-15 05:14:15,141][INFO] TPU available: False, using: 0 TPU cores
[2025-08-15 05:14:15,141][INFO] HPU available: False, using: 0 HPUs
[2025-08-15 05:14:15,141][DEBUG] Detected 1 nodes and 1 devices per node.
[2025-08-15 05:14:15,141][DEBUG] Total number of devices: 1.
[2025-08-15 05:14:15,141][DEBUG] Detected dataset size 18481.
[2025-08-15 05:14:15,141][DEBUG] Using batch size per device 32.
[2025-08-15 05:14:15,141][DEBUG] Using optimizer 'OptimizerType.SGD'.
[2025-08-15 05:14:15,141][DEBUG] Getting method args for 'DINO'
[2025-08-15 05:14:15,142][DEBUG] Getting method for 'DINO'
[2025-08-15 05:14:15,194][WARNING] /usr/local/lib/python3.12/dist-packages/torch/nn/utils/weight_norm.py:144: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`.
WeightNorm.apply(module, name, dim)
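The FutureWarning above is triggered by the weight-normalized last layer of the DINO projection head. A minimal sketch of the replacement PyTorch suggests, using the parametrization-based API:
----------------------------------------------------------------------------------------
import torch
from torch.nn.utils.parametrizations import weight_norm

# Deprecated: torch.nn.utils.weight_norm(linear)
# Replacement: the parametrization-based API with the same call shape.
linear = weight_norm(torch.nn.Linear(256, 1024, bias=False))
print(linear.parametrizations.weight.original0.shape)  # magnitude term (g)
----------------------------------------------------------------------------------------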
[2025-08-15 05:14:15,783][INFO] Resolved configuration:
{
  "accelerator": "CUDAAccelerator",
  "batch_size": 32,
  "callbacks": {
    "device_stats_monitor": {},
    "early_stopping": {
      "check_finite": true,
      "monitor": "train_loss",
      "patience": 1000000000000
    },
    "learning_rate_monitor": {},
    "model_checkpoint": {
      "enable_version_counter": false,
      "every_n_epochs": null,
      "save_last": true,
      "save_top_k": 1
    }
  },
  "checkpoint": null,
  "data": "data/kyucapsule",
  "devices": 1,
  "embed_dim": null,
  "epochs": 300,
  "loader_args": null,
  "loggers": {
    "jsonl": {
      "flush_logs_every_n_steps": 100
    },
    "tensorboard": {
      "default_hp_metric": true,
      "log_graph": false,
      "name": "",
      "prefix": "",
      "sub_dir": null,
      "version": ""
    },
    "wandb": {
      "anonymous": null,
      "checkpoint_name": null,
      "log_model": false,
      "name": null,
      "offline": false,
      "prefix": "",
      "project": "ent-endoscopy-ssl",
      "version": null
    }
  },
  "method": "dino",
  "method_args": {
    "batch_norm": false,
    "bottleneck_dim": 256,
    "center_momentum": 0.9,
    "hidden_dim": 2048,
    "momentum_end": 1.0,
    "momentum_start": 0.99,
    "norm_last_layer": true,
    "output_dim": 1024,
    "student_freeze_last_layer_epochs": 1,
    "student_temp": 0.1,
    "teacher_temp": 0.02,
    "warmup_teacher_temp": 0.02,
    "warmup_teacher_temp_epochs": 30,
    "weight_decay_end": 0.0001,
    "weight_decay_start": 0.0001
  },
  "model": "SwinTransformer",
  "model_args": null,
  "num_nodes": 1,
  "num_workers": 64,
  "optim": "sgd",
  "optim_args": {
    "lr": 0.03,
    "momentum": 0.9,
    "weight_decay": 0.0001
  },
  "out": "outputs/ssl_dino/swin_tiny",
  "overwrite": true,
  "precision": "32-true",
  "resume": true,
  "seed": 0,
  "strategy": "SingleDeviceStrategy",
  "trainer_args": null,
  "transform_args": {
    "color_jitter": {
      "brightness": 0.8,
      "contrast": 0.8,
      "hue": 0.2,
      "prob": 0.8,
      "saturation": 0.4,
      "strength": 0.5
    },
    "gaussian_blur": {
      "blur_limit": 0,
      "prob": 1.0,
      "sigmas": [
        0.1,
        2.0
      ]
    },
    "global_view_1": {
      "gaussian_blur": {
        "blur_limit": 0,
        "prob": 0.1,
        "sigmas": [
          0.1,
          2.0
        ]
      },
      "solarize": {
        "prob": 0.2,
        "threshold": 0.5
      }
    },
    "image_size": [
      224,
      224
    ],
    "local_view": {
      "gaussian_blur": {
        "blur_limit": 0,
        "prob": 0.5,
        "sigmas": [
          0.1,
          2.0
        ]
      },
      "num_views": 0,
      "random_resize": {
        "max_scale": 0.14,
        "min_scale": 0.05
      },
      "view_size": [
        96,
        96
      ]
    },
    "normalize": {
      "mean": [
        0.485,
        0.456,
        0.406
      ],
      "std": [
        0.229,
        0.224,
        0.225
      ]
    },
    "random_flip": {
      "horizontal_prob": 0.5,
      "vertical_prob": 0.0
    },
    "random_gray_scale": 0.2,
    "random_resize": {
      "max_scale": 1.0,
      "min_scale": 0.14
    },
    "random_rotation": null,
    "solarize": null
  }
}
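The method_args above encode DINO's three schedules: the teacher is an exponential moving average of the student with momentum ramped from momentum_start (0.99) to momentum_end (1.0), weight decay runs from weight_decay_start to weight_decay_end (constant at 1e-4 here), and the teacher temperature warms up over warmup_teacher_temp_epochs (a no-op here, since warmup and final temperature are both 0.02). A sketch of the ramp, assuming the cosine form from the original DINO recipe rather than lightly's exact implementation:
----------------------------------------------------------------------------------------
import math

def cosine_schedule(step, total_steps, start, end):
    # Cosine ramp from `start` to `end`, as used for DINO's EMA momentum.
    progress = step / max(1, total_steps)
    return end + (start - end) * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 300 * (18481 // 32)                      # epochs * steps per epoch (from this log)
print(cosine_schedule(0, total, 0.99, 1.0))      # 0.99 at the first step
print(cosine_schedule(total, total, 0.99, 1.0))  # 1.0 at the last step

# Teacher EMA update at momentum m (sketch):
# for p_t, p_s in zip(teacher.parameters(), student.parameters()):
#     p_t.data = m * p_t.data + (1 - m) * p_s.data
----------------------------------------------------------------------------------------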
[2025-08-15 05:14:18,140][WARNING] /usr/local/lib/python3.12/dist-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py:190: .fit(ckpt_path="last") is set, but there is no last checkpoint available. No checkpoint will be loaded. HINT: Set `ModelCheckpoint(..., save_last=True)`.
[2025-08-15 05:14:18,141][INFO] You are using a CUDA device ('NVIDIA GeForce RTX 5090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
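To act on the Tensor Core hint above, the matmul precision is set once before training starts; a one-line sketch:
----------------------------------------------------------------------------------------
import torch

# Trade float32 matmul precision for Tensor Core throughput (per the hint above).
torch.set_float32_matmul_precision('high')
----------------------------------------------------------------------------------------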
[2025-08-15 05:14:18,297][INFO] LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[2025-08-15 05:14:18,331][INFO] Loading `train_dataloader` to estimate number of stepping batches.
[2025-08-15 05:14:19,711][INFO]
  | Name                    | Type               | Params | Mode
-----------------------------------------------------------------------
0 | teacher_embedding_model | EmbeddingModel     | 28.3 M | train
1 | teacher_projection_head | DINOProjectionHead | 6.6 M  | train
2 | student_embedding_model | EmbeddingModel     | 28.3 M | train
3 | student_projection_head | DINOProjectionHead | 6.6 M  | train
4 | flatten                 | Flatten            | 0      | train
5 | criterion               | DINOLoss           | 0      | train
-----------------------------------------------------------------------
69.8 M Trainable params
2.0 K Non-trainable params
69.8 M Total params
279.101 Total estimated model params size (MB)
520 Modules in train mode
0 Modules in eval mode
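The criterion row in the summary above is the DINO cross-entropy between sharpened teacher and student distributions, with a running center subtracted from the teacher logits (center_momentum=0.9 in the resolved config). A minimal, self-contained sketch of that loss, following the formulation in the DINO paper rather than lightly's exact DINOLoss:
----------------------------------------------------------------------------------------
import torch
import torch.nn.functional as F

def dino_loss(student_out, teacher_out, center,
              student_temp=0.1, teacher_temp=0.02, center_momentum=0.9):
    # Teacher: center, sharpen with a low temperature, no gradient.
    t = F.softmax((teacher_out - center) / teacher_temp, dim=-1).detach()
    # Student: log-probabilities at a higher temperature.
    s = F.log_softmax(student_out / student_temp, dim=-1)
    loss = -(t * s).sum(dim=-1).mean()
    # EMA update of the center from the teacher batch statistics.
    new_center = center_momentum * center + (1 - center_momentum) * teacher_out.mean(dim=0)
    return loss, new_center

center = torch.zeros(1024)  # output_dim from method_args
loss, center = dino_loss(torch.randn(32, 1024), torch.randn(32, 1024), center)
----------------------------------------------------------------------------------------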
[2025-08-15 11:33:38,060][INFO] `Trainer.fit` stopped: `max_epochs=300` reached.
[2025-08-15 11:33:39,373][INFO] Training completed.
[2025-08-15 11:33:39,374][DEBUG] Exporting model to '/workspace/ent-labotary/outputs/ssl_dino/swin_tiny/exported_models/exported_last.pt' in format 'ModelFormat.PACKAGE_DEFAULT'.
[2025-08-15 11:33:39,460][INFO] Example: How to use the exported model
----------------------------------------------------------------------------------------
import timm

# Load the pretrained model
model = timm.create_model(
    model_name='swin_s3_tiny_224',
    checkpoint_path='/workspace/ent-labotary/outputs/ssl_dino/swin_tiny/exported_models/exported_last.pt',
)

# Finetune or evaluate the model
...
----------------------------------------------------------------------------------------
[2025-08-15 11:33:39,460][INFO] Model exported.
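As a follow-up to the export example above, the same checkpoint can be fine-tuned by attaching a fresh classification head; a hedged sketch (the num_classes value is a placeholder for a downstream task):
----------------------------------------------------------------------------------------
import timm
import torch

# Load the exported backbone exactly as in the example above.
model = timm.create_model(
    model_name='swin_s3_tiny_224',
    checkpoint_path='/workspace/ent-labotary/outputs/ssl_dino/swin_tiny/exported_models/exported_last.pt',
)
model.reset_classifier(num_classes=2)  # placeholder head for a downstream task

# Sanity check: one forward pass at the pretraining resolution.
logits = model(torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 2])
----------------------------------------------------------------------------------------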