Upload vit-wee__baseline_seed=42 to labelmix_baseline_vit_wee_patch16_reg1_gap_256_imagenet-1k_seed42

Files changed (4)

labelmix_baseline_vit_wee_patch16_reg1_gap_256_imagenet-1k_seed42/README.md +64 -0
labelmix_baseline_vit_wee_patch16_reg1_gap_256_imagenet-1k_seed42/args.yaml +223 -0
labelmix_baseline_vit_wee_patch16_reg1_gap_256_imagenet-1k_seed42/model_best.pth.tar +3 -0
labelmix_baseline_vit_wee_patch16_reg1_gap_256_imagenet-1k_seed42/summary.csv +0 -0

labelmix_baseline_vit_wee_patch16_reg1_gap_256_imagenet-1k_seed42/README.md ADDED

	@@ -0,0 +1,64 @@

+---
+library_name: timm
+tags:
+  - image-classification
+  - labelmix
+---
+# labelmix/baseline_vit_wee_patch16_reg1_gap_256_imagenet-1k_seed42
+This model is part of the [KonstantinGarbers/labelmix](https://huggingface.co/KonstantinGarbers/labelmix) repository.
+## Hyperparameters
+| Key | Value |
+|-----|-------|
+| `model` | `vit_wee_patch16_reg1_gap_256` |
+| `dataset` | `hfds/ILSVRC/imagenet-1k` |
+| `seed` | `42` |
+| `labelmix_alpha_min` | `0.1` |
+| `labelmix_alpha_max` | `1.0` |
+| `labelmix_reverse` | `False` |
+| `labelmix_schedule` | `fixed` |
+| `labelmix_k_min` | `None` |
+| `labelmix_k_max` | `None` |
+| `labelmix_k_reverse` | `False` |
+| `labelmix_k_schedule` | `fixed` |
+| `labelmix_sampling_max_aspect` | `10.0` |
+| `labelmix_sampling_min_side_px` | `6` |
+| `experiment` | `vit-wee__baseline_seed=42` |
+| `labelmix` | `False` |
+| `labelmix_loss` | `soft_ce` |
+| `labelmix_mix_k` | `5` |
+## Best Validation Metrics
+| Metric | Value |
+|--------|-------|
+| `epoch` | `108` |
+| `step` | `136125` |
+| `train_loss` | `3.382702350616455` |
+| `train_cross_entropy` | `2.2727866172790527` |
+| `train_grad_norm` | `1.6432865858078003` |
+| `eval_loss` | `1.017465326423645` |
+| `eval_top1` | `76.31400005859375` |
+| `eval_top5` | `93.33000002929687` |
+| `eval_ece` | `9.331677436828613` |
+| `lr` | `9.4763947085907e-07` |
+## Run Status
+- **status**: finished
+- **finished**: True
+- **start_step**: 0
+- **wandb_id**: gw4vrr3w
+## Note on Epoch Arguments
+> **Disclaimer:** The epoch-related arguments in `args.yaml` (e.g. `epochs`, `warmup_epochs`) do **not** correspond to the number of epochs actually trained. Training duration is controlled by optimizer steps; see `num_steps` and `warmup_steps`. To convert steps to epochs, use:
+> ```
+> epochs = (num_steps + warmup_steps) / (int(balanced_mode) * num_classes / batch_size)
+> ```
+> where `batch_size` is the global batch size (default 1024), not the `batch_size: 512` entry stored in `args.yaml`.
+*Run name: `vit-wee__baseline_seed=42`*
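
As a worked check of the conversion formula in the README note, the sketch below plugs in this run's `args.yaml` values (`num_steps: 125000`, `warmup_steps: 12500`, `balanced_mode: '1280'`, `num_classes: 1000`) together with the global batch size of 1024 stated in the note. The function name `steps_to_epochs` is hypothetical and is not part of the training code.

```python
def steps_to_epochs(num_steps, warmup_steps, balanced_mode, num_classes,
                    global_batch_size=1024):
    """Apply the README formula. balanced_mode is stored as a string in args.yaml."""
    # Samples drawn per epoch in balanced mode, divided by samples per optimizer step.
    steps_per_epoch = int(balanced_mode) * num_classes / global_batch_size
    return (num_steps + warmup_steps) / steps_per_epoch


steps_per_epoch = int("1280") * 1000 // 1024        # 1250 optimizer steps per epoch
total_epochs = steps_to_epochs(125000, 12500, "1280", 1000)
best_epoch = 136125 // steps_per_epoch              # best step from the metrics table

print(total_epochs)  # 110.0
print(best_epoch)    # 108
```

Under these values one epoch is 1250 optimizer steps, so the run spans 110 epochs rather than the `epochs: 450` recorded in `args.yaml`. The best step 136125 falls in epoch 108 when counting from zero, which agrees with the `epoch` value in the metrics table.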

labelmix_baseline_vit_wee_patch16_reg1_gap_256_imagenet-1k_seed42/args.yaml ADDED

	@@ -0,0 +1,223 @@

+aa: rand-m6-inc1-mstd1.0-n3
+amp: true
+amp_dtype: bfloat16
+amp_impl: native
+aug_repeats: 0.0
+aug_splits: 0
+balanced_buffer: 4096
+balanced_buffer_steps: '4'
+balanced_cache_path: ''
+balanced_cache_threshold: 256
+balanced_cache_threshold_steps: '3'
+balanced_input_key: null
+balanced_mode: '1280'
+balanced_target_key: null
+batch_size: 512
+bce_loss: false
+bce_pos_weight: null
+bce_sum: false
+bce_target_thresh: null
+bn_eps: null
+bn_momentum: null
+channels_last: false
+check_resume: true
+check_resume_search_limit: 30
+checkpoint_hist: 10
+class_map: ''
+clip_grad: null
+clip_mode: norm
+color_jitter: null
+color_jitter_prob: null
+cooldown_epochs: 10
+cooldown_steps: 0
+crop_pct: 0.95
+cutmix: 1.0
+cutmix_minmax: null
+data: null
+data_dir: /dev/shm/imagenet-1k
+dataset: hfds/ILSVRC/imagenet-1k
+dataset_download: false
+dataset_trust_remote_code: false
+decay_epochs: 100
+decay_milestones:
+- 90
+- 180
+- 270
+decay_rate: 0.1
+decay_steps: 90
+device: cuda
+device_modules: null
+dist_bn: reduce
+distill_loss_weight: null
+drop: 0.0
+drop_block: null
+drop_connect: null
+drop_path: 0.2
+epoch_repeats: 0.0
+epochs: 450
+eval_metric: top1
+fast_norm: false
+force_cpu: false
+fuser: ''
+gaussian_blur_prob: null
+gp: null
+grad_accum_steps: 1
+grad_checkpointing: false
+grayscale_prob: null
+head_init_bias: null
+head_init_scale: 0.0
+hflip: 0.5
+img_size: 256
+in_chans: null
+initial_checkpoint: ''
+input_img_mode: null
+input_key: image
+input_size: null
+interpolation: ''
+jsd_loss: false
+kd_distill_type: logit
+kd_loss_type: kl
+kd_model_name: null
+kd_student_feature_dim: null
+kd_teacher_feature_dim: null
+kd_temperature: 4.0
+kd_token_distill_type: soft
+labelmix: false
+labelmix_alpha_max: 1.0
+labelmix_alpha_min: 0.1
+labelmix_k_max: null
+labelmix_k_min: null
+labelmix_k_reverse: false
+labelmix_k_schedule: fixed
+labelmix_k_total_epochs: null
+labelmix_k_warmup_epochs: 0
+labelmix_loss: soft_ce
+labelmix_mix_k: 5
+labelmix_producer_rank: -1
+labelmix_producer_workers: 0
+labelmix_reverse: false
+labelmix_sampling: false
+labelmix_sampling_bins: 16
+labelmix_sampling_low_watermark: 32
+labelmix_sampling_max_aspect: 10.0
+labelmix_sampling_max_attempts: 200
+labelmix_sampling_min_side_px: 6
+labelmix_sampling_pool_size: 128
+labelmix_schedule: fixed
+labelmix_step_mode: total
+labelmix_total_epochs: null
+labelmix_total_steps: null
+labelmix_warmup_steps: 0
+layer_decay: null
+layer_decay_min_scale: 0
+layer_decay_no_opt_scale: null
+loader_prefetch_factor: 2
+local_rank: 0
+log_dir: ''
+log_interval: 50
+log_wandb: true
+lr: null
+lr_base: 0.0015
+lr_base_scale: ''
+lr_base_size: 1024
+lr_cycle_decay: 0.5
+lr_cycle_limit: 1
+lr_cycle_mul: 1.0
+lr_k_decay: 1.0
+lr_noise: null
+lr_noise_pct: 0.67
+lr_noise_std: 1.0
+mean: null
+min_lr: 5.0e-07
+mixup: 0.8
+mixup_mode: batch
+mixup_off_epoch: 0
+mixup_off_step: 0
+mixup_prob: 1.0
+mixup_switch_prob: 0.5
+model: vit_wee_patch16_reg1_gap_256
+model_dtype: null
+model_ema: true
+model_ema_decay: 0.9998
+model_ema_force_cpu: false
+model_ema_warmup: true
+model_kwargs:
+  fix_init: true
+momentum: 0.9
+naflex_loader: false
+naflex_loss_scale: linear
+naflex_max_seq_len: 576
+naflex_patch_size_probs: null
+naflex_patch_sizes: null
+naflex_train_seq_lens:
+- 128
+- 256
+- 576
+- 784
+- 1024
+no_aug: false
+no_ddp_bb: false
+no_prefetcher: false
+no_resume_opt: false
+num_classes: 1000
+num_evals: 100
+num_logs: 1000
+num_saves: 20
+num_steps: 125000
+opt: nadamw
+opt_betas: null
+opt_eps: 1.0e-08
+opt_kwargs: {}
+patience_epochs: 10
+patience_steps: 10
+pin_mem: true
+pretrained: false
+pretrained_path: null
+ratio:
+- 0.75
+- 1.3333333333333333
+recount: 1
+recovery_interval: 5000
+remode: pixel
+reprob: 0.2
+resplit: false
+save_images: false
+scale:
+- 0.08
+- 1.0
+sched: cosine
+sched_on_updates: true
+seed: 42
+smoothing: 0.1
+split_bn: false
+start_epoch: null
+start_step: null
+std: null
+sync_bn: false
+synchronize_step: false
+target_key: label
+task_loss_weight: null
+torchcompile: inductor
+torchcompile_mode: null
+torchscript: false
+train_crop_mode: null
+train_interpolation: random
+train_num_samples: null
+train_split: train
+training_started_step: 2
+tta: 0
+use_multi_epochs_loader: false
+val_interval: 1
+val_num_samples: null
+val_split: validation
+validation_batch_size: null
+vflip: 0.0
+wandb_project: labelmix
+wandb_tags: []
+warmup_epochs: 20
+warmup_lr: 5.0e-07
+warmup_prefix: true
+warmup_steps: 12500
+weight_decay: 0.06
+worker_seeding: all
+workers: 8
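
The schedule settings above (`sched: cosine`, `warmup_prefix: true`, `lr_base: 0.0015`, `min_lr: 5.0e-07`, `num_steps: 125000`, `warmup_steps: 12500`) can be sanity-checked against the `lr` logged at the best step in the README. The sketch below is a consistency check under stated assumptions, not the training code: it assumes the textbook cosine decay form, that the cosine phase starts once the warmup prefix ends, and that the peak learning rate equals `lr_base` because the global batch size (1024, per the README) equals `lr_base_size`. The function name `cosine_lr` is hypothetical.

```python
import math


def cosine_lr(step, *, lr_peak, min_lr, num_steps, warmup_steps):
    """Cosine decay evaluated after a warmup prefix (assumed form)."""
    t = step - warmup_steps  # position inside the cosine phase
    return min_lr + 0.5 * (lr_peak - min_lr) * (1.0 + math.cos(math.pi * t / num_steps))


logged_lr = 9.4763947085907e-07  # `lr` row of the Best Validation Metrics table
lr_at_best = cosine_lr(
    136125,            # `step` row of the same table
    lr_peak=0.0015,    # lr_base, assuming no batch-size rescaling
    min_lr=5.0e-07,
    num_steps=125000,
    warmup_steps=12500,
)
print(math.isclose(lr_at_best, logged_lr, rel_tol=1e-4))
```

Under these assumptions the computed value agrees with the logged one to several significant digits. This supports reading `num_steps` as the length of the cosine phase and `warmup_steps` as a prefix added on top of it.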

labelmix_baseline_vit_wee_patch16_reg1_gap_256_imagenet-1k_seed42/model_best.pth.tar ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c290fb3ddcc3a0ec7068d7b5d2448e632f5c0507b1401d72ead6a7c7fce66e95
+size 215091547
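
The three lines above are a Git LFS pointer, not the checkpoint itself: they record the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of checking a downloaded `model_best.pth.tar` against this pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path in the usage note is an assumption.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:c290fb3ddcc3a0ec7068d7b5d2448e632f5c0507b1401d72ead6a7c7fce66e95
size 215091547
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file's byte size and hash both match the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first; skips hashing on a size mismatch
    h = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Hash in chunks so a ~215 MB checkpoint is never fully held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
```

After downloading the checkpoint, `matches_pointer("model_best.pth.tar", pointer)` should return `True` for an intact file. A `False` result points to a truncated or corrupted download.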

labelmix_baseline_vit_wee_patch16_reg1_gap_256_imagenet-1k_seed42/summary.csv ADDED

The diff for this file is too large to render.