Commit dee6e92 by tky823 (verified) · 1 Parent(s): 848183a

Upload folder using huggingface_hub

recipes/BirdCLEF2025/EfficientNetB0/exp/20250505-143631/log/20250505-143633/.hydra/config.yaml ADDED
@@ -0,0 +1,232 @@
1
+ system:
2
+ seed: 0
3
+ distributed:
4
+ enable: null
5
+ nodes: null
6
+ nproc_per_node: null
7
+ backend: null
8
+ init_method: null
9
+ rdzv_id: null
10
+ rdzv_backend: null
11
+ rdzv_endpoint: null
12
+ max_restarts: null
13
+ cudnn:
14
+ benchmark: true
15
+ deterministic: false
16
+ amp:
17
+ enable: false
18
+ dtype: null
19
+ accelerator: cuda
20
+ compile:
21
+ enable: null
22
+ kwargs: null
23
+ preprocess:
24
+ dump_format: birdclef2025
25
+ list_path: null
26
+ wav_dir: null
27
+ feature_dir: null
28
+ max_workers: null
29
+ max_shard_size: 1000000000
30
+ vad:
31
+ raw_root: null
32
+ trimmed_root: null
33
+ threshold: null
34
+ min_duration: 15
35
+ csv_path: ???
36
+ submission_path: ???
37
+ audio_root: ???
38
+ subset: ???
39
+ train_ratio: 0.8
40
+ data:
41
+ audio:
42
+ sample_rate: 32000
43
+ duration: 5
44
+ melspectrogram:
45
+ _target_: birdclef2025.transforms.birdclef.BirdCLEF2025BaselineMelSpectrogram
46
+ sample_rate: ${..audio.sample_rate}
47
+ n_fft: 1024
48
+ hop_length: 512
49
+ f_min: 20
50
+ f_max: 16000
51
+ pad: 0
52
+ n_mels: 128
53
+ window_fn:
54
+ _target_: torch.hann_window
55
+ _partial_: true
56
+ power: 1.0
57
+ normalized: false
58
+ wkwargs: null
59
+ center: true
60
+ pad_mode: constant
61
+ onesided: null
62
+ norm: slaney
63
+ mel_scale: slaney
64
+ take_log: true
65
+ freq_mask_param:
66
+ - 0.06
67
+ - 0.1
68
+ time_mask_param:
69
+ - 0.06
70
+ - 0.12
71
+ eps: null
72
+ train:
73
+ dataset:
74
+ train:
75
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
76
+ list_path: dump/birdclef2025_reshape_fft1024_5s/list/train.txt
77
+ feature_dir: /kaggle/input/birdclef-2025
78
+ audio_key: audio
79
+ sample_rate_key: sample_rate
80
+ label_name_key: primary_label
81
+ filename_key: filename
82
+ validation:
83
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
84
+ list_path: dump/birdclef2025_reshape_fft1024_5s/list/validation.txt
85
+ feature_dir: /kaggle/input/birdclef-2025
86
+ audio_key: ${..train.audio_key}
87
+ sample_rate_key: ${..train.sample_rate_key}
88
+ label_name_key: ${..train.label_name_key}
89
+ filename_key: ${..train.filename_key}
90
+ dataloader:
91
+ train:
92
+ _target_: torch.utils.data.DataLoader
93
+ batch_size: 64
94
+ shuffle: true
95
+ collate_fn:
96
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineCollator
97
+ composer:
98
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025ReshapePrimaryLabelComposer
99
+ melspectrogram_transform: ${data.melspectrogram}
100
+ audio_key: audio
101
+ sample_rate_key: sample_rate
102
+ label_name_key: primary_label
103
+ filename_key: filename
104
+ waveform_key: waveform
105
+ melspectrogram_key: log_melspectrogram
106
+ label_index_key: label_index
107
+ sample_rate: ${data.audio.sample_rate}
108
+ duration: ${data.audio.duration}
109
+ decode_audio_as_waveform: true
110
+ decode_audio_as_monoral: true
111
+ training: true
112
+ target_shape: 256
113
+ melspectrogram_key: ${.composer.melspectrogram_key}
114
+ label_index_key: ${.composer.label_index_key}
115
+ alpha: 0.4
116
+ num_workers: ${const:birdclef2025.utils.data.default_num_workers}
117
+ validation:
118
+ _target_: torch.utils.data.DataLoader
119
+ batch_size: 64
120
+ shuffle: false
121
+ collate_fn:
122
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineValidationCollator
123
+ composer:
124
+ _target_: ${....train.collate_fn.composer._target_}
125
+ melspectrogram_transform: ${....train.collate_fn.composer.melspectrogram_transform}
126
+ audio_key: ${....train.collate_fn.composer.audio_key}
127
+ sample_rate_key: ${....train.collate_fn.composer.sample_rate_key}
128
+ label_name_key: ${....train.collate_fn.composer.label_name_key}
129
+ filename_key: ${....train.collate_fn.composer.filename_key}
130
+ waveform_key: ${....train.collate_fn.composer.waveform_key}
131
+ melspectrogram_key: ${....train.collate_fn.composer.melspectrogram_key}
132
+ label_index_key: ${....train.collate_fn.composer.label_index_key}
133
+ sample_rate: ${....train.collate_fn.composer.sample_rate}
134
+ duration: ${....train.collate_fn.composer.duration}
135
+ decode_audio_as_waveform: ${....train.collate_fn.composer.decode_audio_as_waveform}
136
+ decode_audio_as_monoral: ${....train.collate_fn.composer.decode_audio_as_monoral}
137
+ training: false
138
+ target_shape: ${....train.collate_fn.composer.target_shape}
139
+ melspectrogram_key: ${...train.collate_fn.composer.melspectrogram_key}
140
+ label_index_key: ${...train.collate_fn.composer.label_index_key}
141
+ num_workers: ${const:birdclef2025.utils.data.default_num_workers}
142
+ clip_gradient: {}
143
+ record: {}
144
+ trainer:
145
+ _target_: birdclef2025.utils.driver.BaseTrainer
146
+ key_mapping:
147
+ train:
148
+ input:
149
+ input: ${....dataloader.train.collate_fn.composer.melspectrogram_key}
150
+ output: logit
151
+ validation: ${.train}
152
+ inference: ${.validation}
153
+ ddp_kwargs: null
154
+ resume:
155
+ continue_from: ''
156
+ output:
157
+ exp_dir: ./exp/20250505-143631
158
+ tensorboard_dir: ./tensorboard/20250505-143631
159
+ save_checkpoint:
160
+ iteration:
161
+ every: 10000
162
+ path: ${...exp_dir}/model/iteration{iteration}.pth
163
+ epoch:
164
+ every: 10
165
+ path: ${...exp_dir}/model/epoch{epoch}.pth
166
+ last:
167
+ path: ${...exp_dir}/model/last.pth
168
+ best_epoch:
169
+ path: ${...exp_dir}/model/best_epoch.pth
170
+ steps:
171
+ epochs: 10
172
+ iterations: null
173
+ lr_scheduler: epoch
174
+ test:
175
+ dataset:
176
+ test:
177
+ _target_: torch.utils.data.Dataset
178
+ dataloader:
179
+ test:
180
+ _target_: torch.utils.data.DataLoader
181
+ batch_size: 1
182
+ shuffle: false
183
+ key_mapping:
184
+ inference:
185
+ input: null
186
+ output: null
187
+ identifier: null
188
+ checkpoint: null
189
+ remove_weight_norm: null
190
+ output:
191
+ exp_dir: ./exp
192
+ inference_dir: ${.exp_dir}/inference
193
+ audio:
194
+ sample_rate: ${data.audio.sample_rate}
195
+ key_mapping:
196
+ inference:
197
+ output: null
198
+ reference: null
199
+ transforms:
200
+ inference:
201
+ output: null
202
+ reference: null
203
+ model:
204
+ _target_: birdclef2025.models.EfficientNetB0
205
+ weights: ${const:torchvision.models.EfficientNet_B0_Weights.IMAGENET1K_V1}
206
+ num_classes: ${const:birdclef2025.utils.data.birdclef.num_birdclef2025_primary_labels}
207
+ optimizer:
208
+ _target_: torch.optim.Adam
209
+ lr_scheduler: {}
210
+ criterion:
211
+ _target_: audyn.criterion.MultiCriteria
212
+ cross_entropy:
213
+ _target_: audyn.criterion.BaseCriterionWrapper
214
+ criterion:
215
+ _target_: torch.nn.CrossEntropyLoss
216
+ reduction: mean
217
+ weight: 1
218
+ key_mapping:
219
+ estimated:
220
+ input: logit
221
+ target:
222
+ target: ${train.dataloader.train.collate_fn.composer.label_index_key}
223
+ metrics:
224
+ roc_auc:
225
+ metric:
226
+ _target_: birdclef2025.metrics.ROCAUC
227
+ take_softmax: true
228
+ key_mapping:
229
+ estimated:
230
+ input: logit
231
+ target:
232
+ target: ${train.dataloader.train.collate_fn.composer.label_index_key}
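The `data` section of this config fixes the audio and STFT parameters (`sample_rate: 32000`, `duration: 5`, `n_fft: 1024`, `hop_length: 512`, `center: true`, `n_mels: 128`). As a rough sketch, assuming the transform follows the usual centered-STFT framing convention (this commit does not show the implementation of `BirdCLEF2025BaselineMelSpectrogram`), the feature size of one clip before any reshaping to `target_shape` could be estimated as below. The helper name `melspectrogram_shape` is hypothetical and exists only for this illustration.

```python
def melspectrogram_shape(sample_rate, duration, n_fft, hop_length, n_mels, center=True):
    """Estimate (n_mels, n_frames, n_freq_bins) for one clip.

    Assumes the common STFT convention: with center=True the signal is padded
    by n_fft // 2 on both sides, giving 1 + num_samples // hop_length frames.
    """
    num_samples = sample_rate * duration
    if center:
        n_frames = 1 + num_samples // hop_length
    else:
        n_frames = 1 + (num_samples - n_fft) // hop_length
    # A one-sided spectrum of a real signal has n_fft // 2 + 1 frequency bins.
    n_freq_bins = n_fft // 2 + 1
    return n_mels, n_frames, n_freq_bins


# Values copied from the `data` section of config.yaml.
shape = melspectrogram_shape(
    sample_rate=32000, duration=5, n_fft=1024, hop_length=512, n_mels=128
)
print(shape)
```

Under these assumptions a 5-second clip has 160000 samples, so the mel filterbank maps 513 linear-frequency bins to 128 mel bins over 313 frames.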
recipes/BirdCLEF2025/EfficientNetB0/exp/20250505-143631/log/20250505-143633/.hydra/hydra.yaml ADDED
@@ -0,0 +1,191 @@
1
+ hydra:
2
+ run:
3
+ dir: ./exp/20250505-143631/log/20250505-143633
4
+ sweep:
5
+ dir: multirun/${now:%Y-%m-%d}/${now:%H-%M-%S}
6
+ subdir: ${hydra.job.num}
7
+ launcher:
8
+ _target_: hydra._internal.core_plugins.basic_launcher.BasicLauncher
9
+ sweeper:
10
+ _target_: hydra._internal.core_plugins.basic_sweeper.BasicSweeper
11
+ max_batch_size: null
12
+ params: null
13
+ help:
14
+ app_name: ${hydra.job.name}
15
+ header: '${hydra.help.app_name} is powered by Hydra.
16
+
17
+ '
18
+ footer: 'Powered by Hydra (https://hydra.cc)
19
+
20
+ Use --hydra-help to view Hydra specific help
21
+
22
+ '
23
+ template: '${hydra.help.header}
24
+
25
+ == Configuration groups ==
26
+
27
+ Compose your configuration from those groups (group=option)
28
+
29
+
30
+ $APP_CONFIG_GROUPS
31
+
32
+
33
+ == Config ==
34
+
35
+ Override anything in the config (foo.bar=value)
36
+
37
+
38
+ $CONFIG
39
+
40
+
41
+ ${hydra.help.footer}
42
+
43
+ '
44
+ hydra_help:
45
+ template: 'Hydra (${hydra.runtime.version})
46
+
47
+ See https://hydra.cc for more info.
48
+
49
+
50
+ == Flags ==
51
+
52
+ $FLAGS_HELP
53
+
54
+
55
+ == Configuration groups ==
56
+
57
+ Compose your configuration from those groups (For example, append hydra/job_logging=disabled
58
+ to command line)
59
+
60
+
61
+ $HYDRA_CONFIG_GROUPS
62
+
63
+
64
+ Use ''--cfg hydra'' to Show the Hydra config.
65
+
66
+ '
67
+ hydra_help: ???
68
+ hydra_logging:
69
+ version: 1
70
+ formatters:
71
+ simple:
72
+ format: '[%(asctime)s][HYDRA] %(message)s'
73
+ handlers:
74
+ console:
75
+ class: logging.StreamHandler
76
+ formatter: simple
77
+ stream: ext://sys.stdout
78
+ root:
79
+ level: INFO
80
+ handlers:
81
+ - console
82
+ loggers:
83
+ logging_example:
84
+ level: DEBUG
85
+ disable_existing_loggers: false
86
+ job_logging:
87
+ version: 1
88
+ formatters:
89
+ simple:
90
+ format: '[%(asctime)s][%(name)s][%(levelname)s] - %(message)s'
91
+ handlers:
92
+ console:
93
+ class: logging.StreamHandler
94
+ formatter: simple
95
+ stream: ext://sys.stdout
96
+ file:
97
+ class: logging.FileHandler
98
+ formatter: simple
99
+ filename: ${hydra.runtime.output_dir}/${hydra.job.name}.log
100
+ root:
101
+ level: INFO
102
+ handlers:
103
+ - console
104
+ - file
105
+ disable_existing_loggers: false
106
+ env: {}
107
+ mode: RUN
108
+ searchpath: []
109
+ callbacks: {}
110
+ output_subdir: .hydra
111
+ overrides:
112
+ hydra:
113
+ - hydra.run.dir=./exp/20250505-143631/log/20250505-143633
114
+ - hydra.mode=RUN
115
+ task:
116
+ - system=cuda
117
+ - preprocess=birdclef2025
118
+ - data=birdclef2025_reshape_fft1024_5s
119
+ - train=birdclef2025_reshape_efficientnet_b0
120
+ - model=birdclef2025_efficientnet_b0
121
+ - optimizer=adam
122
+ - lr_scheduler=none
123
+ - criterion=birdclef2025_categorical_cross_entropy
124
+ - +metrics=birdclef2025_categorical_cross_entropy
125
+ - preprocess.dump_format=birdclef2025
126
+ - train.dataset.train.list_path=dump/birdclef2025_reshape_fft1024_5s/list/train.txt
127
+ - train.dataset.train.feature_dir=/kaggle/input/birdclef-2025
128
+ - train.dataset.validation.list_path=dump/birdclef2025_reshape_fft1024_5s/list/validation.txt
129
+ - train.dataset.validation.feature_dir=/kaggle/input/birdclef-2025
130
+ - train.resume.continue_from=
131
+ - train.output.exp_dir=./exp/20250505-143631
132
+ - train.output.tensorboard_dir=./tensorboard/20250505-143631
133
+ job:
134
+ name: train
135
+ chdir: false
136
+ override_dirname: +metrics=birdclef2025_categorical_cross_entropy,criterion=birdclef2025_categorical_cross_entropy,data=birdclef2025_reshape_fft1024_5s,lr_scheduler=none,model=birdclef2025_efficientnet_b0,optimizer=adam,preprocess.dump_format=birdclef2025,preprocess=birdclef2025,system=cuda,train.dataset.train.feature_dir=/kaggle/input/birdclef-2025,train.dataset.train.list_path=dump/birdclef2025_reshape_fft1024_5s/list/train.txt,train.dataset.validation.feature_dir=/kaggle/input/birdclef-2025,train.dataset.validation.list_path=dump/birdclef2025_reshape_fft1024_5s/list/validation.txt,train.output.exp_dir=./exp/20250505-143631,train.output.tensorboard_dir=./tensorboard/20250505-143631,train.resume.continue_from=,train=birdclef2025_reshape_efficientnet_b0
137
+ id: ???
138
+ num: ???
139
+ config_name: config
140
+ env_set: {}
141
+ env_copy: []
142
+ config:
143
+ override_dirname:
144
+ kv_sep: '='
145
+ item_sep: ','
146
+ exclude_keys: []
147
+ runtime:
148
+ version: 1.3.2
149
+ version_base: '1.2'
150
+ cwd: /kaggle/working/BirdCLEF2025/recipes/BirdCLEF2025/EfficientNetB0
151
+ config_sources:
152
+ - path: hydra.conf
153
+ schema: pkg
154
+ provider: hydra
155
+ - path: /usr/local/lib/python3.10/dist-packages/audyn/configs
156
+ schema: file
157
+ provider: main
158
+ - path: /kaggle/working/BirdCLEF2025/recipes/BirdCLEF2025/EfficientNetB0/conf
159
+ schema: file
160
+ provider: command-line
161
+ - path: ''
162
+ schema: structured
163
+ provider: schema
164
+ output_dir: /kaggle/working/BirdCLEF2025/recipes/BirdCLEF2025/EfficientNetB0/exp/20250505-143631/log/20250505-143633
165
+ choices:
166
+ metrics: birdclef2025_categorical_cross_entropy
167
+ criterion: birdclef2025_categorical_cross_entropy
168
+ lr_scheduler: none
169
+ optimizer: adam
170
+ model: birdclef2025_efficientnet_b0
171
+ test: default
172
+ test/dataloader: default
173
+ test/dataset: default
174
+ train: birdclef2025_reshape_efficientnet_b0
175
+ train/record: default
176
+ train/clip_gradient: default
177
+ train/dataloader: default
178
+ train/dataset: birdclef2025_primary-label
179
+ data: birdclef2025_reshape_fft1024_5s
180
+ preprocess: birdclef2025
181
+ system: cuda
182
+ hydra/env: default
183
+ hydra/callbacks: null
184
+ hydra/job_logging: default
185
+ hydra/hydra_logging: default
186
+ hydra/hydra_help: default
187
+ hydra/help: default
188
+ hydra/sweeper: basic
189
+ hydra/launcher: basic
190
+ hydra/output: default
191
+ verbose: false
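The `job.override_dirname` value in this file looks like the `overrides.task` list sorted and joined using the `kv_sep` and `item_sep` settings under `job.config.override_dirname`. The following is a minimal sketch of that relationship, not Hydra's own code; the function name `override_dirname` and its handling of `exclude_keys` are assumptions made for illustration.

```python
# Task overrides copied from the `overrides.task` list in hydra.yaml.
task_overrides = [
    "system=cuda",
    "preprocess=birdclef2025",
    "data=birdclef2025_reshape_fft1024_5s",
    "train=birdclef2025_reshape_efficientnet_b0",
    "model=birdclef2025_efficientnet_b0",
    "optimizer=adam",
    "lr_scheduler=none",
    "criterion=birdclef2025_categorical_cross_entropy",
    "+metrics=birdclef2025_categorical_cross_entropy",
    "preprocess.dump_format=birdclef2025",
    "train.dataset.train.list_path=dump/birdclef2025_reshape_fft1024_5s/list/train.txt",
    "train.dataset.train.feature_dir=/kaggle/input/birdclef-2025",
    "train.dataset.validation.list_path=dump/birdclef2025_reshape_fft1024_5s/list/validation.txt",
    "train.dataset.validation.feature_dir=/kaggle/input/birdclef-2025",
    "train.resume.continue_from=",
    "train.output.exp_dir=./exp/20250505-143631",
    "train.output.tensorboard_dir=./tensorboard/20250505-143631",
]


def override_dirname(overrides, kv_sep="=", item_sep=",", exclude_keys=()):
    """Join overrides in sorted order (hypothetical helper, not a Hydra API)."""
    kept = []
    for item in overrides:
        key, _, value = item.partition("=")
        # Strip the '+' / '~' prefixes before comparing against excluded keys.
        if key.lstrip("+~") in exclude_keys:
            continue
        kept.append(f"{key}{kv_sep}{value}")
    return item_sep.join(sorted(kept))


dirname = override_dirname(task_overrides)
print(dirname)
```

Plain string sorting places `+metrics=...` first and `train=...` last, because `+` and `.` sort before `=` in ASCII; this matches the order recorded in `override_dirname` above.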
recipes/BirdCLEF2025/EfficientNetB0/exp/20250505-143631/log/20250505-143633/.hydra/overrides.yaml ADDED
@@ -0,0 +1,17 @@
+ - system=cuda
+ - preprocess=birdclef2025
+ - data=birdclef2025_reshape_fft1024_5s
+ - train=birdclef2025_reshape_efficientnet_b0
+ - model=birdclef2025_efficientnet_b0
+ - optimizer=adam
+ - lr_scheduler=none
+ - criterion=birdclef2025_categorical_cross_entropy
+ - +metrics=birdclef2025_categorical_cross_entropy
+ - preprocess.dump_format=birdclef2025
+ - train.dataset.train.list_path=dump/birdclef2025_reshape_fft1024_5s/list/train.txt
+ - train.dataset.train.feature_dir=/kaggle/input/birdclef-2025
+ - train.dataset.validation.list_path=dump/birdclef2025_reshape_fft1024_5s/list/validation.txt
+ - train.dataset.validation.feature_dir=/kaggle/input/birdclef-2025
+ - train.resume.continue_from=
+ - train.output.exp_dir=./exp/20250505-143631
+ - train.output.tensorboard_dir=./tensorboard/20250505-143631
recipes/BirdCLEF2025/EfficientNetB0/exp/20250505-143631/log/20250505-143633/.hydra/resolved_config.yaml ADDED
@@ -0,0 +1,293 @@
1
+ system:
2
+ seed: 0
3
+ distributed:
4
+ enable: null
5
+ nodes: null
6
+ nproc_per_node: null
7
+ backend: null
8
+ init_method: null
9
+ rdzv_id: null
10
+ rdzv_backend: null
11
+ rdzv_endpoint: null
12
+ max_restarts: null
13
+ cudnn:
14
+ benchmark: true
15
+ deterministic: false
16
+ amp:
17
+ enable: false
18
+ dtype: null
19
+ accelerator: cuda
20
+ compile:
21
+ enable: false
22
+ kwargs: null
23
+ preprocess:
24
+ dump_format: birdclef2025
25
+ list_path: null
26
+ wav_dir: null
27
+ feature_dir: null
28
+ max_workers: 2
29
+ max_shard_size: 1000000000
30
+ vad:
31
+ raw_root: null
32
+ trimmed_root: null
33
+ threshold: null
34
+ min_duration: 15
35
+ csv_path: ???
36
+ submission_path: ???
37
+ audio_root: ???
38
+ subset: ???
39
+ train_ratio: 0.8
40
+ data:
41
+ audio:
42
+ sample_rate: 32000
43
+ duration: 5
44
+ melspectrogram:
45
+ _target_: birdclef2025.transforms.birdclef.BirdCLEF2025BaselineMelSpectrogram
46
+ sample_rate: 32000
47
+ n_fft: 1024
48
+ hop_length: 512
49
+ f_min: 20
50
+ f_max: 16000
51
+ pad: 0
52
+ n_mels: 128
53
+ window_fn:
54
+ _target_: torch.hann_window
55
+ _partial_: true
56
+ power: 1.0
57
+ normalized: false
58
+ wkwargs: null
59
+ center: true
60
+ pad_mode: constant
61
+ onesided: null
62
+ norm: slaney
63
+ mel_scale: slaney
64
+ take_log: true
65
+ freq_mask_param:
66
+ - 0.06
67
+ - 0.1
68
+ time_mask_param:
69
+ - 0.06
70
+ - 0.12
71
+ eps: null
72
+ train:
73
+ dataset:
74
+ train:
75
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
76
+ list_path: dump/birdclef2025_reshape_fft1024_5s/list/train.txt
77
+ feature_dir: /kaggle/input/birdclef-2025
78
+ audio_key: audio
79
+ sample_rate_key: sample_rate
80
+ label_name_key: primary_label
81
+ filename_key: filename
82
+ validation:
83
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
84
+ list_path: dump/birdclef2025_reshape_fft1024_5s/list/validation.txt
85
+ feature_dir: /kaggle/input/birdclef-2025
86
+ audio_key: audio
87
+ sample_rate_key: sample_rate
88
+ label_name_key: primary_label
89
+ filename_key: filename
90
+ dataloader:
91
+ train:
92
+ _target_: torch.utils.data.DataLoader
93
+ batch_size: 64
94
+ shuffle: true
95
+ collate_fn:
96
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineCollator
97
+ composer:
98
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025ReshapePrimaryLabelComposer
99
+ melspectrogram_transform:
100
+ _target_: birdclef2025.transforms.birdclef.BirdCLEF2025BaselineMelSpectrogram
101
+ sample_rate: 32000
102
+ n_fft: 1024
103
+ hop_length: 512
104
+ f_min: 20
105
+ f_max: 16000
106
+ pad: 0
107
+ n_mels: 128
108
+ window_fn:
109
+ _target_: torch.hann_window
110
+ _partial_: true
111
+ power: 1.0
112
+ normalized: false
113
+ wkwargs: null
114
+ center: true
115
+ pad_mode: constant
116
+ onesided: null
117
+ norm: slaney
118
+ mel_scale: slaney
119
+ take_log: true
120
+ freq_mask_param:
121
+ - 0.06
122
+ - 0.1
123
+ time_mask_param:
124
+ - 0.06
125
+ - 0.12
126
+ eps: null
127
+ audio_key: audio
128
+ sample_rate_key: sample_rate
129
+ label_name_key: primary_label
130
+ filename_key: filename
131
+ waveform_key: waveform
132
+ melspectrogram_key: log_melspectrogram
133
+ label_index_key: label_index
134
+ sample_rate: 32000
135
+ duration: 5
136
+ decode_audio_as_waveform: true
137
+ decode_audio_as_monoral: true
138
+ training: true
139
+ target_shape: 256
140
+ melspectrogram_key: log_melspectrogram
141
+ label_index_key: label_index
142
+ alpha: 0.4
143
+ num_workers: 2
144
+ validation:
145
+ _target_: torch.utils.data.DataLoader
146
+ batch_size: 64
147
+ shuffle: false
148
+ collate_fn:
149
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineValidationCollator
150
+ composer:
151
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025ReshapePrimaryLabelComposer
152
+ melspectrogram_transform:
153
+ _target_: birdclef2025.transforms.birdclef.BirdCLEF2025BaselineMelSpectrogram
154
+ sample_rate: 32000
155
+ n_fft: 1024
156
+ hop_length: 512
157
+ f_min: 20
158
+ f_max: 16000
159
+ pad: 0
160
+ n_mels: 128
161
+ window_fn:
162
+ _target_: torch.hann_window
163
+ _partial_: true
164
+ power: 1.0
165
+ normalized: false
166
+ wkwargs: null
167
+ center: true
168
+ pad_mode: constant
169
+ onesided: null
170
+ norm: slaney
171
+ mel_scale: slaney
172
+ take_log: true
173
+ freq_mask_param:
174
+ - 0.06
175
+ - 0.1
176
+ time_mask_param:
177
+ - 0.06
178
+ - 0.12
179
+ eps: null
180
+ audio_key: audio
181
+ sample_rate_key: sample_rate
182
+ label_name_key: primary_label
183
+ filename_key: filename
184
+ waveform_key: waveform
185
+ melspectrogram_key: log_melspectrogram
186
+ label_index_key: label_index
187
+ sample_rate: 32000
188
+ duration: 5
189
+ decode_audio_as_waveform: true
190
+ decode_audio_as_monoral: true
191
+ training: false
192
+ target_shape: 256
193
+ melspectrogram_key: log_melspectrogram
194
+ label_index_key: label_index
195
+ num_workers: 2
196
+ clip_gradient: {}
197
+ record: {}
198
+ trainer:
199
+ _target_: birdclef2025.utils.driver.BaseTrainer
200
+ key_mapping:
201
+ train:
202
+ input:
203
+ input: log_melspectrogram
204
+ output: logit
205
+ validation:
206
+ input:
207
+ input: log_melspectrogram
208
+ output: logit
209
+ inference:
210
+ input:
211
+ input: log_melspectrogram
212
+ output: logit
213
+ ddp_kwargs: null
214
+ resume:
215
+ continue_from: ''
216
+ output:
217
+ exp_dir: ./exp/20250505-143631
218
+ tensorboard_dir: ./tensorboard/20250505-143631
219
+ save_checkpoint:
220
+ iteration:
221
+ every: 10000
222
+ path: ./exp/20250505-143631/model/iteration{iteration}.pth
223
+ epoch:
224
+ every: 10
225
+ path: ./exp/20250505-143631/model/epoch{epoch}.pth
226
+ last:
227
+ path: ./exp/20250505-143631/model/last.pth
228
+ best_epoch:
229
+ path: ./exp/20250505-143631/model/best_epoch.pth
230
+ steps:
231
+ epochs: 10
232
+ iterations: null
233
+ lr_scheduler: epoch
234
+ test:
235
+ dataset:
236
+ test:
237
+ _target_: torch.utils.data.Dataset
238
+ dataloader:
239
+ test:
240
+ _target_: torch.utils.data.DataLoader
241
+ batch_size: 1
242
+ shuffle: false
243
+ key_mapping:
244
+ inference:
245
+ input: null
246
+ output: null
247
+ identifier: null
248
+ checkpoint: null
249
+ remove_weight_norm: null
250
+ output:
251
+ exp_dir: ./exp
252
+ inference_dir: ./exp/inference
253
+ audio:
254
+ sample_rate: 32000
255
+ key_mapping:
256
+ inference:
257
+ output: null
258
+ reference: null
259
+ transforms:
260
+ inference:
261
+ output: null
262
+ reference: null
263
+ ddp_kwargs: null
264
+ model:
265
+ _target_: birdclef2025.models.EfficientNetB0
266
+ weights: IMAGENET1K_V1
267
+ num_classes: 206
268
+ optimizer:
269
+ _target_: torch.optim.Adam
270
+ lr_scheduler: {}
271
+ criterion:
272
+ _target_: audyn.criterion.MultiCriteria
273
+ cross_entropy:
274
+ _target_: audyn.criterion.BaseCriterionWrapper
275
+ criterion:
276
+ _target_: torch.nn.CrossEntropyLoss
277
+ reduction: mean
278
+ weight: 1
279
+ key_mapping:
280
+ estimated:
281
+ input: logit
282
+ target:
283
+ target: label_index
284
+ metrics:
285
+ roc_auc:
286
+ metric:
287
+ _target_: birdclef2025.metrics.ROCAUC
288
+ take_softmax: true
289
+ key_mapping:
290
+ estimated:
291
+ input: logit
292
+ target:
293
+ target: label_index
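In the resolved config the checkpoint paths still contain `{epoch}` and `{iteration}` placeholders, because only the `${...}` interpolations were expanded. The sketch below shows how such templates could be filled in at save time. It assumes the placeholders are filled with `str.format` and that a checkpoint is written whenever the epoch is a multiple of `every`; neither behaviour is confirmed by this commit, and `epoch_checkpoints` is a hypothetical helper.

```python
# Values copied from train.output.save_checkpoint and train.steps in resolved_config.yaml.
save_checkpoint = {
    "iteration": {
        "every": 10000,
        "path": "./exp/20250505-143631/model/iteration{iteration}.pth",
    },
    "epoch": {
        "every": 10,
        "path": "./exp/20250505-143631/model/epoch{epoch}.pth",
    },
}
total_epochs = 10


def epoch_checkpoints(cfg, total_epochs):
    """List the per-epoch checkpoint paths written under the assumed policy."""
    every = cfg["epoch"]["every"]
    template = cfg["epoch"]["path"]
    return [
        template.format(epoch=epoch)
        for epoch in range(1, total_epochs + 1)
        if epoch % every == 0  # assumed rule: save on multiples of `every`
    ]


paths = epoch_checkpoints(save_checkpoint, total_epochs)
print(paths)
```

With `epochs: 10` and `every: 10`, this policy would write a single per-epoch file, alongside whatever the trainer saves to the `last` and `best_epoch` paths.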
recipes/BirdCLEF2025/EfficientNetB0/exp/20250505-143631/log/20250505-143633/train.log ADDED
@@ -0,0 +1,1043 @@
1
+ [2025-05-05 14:37:02,691][BaseTrainer][INFO] - system:
2
+ seed: 0
3
+ distributed:
4
+ enable: null
5
+ nodes: null
6
+ nproc_per_node: null
7
+ backend: null
8
+ init_method: null
9
+ rdzv_id: null
10
+ rdzv_backend: null
11
+ rdzv_endpoint: null
12
+ max_restarts: null
13
+ cudnn:
14
+ benchmark: true
15
+ deterministic: false
16
+ amp:
17
+ enable: false
18
+ dtype: null
19
+ accelerator: cuda
20
+ compile:
21
+ enable: false
22
+ kwargs: null
23
+ preprocess:
24
+ dump_format: birdclef2025
25
+ list_path: null
26
+ wav_dir: null
27
+ feature_dir: null
28
+ max_workers: 2
29
+ max_shard_size: 1000000000
30
+ vad:
31
+ raw_root: null
32
+ trimmed_root: null
33
+ threshold: null
34
+ min_duration: 15
35
+ csv_path: ???
36
+ submission_path: ???
37
+ audio_root: ???
38
+ subset: ???
39
+ train_ratio: 0.8
40
+ data:
41
+ audio:
42
+ sample_rate: 32000
43
+ duration: 5
44
+ melspectrogram:
45
+ _target_: birdclef2025.transforms.birdclef.BirdCLEF2025BaselineMelSpectrogram
46
+ sample_rate: 32000
47
+ n_fft: 1024
48
+ hop_length: 512
49
+ f_min: 20
50
+ f_max: 16000
51
+ pad: 0
52
+ n_mels: 128
53
+ window_fn:
54
+ _target_: torch.hann_window
55
+ _partial_: true
56
+ power: 1.0
57
+ normalized: false
58
+ wkwargs: null
59
+ center: true
60
+ pad_mode: constant
61
+ onesided: null
62
+ norm: slaney
63
+ mel_scale: slaney
64
+ take_log: true
65
+ freq_mask_param:
66
+ - 0.06
67
+ - 0.1
68
+ time_mask_param:
69
+ - 0.06
70
+ - 0.12
71
+ eps: null
72
+ train:
73
+ dataset:
74
+ train:
75
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
76
+ list_path: dump/birdclef2025_reshape_fft1024_5s/list/train.txt
77
+ feature_dir: /kaggle/input/birdclef-2025
78
+ audio_key: audio
79
+ sample_rate_key: sample_rate
80
+ label_name_key: primary_label
81
+ filename_key: filename
82
+ validation:
83
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
84
+ list_path: dump/birdclef2025_reshape_fft1024_5s/list/validation.txt
85
+ feature_dir: /kaggle/input/birdclef-2025
86
+ audio_key: ${..train.audio_key}
87
+ sample_rate_key: ${..train.sample_rate_key}
88
+ label_name_key: ${..train.label_name_key}
89
+ filename_key: ${..train.filename_key}
90
+ dataloader:
91
+ train:
92
+ _target_: torch.utils.data.DataLoader
93
+ batch_size: 64
94
+ shuffle: true
95
+ collate_fn:
96
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineCollator
97
+ composer:
98
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025ReshapePrimaryLabelComposer
99
+ melspectrogram_transform: ${data.melspectrogram}
100
+ audio_key: audio
101
+ sample_rate_key: sample_rate
102
+ label_name_key: primary_label
103
+ filename_key: filename
104
+ waveform_key: waveform
105
+ melspectrogram_key: log_melspectrogram
106
+ label_index_key: label_index
107
+ sample_rate: ${data.audio.sample_rate}
108
+ duration: ${data.audio.duration}
109
+ decode_audio_as_waveform: true
110
+ decode_audio_as_monoral: true
111
+ training: true
112
+ target_shape: 256
113
+ melspectrogram_key: ${.composer.melspectrogram_key}
114
+ label_index_key: ${.composer.label_index_key}
115
+ alpha: 0.4
116
+ num_workers: ${const:birdclef2025.utils.data.default_num_workers}
117
+ validation:
118
+ _target_: torch.utils.data.DataLoader
119
+ batch_size: 64
120
+ shuffle: false
121
+ collate_fn:
122
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineValidationCollator
123
+ composer:
124
+ _target_: ${....train.collate_fn.composer._target_}
125
+ melspectrogram_transform: ${....train.collate_fn.composer.melspectrogram_transform}
126
+ audio_key: ${....train.collate_fn.composer.audio_key}
127
+ sample_rate_key: ${....train.collate_fn.composer.sample_rate_key}
128
+ label_name_key: ${....train.collate_fn.composer.label_name_key}
129
+ filename_key: ${....train.collate_fn.composer.filename_key}
130
+ waveform_key: ${....train.collate_fn.composer.waveform_key}
131
+ melspectrogram_key: ${....train.collate_fn.composer.melspectrogram_key}
132
+ label_index_key: ${....train.collate_fn.composer.label_index_key}
133
+ sample_rate: ${....train.collate_fn.composer.sample_rate}
134
+ duration: ${....train.collate_fn.composer.duration}
135
+ decode_audio_as_waveform: ${....train.collate_fn.composer.decode_audio_as_waveform}
136
+ decode_audio_as_monoral: ${....train.collate_fn.composer.decode_audio_as_monoral}
137
+ training: false
138
+ target_shape: ${....train.collate_fn.composer.target_shape}
139
+ melspectrogram_key: ${...train.collate_fn.composer.melspectrogram_key}
140
+ label_index_key: ${...train.collate_fn.composer.label_index_key}
141
+ num_workers: ${const:birdclef2025.utils.data.default_num_workers}
142
+ clip_gradient: {}
143
+ record: {}
144
+ trainer:
145
+ _target_: birdclef2025.utils.driver.BaseTrainer
146
+ _partial_: true
147
+ key_mapping:
148
+ train:
149
+ input:
150
+ input: ${....dataloader.train.collate_fn.composer.melspectrogram_key}
151
+ output: logit
152
+ validation: ${.train}
153
+ inference: ${.validation}
154
+ ddp_kwargs: null
155
+ resume:
156
+ continue_from: ''
157
+ output:
158
+ exp_dir: ./exp/20250505-143631
159
+ tensorboard_dir: ./tensorboard/20250505-143631
160
+ save_checkpoint:
161
+ iteration:
162
+ every: 10000
163
+ path: ${...exp_dir}/model/iteration{iteration}.pth
164
+ epoch:
165
+ every: 10
166
+ path: ${...exp_dir}/model/epoch{epoch}.pth
167
+ last:
168
+ path: ${...exp_dir}/model/last.pth
169
+ best_epoch:
170
+ path: ${...exp_dir}/model/best_epoch.pth
171
+ steps:
172
+ epochs: 10
173
+ iterations: null
174
+ lr_scheduler: epoch
175
+ test:
176
+ dataset:
177
+ test:
178
+ _target_: torch.utils.data.Dataset
179
+ dataloader:
180
+ test:
181
+ _target_: torch.utils.data.DataLoader
182
+ batch_size: 1
183
+ shuffle: false
184
+ key_mapping:
185
+ inference:
186
+ input: null
187
+ output: null
188
+ identifier: null
189
+ checkpoint: null
190
+ remove_weight_norm: null
191
+ output:
192
+ exp_dir: ./exp
193
+ inference_dir: ${.exp_dir}/inference
194
+ audio:
195
+ sample_rate: ${data.audio.sample_rate}
196
+ key_mapping:
197
+ inference:
198
+ output: null
199
+ reference: null
200
+ transforms:
201
+ inference:
202
+ output: null
203
+ reference: null
204
+ ddp_kwargs: null
205
+ model:
206
+ _target_: birdclef2025.models.EfficientNetB0
207
+ weights: ${const:torchvision.models.EfficientNet_B0_Weights.IMAGENET1K_V1}
208
+ num_classes: ${const:birdclef2025.utils.data.birdclef.num_birdclef2025_primary_labels}
209
+ optimizer:
210
+ _target_: torch.optim.Adam
211
+ lr_scheduler: {}
212
+ criterion:
213
+ _target_: audyn.criterion.MultiCriteria
214
+ cross_entropy:
215
+ _target_: audyn.criterion.BaseCriterionWrapper
216
+ criterion:
217
+ _target_: torch.nn.CrossEntropyLoss
218
+ reduction: mean
219
+ weight: 1
220
+ key_mapping:
221
+ estimated:
222
+ input: logit
223
+ target:
224
+ target: ${train.dataloader.train.collate_fn.composer.label_index_key}
225
+ metrics:
226
+ roc_auc:
227
+ metric:
228
+ _target_: birdclef2025.metrics.ROCAUC
229
+ take_softmax: true
230
+ key_mapping:
231
+ estimated:
232
+ input: logit
233
+ target:
234
+ target: ${train.dataloader.train.collate_fn.composer.label_index_key}
235
+
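Note on the `save_checkpoint` paths above: each mixes two kinds of placeholder. `${...exp_dir}` is a relative interpolation resolved when the config is loaded, while `{iteration}` and `{epoch}` are left in the string for the trainer to fill in at save time. The sketch below shows the second step only. It assumes the placeholders are filled with Python's `str.format`, which the log does not confirm; `checkpoint_path` is a hypothetical helper, not part of `birdclef2025` or `audyn`.

```python
# Value of `exp_dir` listed under `output` in the config above.
exp_dir = "./exp/20250505-143631"

# Path templates as they look once `${...exp_dir}` has been resolved.
templates = {
    "iteration": f"{exp_dir}/model/iteration{{iteration}}.pth",
    "epoch": f"{exp_dir}/model/epoch{{epoch}}.pth",
    "last": f"{exp_dir}/model/last.pth",
}


def checkpoint_path(kind: str, **counters: int) -> str:
    """Fill the remaining {placeholders} (assumed to use str.format)."""
    return templates[kind].format(**counters)


print(checkpoint_path("epoch", epoch=10))
print(checkpoint_path("iteration", iteration=10000))
print(checkpoint_path("last"))
```

With `every: 10` for epochs and `epochs: 10`, the only epoch checkpoint this run would write under that assumption is `epoch10.pth`.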
236
+ [2025-05-05 14:37:02,691][BaseTrainer][INFO] - EfficientNetB0(
237
+ (backbone): Sequential(
238
+ (0): Conv2dNormActivation(
239
+ (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
240
+ (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
241
+ (2): SiLU(inplace=True)
242
+ )
243
+ (1): Sequential(
244
+ (0): MBConv(
245
+ (block): Sequential(
246
+ (0): Conv2dNormActivation(
247
+ (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
248
+ (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
249
+ (2): SiLU(inplace=True)
250
+ )
251
+ (1): SqueezeExcitation(
252
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
253
+ (fc1): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1))
254
+ (fc2): Conv2d(8, 32, kernel_size=(1, 1), stride=(1, 1))
255
+ (activation): SiLU(inplace=True)
256
+ (scale_activation): Sigmoid()
257
+ )
258
+ (2): Conv2dNormActivation(
259
+ (0): Conv2d(32, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
260
+ (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
261
+ )
262
+ )
263
+ (stochastic_depth): StochasticDepth(p=0.0, mode=row)
264
+ )
265
+ )
266
+ (2): Sequential(
267
+ (0): MBConv(
268
+ (block): Sequential(
269
+ (0): Conv2dNormActivation(
270
+ (0): Conv2d(16, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
271
+ (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
272
+ (2): SiLU(inplace=True)
273
+ )
274
+ (1): Conv2dNormActivation(
275
+ (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=96, bias=False)
276
+ (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
277
+ (2): SiLU(inplace=True)
278
+ )
279
+ (2): SqueezeExcitation(
280
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
281
+ (fc1): Conv2d(96, 4, kernel_size=(1, 1), stride=(1, 1))
282
+ (fc2): Conv2d(4, 96, kernel_size=(1, 1), stride=(1, 1))
283
+ (activation): SiLU(inplace=True)
284
+ (scale_activation): Sigmoid()
285
+ )
286
+ (3): Conv2dNormActivation(
287
+ (0): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
288
+ (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
289
+ )
290
+ )
291
+ (stochastic_depth): StochasticDepth(p=0.0125, mode=row)
292
+ )
293
+ (1): MBConv(
294
+ (block): Sequential(
295
+ (0): Conv2dNormActivation(
296
+ (0): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)
297
+ (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
298
+ (2): SiLU(inplace=True)
299
+ )
300
+ (1): Conv2dNormActivation(
301
+ (0): Conv2d(144, 144, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=144, bias=False)
302
+ (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
303
+ (2): SiLU(inplace=True)
304
+ )
305
+ (2): SqueezeExcitation(
306
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
307
+ (fc1): Conv2d(144, 6, kernel_size=(1, 1), stride=(1, 1))
308
+ (fc2): Conv2d(6, 144, kernel_size=(1, 1), stride=(1, 1))
309
+ (activation): SiLU(inplace=True)
310
+ (scale_activation): Sigmoid()
311
+ )
312
+ (3): Conv2dNormActivation(
313
+ (0): Conv2d(144, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
314
+ (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
315
+ )
316
+ )
317
+ (stochastic_depth): StochasticDepth(p=0.025, mode=row)
318
+ )
319
+ )
320
+ (3): Sequential(
321
+ (0): MBConv(
322
+ (block): Sequential(
323
+ (0): Conv2dNormActivation(
324
+ (0): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)
325
+ (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
326
+ (2): SiLU(inplace=True)
327
+ )
328
+ (1): Conv2dNormActivation(
329
+ (0): Conv2d(144, 144, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=144, bias=False)
330
+ (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
331
+ (2): SiLU(inplace=True)
332
+ )
333
+ (2): SqueezeExcitation(
334
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
335
+ (fc1): Conv2d(144, 6, kernel_size=(1, 1), stride=(1, 1))
336
+ (fc2): Conv2d(6, 144, kernel_size=(1, 1), stride=(1, 1))
337
+ (activation): SiLU(inplace=True)
338
+ (scale_activation): Sigmoid()
339
+ )
340
+ (3): Conv2dNormActivation(
341
+ (0): Conv2d(144, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
342
+ (1): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
343
+ )
344
+ )
345
+ (stochastic_depth): StochasticDepth(p=0.037500000000000006, mode=row)
346
+ )
347
+ (1): MBConv(
348
+ (block): Sequential(
349
+ (0): Conv2dNormActivation(
350
+ (0): Conv2d(40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
351
+ (1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
352
+ (2): SiLU(inplace=True)
353
+ )
354
+ (1): Conv2dNormActivation(
355
+ (0): Conv2d(240, 240, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=240, bias=False)
356
+ (1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
357
+ (2): SiLU(inplace=True)
358
+ )
359
+ (2): SqueezeExcitation(
360
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
361
+ (fc1): Conv2d(240, 10, kernel_size=(1, 1), stride=(1, 1))
362
+ (fc2): Conv2d(10, 240, kernel_size=(1, 1), stride=(1, 1))
363
+ (activation): SiLU(inplace=True)
364
+ (scale_activation): Sigmoid()
365
+ )
366
+ (3): Conv2dNormActivation(
367
+ (0): Conv2d(240, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
368
+ (1): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
369
+ )
370
+ )
371
+ (stochastic_depth): StochasticDepth(p=0.05, mode=row)
372
+ )
373
+ )
374
+ (4): Sequential(
375
+ (0): MBConv(
376
+ (block): Sequential(
377
+ (0): Conv2dNormActivation(
378
+ (0): Conv2d(40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
379
+ (1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
380
+ (2): SiLU(inplace=True)
381
+ )
382
+ (1): Conv2dNormActivation(
383
+ (0): Conv2d(240, 240, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=240, bias=False)
384
+ (1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
385
+ (2): SiLU(inplace=True)
386
+ )
387
+ (2): SqueezeExcitation(
388
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
389
+ (fc1): Conv2d(240, 10, kernel_size=(1, 1), stride=(1, 1))
390
+ (fc2): Conv2d(10, 240, kernel_size=(1, 1), stride=(1, 1))
391
+ (activation): SiLU(inplace=True)
392
+ (scale_activation): Sigmoid()
393
+ )
394
+ (3): Conv2dNormActivation(
395
+ (0): Conv2d(240, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
396
+ (1): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
397
+ )
398
+ )
399
+ (stochastic_depth): StochasticDepth(p=0.0625, mode=row)
400
+ )
401
+ (1): MBConv(
402
+ (block): Sequential(
403
+ (0): Conv2dNormActivation(
404
+ (0): Conv2d(80, 480, kernel_size=(1, 1), stride=(1, 1), bias=False)
405
+ (1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
406
+ (2): SiLU(inplace=True)
407
+ )
408
+ (1): Conv2dNormActivation(
409
+ (0): Conv2d(480, 480, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=480, bias=False)
410
+ (1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
411
+ (2): SiLU(inplace=True)
412
+ )
413
+ (2): SqueezeExcitation(
414
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
415
+ (fc1): Conv2d(480, 20, kernel_size=(1, 1), stride=(1, 1))
416
+ (fc2): Conv2d(20, 480, kernel_size=(1, 1), stride=(1, 1))
417
+ (activation): SiLU(inplace=True)
418
+ (scale_activation): Sigmoid()
419
+ )
420
+ (3): Conv2dNormActivation(
421
+ (0): Conv2d(480, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
422
+ (1): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
423
+ )
424
+ )
425
+ (stochastic_depth): StochasticDepth(p=0.07500000000000001, mode=row)
426
+ )
427
+ (2): MBConv(
428
+ (block): Sequential(
429
+ (0): Conv2dNormActivation(
430
+ (0): Conv2d(80, 480, kernel_size=(1, 1), stride=(1, 1), bias=False)
431
+ (1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
432
+ (2): SiLU(inplace=True)
433
+ )
434
+ (1): Conv2dNormActivation(
435
+ (0): Conv2d(480, 480, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=480, bias=False)
436
+ (1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
437
+ (2): SiLU(inplace=True)
438
+ )
439
+ (2): SqueezeExcitation(
440
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
441
+ (fc1): Conv2d(480, 20, kernel_size=(1, 1), stride=(1, 1))
442
+ (fc2): Conv2d(20, 480, kernel_size=(1, 1), stride=(1, 1))
443
+ (activation): SiLU(inplace=True)
444
+ (scale_activation): Sigmoid()
445
+ )
446
+ (3): Conv2dNormActivation(
447
+ (0): Conv2d(480, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
448
+ (1): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
449
+ )
450
+ )
451
+ (stochastic_depth): StochasticDepth(p=0.08750000000000001, mode=row)
452
+ )
453
+ )
454
+ (5): Sequential(
455
+ (0): MBConv(
456
+ (block): Sequential(
457
+ (0): Conv2dNormActivation(
458
+ (0): Conv2d(80, 480, kernel_size=(1, 1), stride=(1, 1), bias=False)
459
+ (1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
460
+ (2): SiLU(inplace=True)
461
+ )
462
+ (1): Conv2dNormActivation(
463
+ (0): Conv2d(480, 480, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=480, bias=False)
464
+ (1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
465
+ (2): SiLU(inplace=True)
466
+ )
467
+ (2): SqueezeExcitation(
468
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
469
+ (fc1): Conv2d(480, 20, kernel_size=(1, 1), stride=(1, 1))
470
+ (fc2): Conv2d(20, 480, kernel_size=(1, 1), stride=(1, 1))
471
+ (activation): SiLU(inplace=True)
472
+ (scale_activation): Sigmoid()
473
+ )
474
+ (3): Conv2dNormActivation(
475
+ (0): Conv2d(480, 112, kernel_size=(1, 1), stride=(1, 1), bias=False)
476
+ (1): BatchNorm2d(112, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
477
+ )
478
+ )
479
+ (stochastic_depth): StochasticDepth(p=0.1, mode=row)
480
+ )
481
+ (1): MBConv(
482
+ (block): Sequential(
483
+ (0): Conv2dNormActivation(
484
+ (0): Conv2d(112, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
485
+ (1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
486
+ (2): SiLU(inplace=True)
487
+ )
488
+ (1): Conv2dNormActivation(
489
+ (0): Conv2d(672, 672, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=672, bias=False)
490
+ (1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
491
+ (2): SiLU(inplace=True)
492
+ )
493
+ (2): SqueezeExcitation(
494
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
495
+ (fc1): Conv2d(672, 28, kernel_size=(1, 1), stride=(1, 1))
496
+ (fc2): Conv2d(28, 672, kernel_size=(1, 1), stride=(1, 1))
497
+ (activation): SiLU(inplace=True)
498
+ (scale_activation): Sigmoid()
499
+ )
500
+ (3): Conv2dNormActivation(
501
+ (0): Conv2d(672, 112, kernel_size=(1, 1), stride=(1, 1), bias=False)
502
+ (1): BatchNorm2d(112, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
503
+ )
504
+ )
505
+ (stochastic_depth): StochasticDepth(p=0.1125, mode=row)
506
+ )
507
+ (2): MBConv(
508
+ (block): Sequential(
509
+ (0): Conv2dNormActivation(
510
+ (0): Conv2d(112, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
511
+ (1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
512
+ (2): SiLU(inplace=True)
513
+ )
514
+ (1): Conv2dNormActivation(
515
+ (0): Conv2d(672, 672, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=672, bias=False)
516
+ (1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
517
+ (2): SiLU(inplace=True)
518
+ )
519
+ (2): SqueezeExcitation(
520
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
521
+ (fc1): Conv2d(672, 28, kernel_size=(1, 1), stride=(1, 1))
522
+ (fc2): Conv2d(28, 672, kernel_size=(1, 1), stride=(1, 1))
523
+ (activation): SiLU(inplace=True)
524
+ (scale_activation): Sigmoid()
525
+ )
526
+ (3): Conv2dNormActivation(
527
+ (0): Conv2d(672, 112, kernel_size=(1, 1), stride=(1, 1), bias=False)
528
+ (1): BatchNorm2d(112, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
529
+ )
530
+ )
531
+ (stochastic_depth): StochasticDepth(p=0.125, mode=row)
532
+ )
533
+ )
534
+ (6): Sequential(
535
+ (0): MBConv(
536
+ (block): Sequential(
537
+ (0): Conv2dNormActivation(
538
+ (0): Conv2d(112, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
539
+ (1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
540
+ (2): SiLU(inplace=True)
541
+ )
542
+ (1): Conv2dNormActivation(
543
+ (0): Conv2d(672, 672, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=672, bias=False)
544
+ (1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
545
+ (2): SiLU(inplace=True)
546
+ )
547
+ (2): SqueezeExcitation(
548
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
549
+ (fc1): Conv2d(672, 28, kernel_size=(1, 1), stride=(1, 1))
550
+ (fc2): Conv2d(28, 672, kernel_size=(1, 1), stride=(1, 1))
551
+ (activation): SiLU(inplace=True)
552
+ (scale_activation): Sigmoid()
553
+ )
554
+ (3): Conv2dNormActivation(
555
+ (0): Conv2d(672, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
556
+ (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
557
+ )
558
+ )
559
+ (stochastic_depth): StochasticDepth(p=0.1375, mode=row)
560
+ )
561
+ (1): MBConv(
562
+ (block): Sequential(
563
+ (0): Conv2dNormActivation(
564
+ (0): Conv2d(192, 1152, kernel_size=(1, 1), stride=(1, 1), bias=False)
565
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
566
+ (2): SiLU(inplace=True)
567
+ )
568
+ (1): Conv2dNormActivation(
569
+ (0): Conv2d(1152, 1152, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=1152, bias=False)
570
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
571
+ (2): SiLU(inplace=True)
572
+ )
573
+ (2): SqueezeExcitation(
574
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
575
+ (fc1): Conv2d(1152, 48, kernel_size=(1, 1), stride=(1, 1))
576
+ (fc2): Conv2d(48, 1152, kernel_size=(1, 1), stride=(1, 1))
577
+ (activation): SiLU(inplace=True)
578
+ (scale_activation): Sigmoid()
579
+ )
580
+ (3): Conv2dNormActivation(
581
+ (0): Conv2d(1152, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
582
+ (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
583
+ )
584
+ )
585
+ (stochastic_depth): StochasticDepth(p=0.15000000000000002, mode=row)
586
+ )
587
+ (2): MBConv(
588
+ (block): Sequential(
589
+ (0): Conv2dNormActivation(
590
+ (0): Conv2d(192, 1152, kernel_size=(1, 1), stride=(1, 1), bias=False)
591
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
592
+ (2): SiLU(inplace=True)
593
+ )
594
+ (1): Conv2dNormActivation(
595
+ (0): Conv2d(1152, 1152, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=1152, bias=False)
596
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
597
+ (2): SiLU(inplace=True)
598
+ )
599
+ (2): SqueezeExcitation(
600
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
601
+ (fc1): Conv2d(1152, 48, kernel_size=(1, 1), stride=(1, 1))
602
+ (fc2): Conv2d(48, 1152, kernel_size=(1, 1), stride=(1, 1))
603
+ (activation): SiLU(inplace=True)
604
+ (scale_activation): Sigmoid()
605
+ )
606
+ (3): Conv2dNormActivation(
607
+ (0): Conv2d(1152, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
608
+ (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
609
+ )
610
+ )
611
+ (stochastic_depth): StochasticDepth(p=0.1625, mode=row)
612
+ )
613
+ (3): MBConv(
614
+ (block): Sequential(
615
+ (0): Conv2dNormActivation(
616
+ (0): Conv2d(192, 1152, kernel_size=(1, 1), stride=(1, 1), bias=False)
617
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
618
+ (2): SiLU(inplace=True)
619
+ )
620
+ (1): Conv2dNormActivation(
621
+ (0): Conv2d(1152, 1152, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=1152, bias=False)
622
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
623
+ (2): SiLU(inplace=True)
624
+ )
625
+ (2): SqueezeExcitation(
626
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
627
+ (fc1): Conv2d(1152, 48, kernel_size=(1, 1), stride=(1, 1))
628
+ (fc2): Conv2d(48, 1152, kernel_size=(1, 1), stride=(1, 1))
629
+ (activation): SiLU(inplace=True)
630
+ (scale_activation): Sigmoid()
631
+ )
632
+ (3): Conv2dNormActivation(
633
+ (0): Conv2d(1152, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
634
+ (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
635
+ )
636
+ )
637
+ (stochastic_depth): StochasticDepth(p=0.17500000000000002, mode=row)
638
+ )
639
+ )
640
+ (7): Sequential(
641
+ (0): MBConv(
642
+ (block): Sequential(
643
+ (0): Conv2dNormActivation(
644
+ (0): Conv2d(192, 1152, kernel_size=(1, 1), stride=(1, 1), bias=False)
645
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
646
+ (2): SiLU(inplace=True)
647
+ )
648
+ (1): Conv2dNormActivation(
649
+ (0): Conv2d(1152, 1152, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1152, bias=False)
650
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
651
+ (2): SiLU(inplace=True)
652
+ )
653
+ (2): SqueezeExcitation(
654
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
655
+ (fc1): Conv2d(1152, 48, kernel_size=(1, 1), stride=(1, 1))
656
+ (fc2): Conv2d(48, 1152, kernel_size=(1, 1), stride=(1, 1))
657
+ (activation): SiLU(inplace=True)
658
+ (scale_activation): Sigmoid()
659
+ )
660
+ (3): Conv2dNormActivation(
661
+ (0): Conv2d(1152, 320, kernel_size=(1, 1), stride=(1, 1), bias=False)
662
+ (1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
663
+ )
664
+ )
665
+ (stochastic_depth): StochasticDepth(p=0.1875, mode=row)
666
+ )
667
+ )
668
+ (8): Conv2dNormActivation(
669
+ (0): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1), bias=False)
670
+ (1): BatchNorm2d(1280, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
671
+ (2): SiLU(inplace=True)
672
+ )
673
+ )
674
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
675
+ (classifier): Sequential(
676
+ (0): Dropout(p=0.2, inplace=False)
677
+ (1): Linear(in_features=1280, out_features=206, bias=True)
678
+ )
679
+ )
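Note on the `StochasticDepth(p=...)` values in the printout above: they rise linearly from 0.0 to 0.1875 across the 16 MBConv blocks. That is consistent with a base probability of 0.2 scaled by block index over total block count, which is how torchvision's EfficientNet builds its blocks. The sketch below reproduces the schedule. The base value 0.2 is inferred from the printed numbers rather than read from the config.

```python
# MBConv blocks per stage, read from stages (1) to (7) of the printout above.
blocks_per_stage = [1, 2, 2, 3, 3, 4, 1]
total_blocks = sum(blocks_per_stage)  # 16

# Assumed base probability (inferred from the log, not stated in the config).
stochastic_depth_prob = 0.2

# Linear schedule: block i gets base * i / total_blocks.
probs = [stochastic_depth_prob * float(i) / total_blocks for i in range(total_blocks)]

block_id = 0
for stage, n_blocks in enumerate(blocks_per_stage, start=1):
    for local_idx in range(n_blocks):
        print(f"stage {stage}, block {local_idx}: p={probs[block_id]}")
        block_id += 1
```

Values such as `0.037500000000000006` in the log are ordinary floating-point rounding of this product, not a configuration quirk.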
680
+ [2025-05-05 14:37:02,696][BaseTrainer][INFO] - # of parameters: 4271434.
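Two quick sanity checks follow from the 206-way `Linear` head in the printout. First, the reported parameter count can be derived from torchvision's documented count for `efficientnet_b0` with 1000 classes (5,288,548) by swapping the classifier. This assumes the backbone is the unmodified torchvision network, as the `weights` entry in the config suggests. Second, a classifier with no information yet should have a cross-entropy near ln(206), which is close to the first logged loss of about 5.33. The sketch below is plain arithmetic and does not load the model.

```python
import math

# torchvision's documented parameter count for efficientnet_b0 (1000 classes).
IMAGENET_B0_PARAMS = 5_288_548
FEATURES = 1280      # in_features of the classifier in the printout
NUM_CLASSES = 206    # out_features of the classifier in the printout


def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


# Replace the 1000-class head by the 206-class head.
expected_params = (
    IMAGENET_B0_PARAMS
    - linear_params(FEATURES, 1000)
    + linear_params(FEATURES, NUM_CLASSES)
)

# Cross-entropy of a uniform prediction over all classes.
chance_level_loss = math.log(NUM_CLASSES)

print(expected_params)
print(chance_level_loss)
```

The first value agrees with the `# of parameters: 4271434` line. The first few logged losses sit near the second value and then fall below it.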
681
+ [2025-05-05 14:37:10,798][BaseTrainer][INFO] - [Epoch 1/10, Iter 1/3560] 5.328178882598877, cross_entropy: 5.328178882598877
682
+ [2025-05-05 14:37:11,073][BaseTrainer][INFO] - [Epoch 1/10, Iter 2/3560] 5.33966588973999, cross_entropy: 5.33966588973999
683
+ [2025-05-05 14:37:11,353][BaseTrainer][INFO] - [Epoch 1/10, Iter 3/3560] 5.219983100891113, cross_entropy: 5.219983100891113
684
+ [2025-05-05 14:37:11,656][BaseTrainer][INFO] - [Epoch 1/10, Iter 4/3560] 5.240371227264404, cross_entropy: 5.240371227264404
685
+ [2025-05-05 14:37:11,947][BaseTrainer][INFO] - [Epoch 1/10, Iter 5/3560] 5.107341766357422, cross_entropy: 5.107341766357422
686
+ [2025-05-05 14:37:14,329][BaseTrainer][INFO] - [Epoch 1/10, Iter 6/3560] 5.0956878662109375, cross_entropy: 5.0956878662109375
687
+ [2025-05-05 14:37:14,728][BaseTrainer][INFO] - [Epoch 1/10, Iter 7/3560] 4.975931167602539, cross_entropy: 4.975931167602539
688
+ [2025-05-05 14:37:17,419][BaseTrainer][INFO] - [Epoch 1/10, Iter 8/3560] 5.039261817932129, cross_entropy: 5.039261817932129
689
+ [2025-05-05 14:37:17,696][BaseTrainer][INFO] - [Epoch 1/10, Iter 9/3560] 4.870660781860352, cross_entropy: 4.870660781860352
690
+ [2025-05-05 14:37:20,346][BaseTrainer][INFO] - [Epoch 1/10, Iter 10/3560] 4.842012882232666, cross_entropy: 4.842012882232666
691
+ [2025-05-05 14:37:20,619][BaseTrainer][INFO] - [Epoch 1/10, Iter 11/3560] 5.027741432189941, cross_entropy: 5.027741432189941
692
+ [2025-05-05 14:37:23,114][BaseTrainer][INFO] - [Epoch 1/10, Iter 12/3560] 4.691315650939941, cross_entropy: 4.691315650939941
693
+ [2025-05-05 14:37:23,395][BaseTrainer][INFO] - [Epoch 1/10, Iter 13/3560] 4.988541603088379, cross_entropy: 4.988541603088379
694
+ [2025-05-05 14:37:26,811][BaseTrainer][INFO] - [Epoch 1/10, Iter 14/3560] 4.8373517990112305, cross_entropy: 4.8373517990112305
695
+ [2025-05-05 14:37:27,084][BaseTrainer][INFO] - [Epoch 1/10, Iter 15/3560] 4.842232704162598, cross_entropy: 4.842232704162598
696
+ [2025-05-05 14:37:29,658][BaseTrainer][INFO] - [Epoch 1/10, Iter 16/3560] 4.837664604187012, cross_entropy: 4.837664604187012
697
+ [2025-05-05 14:37:29,927][BaseTrainer][INFO] - [Epoch 1/10, Iter 17/3560] 5.147849082946777, cross_entropy: 5.147849082946777
698
+ [2025-05-05 14:37:32,714][BaseTrainer][INFO] - [Epoch 1/10, Iter 18/3560] 5.2447309494018555, cross_entropy: 5.2447309494018555
699
+ [2025-05-05 14:37:32,990][BaseTrainer][INFO] - [Epoch 1/10, Iter 19/3560] 4.6186909675598145, cross_entropy: 4.6186909675598145
700
+ [2025-05-05 14:37:35,387][BaseTrainer][INFO] - [Epoch 1/10, Iter 20/3560] 4.760909080505371, cross_entropy: 4.760909080505371
701
+ [2025-05-05 14:37:36,116][BaseTrainer][INFO] - [Epoch 1/10, Iter 21/3560] 4.685481071472168, cross_entropy: 4.685481071472168
702
+ [2025-05-05 14:37:37,753][BaseTrainer][INFO] - [Epoch 1/10, Iter 22/3560] 4.950292587280273, cross_entropy: 4.950292587280273
703
+ [2025-05-05 14:37:38,975][BaseTrainer][INFO] - [Epoch 1/10, Iter 23/3560] 4.624570369720459, cross_entropy: 4.624570369720459
704
+ [2025-05-05 14:37:40,632][BaseTrainer][INFO] - [Epoch 1/10, Iter 24/3560] 4.730811595916748, cross_entropy: 4.730811595916748
705
+ [2025-05-05 14:37:41,731][BaseTrainer][INFO] - [Epoch 1/10, Iter 25/3560] 4.539666175842285, cross_entropy: 4.539666175842285
706
+ [2025-05-05 14:37:44,305][BaseTrainer][INFO] - [Epoch 1/10, Iter 26/3560] 4.674914360046387, cross_entropy: 4.674914360046387
707
+ [2025-05-05 14:37:44,577][BaseTrainer][INFO] - [Epoch 1/10, Iter 27/3560] 4.50570821762085, cross_entropy: 4.50570821762085
708
+ [2025-05-05 14:37:48,211][BaseTrainer][INFO] - [Epoch 1/10, Iter 28/3560] 4.580283164978027, cross_entropy: 4.580283164978027
709
+ [2025-05-05 14:37:48,489][BaseTrainer][INFO] - [Epoch 1/10, Iter 29/3560] 4.450684547424316, cross_entropy: 4.450684547424316
710
+ [2025-05-05 14:37:51,685][BaseTrainer][INFO] - [Epoch 1/10, Iter 30/3560] 4.619328498840332, cross_entropy: 4.619328498840332
711
+ [2025-05-05 14:37:51,964][BaseTrainer][INFO] - [Epoch 1/10, Iter 31/3560] 4.42674446105957, cross_entropy: 4.42674446105957
712
+ [2025-05-05 14:37:54,600][BaseTrainer][INFO] - [Epoch 1/10, Iter 32/3560] 4.826480865478516, cross_entropy: 4.826480865478516
713
+ [2025-05-05 14:37:54,879][BaseTrainer][INFO] - [Epoch 1/10, Iter 33/3560] 4.9267120361328125, cross_entropy: 4.9267120361328125
714
+ [2025-05-05 14:37:57,854][BaseTrainer][INFO] - [Epoch 1/10, Iter 34/3560] 4.484564781188965, cross_entropy: 4.484564781188965
715
+ [2025-05-05 14:37:58,122][BaseTrainer][INFO] - [Epoch 1/10, Iter 35/3560] 4.345626354217529, cross_entropy: 4.345626354217529
716
+ [2025-05-05 14:38:02,670][BaseTrainer][INFO] - [Epoch 1/10, Iter 36/3560] 4.146004676818848, cross_entropy: 4.146004676818848
717
+ [2025-05-05 14:38:02,941][BaseTrainer][INFO] - [Epoch 1/10, Iter 37/3560] 4.643206596374512, cross_entropy: 4.643206596374512
718
+ [2025-05-05 14:38:05,569][BaseTrainer][INFO] - [Epoch 1/10, Iter 38/3560] 4.401455879211426, cross_entropy: 4.401455879211426
719
+ [2025-05-05 14:38:05,837][BaseTrainer][INFO] - [Epoch 1/10, Iter 39/3560] 4.735471725463867, cross_entropy: 4.735471725463867
720
+ [2025-05-05 14:38:10,666][BaseTrainer][INFO] - [Epoch 1/10, Iter 40/3560] 4.4861955642700195, cross_entropy: 4.4861955642700195
721
+ [2025-05-05 14:38:10,934][BaseTrainer][INFO] - [Epoch 1/10, Iter 41/3560] 4.3141584396362305, cross_entropy: 4.3141584396362305
722
+ [2025-05-05 14:38:13,862][BaseTrainer][INFO] - [Epoch 1/10, Iter 42/3560] 4.38871955871582, cross_entropy: 4.38871955871582
723
+ [2025-05-05 14:38:14,139][BaseTrainer][INFO] - [Epoch 1/10, Iter 43/3560] 4.384570598602295, cross_entropy: 4.384570598602295
724
+ [2025-05-05 14:38:16,384][BaseTrainer][INFO] - [Epoch 1/10, Iter 44/3560] 4.807826995849609, cross_entropy: 4.807826995849609
725
+ [2025-05-05 14:38:16,662][BaseTrainer][INFO] - [Epoch 1/10, Iter 45/3560] 4.3263444900512695, cross_entropy: 4.3263444900512695
726
+ [2025-05-05 14:38:19,477][BaseTrainer][INFO] - [Epoch 1/10, Iter 46/3560] 4.420223236083984, cross_entropy: 4.420223236083984
727
+ [2025-05-05 14:38:19,747][BaseTrainer][INFO] - [Epoch 1/10, Iter 47/3560] 4.262032508850098, cross_entropy: 4.262032508850098
728
+ [2025-05-05 14:38:22,884][BaseTrainer][INFO] - [Epoch 1/10, Iter 48/3560] 4.451767444610596, cross_entropy: 4.451767444610596
729
+ [2025-05-05 14:38:23,314][BaseTrainer][INFO] - [Epoch 1/10, Iter 49/3560] 4.011248588562012, cross_entropy: 4.011248588562012
730
+ [2025-05-05 14:38:25,496][BaseTrainer][INFO] - [Epoch 1/10, Iter 50/3560] 4.409610271453857, cross_entropy: 4.409610271453857
731
+ [2025-05-05 14:38:26,194][BaseTrainer][INFO] - [Epoch 1/10, Iter 51/3560] 4.249641418457031, cross_entropy: 4.249641418457031
732
+ [2025-05-05 14:38:28,575][BaseTrainer][INFO] - [Epoch 1/10, Iter 52/3560] 4.518410682678223, cross_entropy: 4.518410682678223
733
+ [2025-05-05 14:38:29,259][BaseTrainer][INFO] - [Epoch 1/10, Iter 53/3560] 4.082416534423828, cross_entropy: 4.082416534423828
734
+ [2025-05-05 14:38:32,156][BaseTrainer][INFO] - [Epoch 1/10, Iter 54/3560] 4.237471580505371, cross_entropy: 4.237471580505371
735
+ [2025-05-05 14:38:32,426][BaseTrainer][INFO] - [Epoch 1/10, Iter 55/3560] 4.123418807983398, cross_entropy: 4.123418807983398
736
+ [2025-05-05 14:38:34,921][BaseTrainer][INFO] - [Epoch 1/10, Iter 56/3560] 4.482702255249023, cross_entropy: 4.482702255249023
737
+ [2025-05-05 14:38:35,201][BaseTrainer][INFO] - [Epoch 1/10, Iter 57/3560] 4.485210418701172, cross_entropy: 4.485210418701172
738
+ [2025-05-05 14:38:37,832][BaseTrainer][INFO] - [Epoch 1/10, Iter 58/3560] 4.0404052734375, cross_entropy: 4.0404052734375
739
+ [2025-05-05 14:38:38,115][BaseTrainer][INFO] - [Epoch 1/10, Iter 59/3560] 4.237558364868164, cross_entropy: 4.237558364868164
740
+ [2025-05-05 14:38:41,142][BaseTrainer][INFO] - [Epoch 1/10, Iter 60/3560] 3.791278839111328, cross_entropy: 3.791278839111328
741
+ [2025-05-05 14:38:41,419][BaseTrainer][INFO] - [Epoch 1/10, Iter 61/3560] 4.558656692504883, cross_entropy: 4.558656692504883
742
+ [2025-05-05 14:38:44,136][BaseTrainer][INFO] - [Epoch 1/10, Iter 62/3560] 4.246457576751709, cross_entropy: 4.246457576751709
743
+ [2025-05-05 14:38:44,404][BaseTrainer][INFO] - [Epoch 1/10, Iter 63/3560] 3.813282012939453, cross_entropy: 3.813282012939453
744
+ [2025-05-05 14:38:47,141][BaseTrainer][INFO] - [Epoch 1/10, Iter 64/3560] 3.7647242546081543, cross_entropy: 3.7647242546081543
745
+ [2025-05-05 14:38:47,421][BaseTrainer][INFO] - [Epoch 1/10, Iter 65/3560] 4.325078010559082, cross_entropy: 4.325078010559082
746
+ [2025-05-05 14:38:50,116][BaseTrainer][INFO] - [Epoch 1/10, Iter 66/3560] 4.272719860076904, cross_entropy: 4.272719860076904
747
+ [2025-05-05 14:38:50,384][BaseTrainer][INFO] - [Epoch 1/10, Iter 67/3560] 4.1269636154174805, cross_entropy: 4.1269636154174805
748
+ [2025-05-05 14:38:53,883][BaseTrainer][INFO] - [Epoch 1/10, Iter 68/3560] 4.2616376876831055, cross_entropy: 4.2616376876831055
749
+ [2025-05-05 14:38:54,151][BaseTrainer][INFO] - [Epoch 1/10, Iter 69/3560] 3.812692165374756, cross_entropy: 3.812692165374756
750
+ [2025-05-05 14:38:56,405][BaseTrainer][INFO] - [Epoch 1/10, Iter 70/3560] 3.7689599990844727, cross_entropy: 3.7689599990844727
751
+ [2025-05-05 14:38:56,683][BaseTrainer][INFO] - [Epoch 1/10, Iter 71/3560] 4.441872596740723, cross_entropy: 4.441872596740723
752
+ [2025-05-05 14:38:59,598][BaseTrainer][INFO] - [Epoch 1/10, Iter 72/3560] 3.825744390487671, cross_entropy: 3.825744390487671
753
+ [2025-05-05 14:38:59,878][BaseTrainer][INFO] - [Epoch 1/10, Iter 73/3560] 3.7579703330993652, cross_entropy: 3.7579703330993652
754
+ [2025-05-05 14:39:02,636][BaseTrainer][INFO] - [Epoch 1/10, Iter 74/3560] 4.119457244873047, cross_entropy: 4.119457244873047
755
+ [2025-05-05 14:39:02,911][BaseTrainer][INFO] - [Epoch 1/10, Iter 75/3560] 4.1348466873168945, cross_entropy: 4.1348466873168945
756
+ [2025-05-05 14:39:05,847][BaseTrainer][INFO] - [Epoch 1/10, Iter 76/3560] 4.458311080932617, cross_entropy: 4.458311080932617
757
+ [2025-05-05 14:39:06,121][BaseTrainer][INFO] - [Epoch 1/10, Iter 77/3560] 4.572954177856445, cross_entropy: 4.572954177856445
758
+ [2025-05-05 14:39:09,043][BaseTrainer][INFO] - [Epoch 1/10, Iter 78/3560] 3.984133005142212, cross_entropy: 3.984133005142212
759
+ [2025-05-05 14:39:09,312][BaseTrainer][INFO] - [Epoch 1/10, Iter 79/3560] 4.284179210662842, cross_entropy: 4.284179210662842
760
+ [2025-05-05 14:39:11,812][BaseTrainer][INFO] - [Epoch 1/10, Iter 80/3560] 4.226500034332275, cross_entropy: 4.226500034332275
761
+ [2025-05-05 14:39:12,089][BaseTrainer][INFO] - [Epoch 1/10, Iter 81/3560] 3.7766613960266113, cross_entropy: 3.7766613960266113
762
+ [2025-05-05 14:39:14,960][BaseTrainer][INFO] - [Epoch 1/10, Iter 82/3560] 4.343001842498779, cross_entropy: 4.343001842498779
763
+ [2025-05-05 14:39:15,299][BaseTrainer][INFO] - [Epoch 1/10, Iter 83/3560] 4.56613826751709, cross_entropy: 4.56613826751709
764
+ [2025-05-05 14:39:17,899][BaseTrainer][INFO] - [Epoch 1/10, Iter 84/3560] 4.267967224121094, cross_entropy: 4.267967224121094
765
+ [2025-05-05 14:39:18,333][BaseTrainer][INFO] - [Epoch 1/10, Iter 85/3560] 4.476839065551758, cross_entropy: 4.476839065551758
766
+ [2025-05-05 14:39:23,063][BaseTrainer][INFO] - [Epoch 1/10, Iter 86/3560] 4.031705379486084, cross_entropy: 4.031705379486084
767
+ [2025-05-05 14:39:23,740][BaseTrainer][INFO] - [Epoch 1/10, Iter 87/3560] 4.063455581665039, cross_entropy: 4.063455581665039
768
+ [2025-05-05 14:39:27,842][BaseTrainer][INFO] - [Epoch 1/10, Iter 88/3560] 3.944749593734741, cross_entropy: 3.944749593734741
769
+ [2025-05-05 14:39:28,180][BaseTrainer][INFO] - [Epoch 1/10, Iter 89/3560] 4.538140296936035, cross_entropy: 4.538140296936035
770
+ [2025-05-05 14:39:32,840][BaseTrainer][INFO] - [Epoch 1/10, Iter 90/3560] 3.9554660320281982, cross_entropy: 3.9554660320281982
771
+ [2025-05-05 14:39:33,107][BaseTrainer][INFO] - [Epoch 1/10, Iter 91/3560] 4.204513072967529, cross_entropy: 4.204513072967529
772
+ [2025-05-05 14:39:36,438][BaseTrainer][INFO] - [Epoch 1/10, Iter 92/3560] 3.485536813735962, cross_entropy: 3.485536813735962
773
+ [2025-05-05 14:39:37,354][BaseTrainer][INFO] - [Epoch 1/10, Iter 93/3560] 3.624830484390259, cross_entropy: 3.624830484390259
774
+ [2025-05-05 14:39:40,764][BaseTrainer][INFO] - [Epoch 1/10, Iter 94/3560] 4.362308979034424, cross_entropy: 4.362308979034424
775
+ [2025-05-05 14:39:42,133][BaseTrainer][INFO] - [Epoch 1/10, Iter 95/3560] 3.959744453430176, cross_entropy: 3.959744453430176
+ [2025-05-05 14:39:45,738][BaseTrainer][INFO] - [Epoch 1/10, Iter 96/3560] 3.487950086593628, cross_entropy: 3.487950086593628
+ [2025-05-05 14:39:46,171][BaseTrainer][INFO] - [Epoch 1/10, Iter 97/3560] 4.462384223937988, cross_entropy: 4.462384223937988
+ [2025-05-05 14:39:50,291][BaseTrainer][INFO] - [Epoch 1/10, Iter 98/3560] 3.4589223861694336, cross_entropy: 3.4589223861694336
+ [2025-05-05 14:39:50,654][BaseTrainer][INFO] - [Epoch 1/10, Iter 99/3560] 4.0036749839782715, cross_entropy: 4.0036749839782715
+ [2025-05-05 14:39:54,324][BaseTrainer][INFO] - [Epoch 1/10, Iter 100/3560] 3.3927807807922363, cross_entropy: 3.3927807807922363
+ [2025-05-05 14:39:54,592][BaseTrainer][INFO] - [Epoch 1/10, Iter 101/3560] 4.132311820983887, cross_entropy: 4.132311820983887
+ [2025-05-05 14:39:58,236][BaseTrainer][INFO] - [Epoch 1/10, Iter 102/3560] 3.495938777923584, cross_entropy: 3.495938777923584
+ [2025-05-05 14:39:58,505][BaseTrainer][INFO] - [Epoch 1/10, Iter 103/3560] 3.9452595710754395, cross_entropy: 3.9452595710754395
+ [2025-05-05 14:40:01,936][BaseTrainer][INFO] - [Epoch 1/10, Iter 104/3560] 3.7972192764282227, cross_entropy: 3.7972192764282227
+ [2025-05-05 14:40:02,357][BaseTrainer][INFO] - [Epoch 1/10, Iter 105/3560] 3.637284278869629, cross_entropy: 3.637284278869629
+ [2025-05-05 14:40:04,931][BaseTrainer][INFO] - [Epoch 1/10, Iter 106/3560] 4.375557899475098, cross_entropy: 4.375557899475098
+ [2025-05-05 14:40:05,289][BaseTrainer][INFO] - [Epoch 1/10, Iter 107/3560] 3.8202695846557617, cross_entropy: 3.8202695846557617
+ [2025-05-05 14:40:09,041][BaseTrainer][INFO] - [Epoch 1/10, Iter 108/3560] 4.1955790519714355, cross_entropy: 4.1955790519714355
+ [2025-05-05 14:40:09,363][BaseTrainer][INFO] - [Epoch 1/10, Iter 109/3560] 3.5387604236602783, cross_entropy: 3.5387604236602783
+ [2025-05-05 14:40:13,113][BaseTrainer][INFO] - [Epoch 1/10, Iter 110/3560] 4.3064069747924805, cross_entropy: 4.3064069747924805
+ [2025-05-05 14:40:13,388][BaseTrainer][INFO] - [Epoch 1/10, Iter 111/3560] 3.31304931640625, cross_entropy: 3.31304931640625
+ [2025-05-05 14:40:16,601][BaseTrainer][INFO] - [Epoch 1/10, Iter 112/3560] 3.5499391555786133, cross_entropy: 3.5499391555786133
+ [2025-05-05 14:40:16,868][BaseTrainer][INFO] - [Epoch 1/10, Iter 113/3560] 3.889066219329834, cross_entropy: 3.889066219329834
+ [2025-05-05 14:40:19,587][BaseTrainer][INFO] - [Epoch 1/10, Iter 114/3560] 3.955798625946045, cross_entropy: 3.955798625946045
+ [2025-05-05 14:40:19,911][BaseTrainer][INFO] - [Epoch 1/10, Iter 115/3560] 3.282543420791626, cross_entropy: 3.282543420791626
+ [2025-05-05 14:40:22,864][BaseTrainer][INFO] - [Epoch 1/10, Iter 116/3560] 3.3032946586608887, cross_entropy: 3.3032946586608887
+ [2025-05-05 14:40:23,468][BaseTrainer][INFO] - [Epoch 1/10, Iter 117/3560] 3.679957866668701, cross_entropy: 3.679957866668701
+ [2025-05-05 14:40:26,587][BaseTrainer][INFO] - [Epoch 1/10, Iter 118/3560] 3.7447991371154785, cross_entropy: 3.7447991371154785
+ [2025-05-05 14:40:27,467][BaseTrainer][INFO] - [Epoch 1/10, Iter 119/3560] 4.192959785461426, cross_entropy: 4.192959785461426
+ [2025-05-05 14:40:30,111][BaseTrainer][INFO] - [Epoch 1/10, Iter 120/3560] 3.6837351322174072, cross_entropy: 3.6837351322174072
+ [2025-05-05 14:40:30,759][BaseTrainer][INFO] - [Epoch 1/10, Iter 121/3560] 4.0145134925842285, cross_entropy: 4.0145134925842285
+ [2025-05-05 14:40:33,590][BaseTrainer][INFO] - [Epoch 1/10, Iter 122/3560] 3.4162988662719727, cross_entropy: 3.4162988662719727
+ [2025-05-05 14:40:33,889][BaseTrainer][INFO] - [Epoch 1/10, Iter 123/3560] 4.023228168487549, cross_entropy: 4.023228168487549
+ [2025-05-05 14:40:36,481][BaseTrainer][INFO] - [Epoch 1/10, Iter 124/3560] 3.950404644012451, cross_entropy: 3.950404644012451
+ [2025-05-05 14:40:36,752][BaseTrainer][INFO] - [Epoch 1/10, Iter 125/3560] 4.139768600463867, cross_entropy: 4.139768600463867
+ [2025-05-05 14:40:39,220][BaseTrainer][INFO] - [Epoch 1/10, Iter 126/3560] 3.834073781967163, cross_entropy: 3.834073781967163
+ [2025-05-05 14:40:39,665][BaseTrainer][INFO] - [Epoch 1/10, Iter 127/3560] 3.3856115341186523, cross_entropy: 3.3856115341186523
+ [2025-05-05 14:40:41,525][BaseTrainer][INFO] - [Epoch 1/10, Iter 128/3560] 3.567660331726074, cross_entropy: 3.567660331726074
+ [2025-05-05 14:40:43,590][BaseTrainer][INFO] - [Epoch 1/10, Iter 129/3560] 4.1861395835876465, cross_entropy: 4.1861395835876465
+ [2025-05-05 14:40:44,305][BaseTrainer][INFO] - [Epoch 1/10, Iter 130/3560] 3.624889850616455, cross_entropy: 3.624889850616455
+ [2025-05-05 14:40:46,122][BaseTrainer][INFO] - [Epoch 1/10, Iter 131/3560] 3.314370632171631, cross_entropy: 3.314370632171631
+ [2025-05-05 14:40:47,106][BaseTrainer][INFO] - [Epoch 1/10, Iter 132/3560] 3.0787272453308105, cross_entropy: 3.0787272453308105
+ [2025-05-05 14:40:48,917][BaseTrainer][INFO] - [Epoch 1/10, Iter 133/3560] 4.307720184326172, cross_entropy: 4.307720184326172
+ [2025-05-05 14:40:49,321][BaseTrainer][INFO] - [Epoch 1/10, Iter 134/3560] 3.7539265155792236, cross_entropy: 3.7539265155792236
+ [2025-05-05 14:40:51,835][BaseTrainer][INFO] - [Epoch 1/10, Iter 135/3560] 4.342618465423584, cross_entropy: 4.342618465423584
+ [2025-05-05 14:40:52,105][BaseTrainer][INFO] - [Epoch 1/10, Iter 136/3560] 3.715447425842285, cross_entropy: 3.715447425842285
+ [2025-05-05 14:40:54,435][BaseTrainer][INFO] - [Epoch 1/10, Iter 137/3560] 2.9267280101776123, cross_entropy: 2.9267280101776123
+ [2025-05-05 14:40:54,712][BaseTrainer][INFO] - [Epoch 1/10, Iter 138/3560] 3.765219211578369, cross_entropy: 3.765219211578369
+ [2025-05-05 14:40:57,211][BaseTrainer][INFO] - [Epoch 1/10, Iter 139/3560] 4.222902297973633, cross_entropy: 4.222902297973633
+ [2025-05-05 14:40:57,651][BaseTrainer][INFO] - [Epoch 1/10, Iter 140/3560] 4.032978057861328, cross_entropy: 4.032978057861328
+ [2025-05-05 14:41:00,105][BaseTrainer][INFO] - [Epoch 1/10, Iter 141/3560] 4.051009178161621, cross_entropy: 4.051009178161621
+ [2025-05-05 14:41:00,523][BaseTrainer][INFO] - [Epoch 1/10, Iter 142/3560] 3.2375054359436035, cross_entropy: 3.2375054359436035
+ [2025-05-05 14:41:03,594][BaseTrainer][INFO] - [Epoch 1/10, Iter 143/3560] 3.4340789318084717, cross_entropy: 3.4340789318084717
+ [2025-05-05 14:41:03,948][BaseTrainer][INFO] - [Epoch 1/10, Iter 144/3560] 3.434288740158081, cross_entropy: 3.434288740158081
+ [2025-05-05 14:41:06,970][BaseTrainer][INFO] - [Epoch 1/10, Iter 145/3560] 4.064283847808838, cross_entropy: 4.064283847808838
+ [2025-05-05 14:41:07,241][BaseTrainer][INFO] - [Epoch 1/10, Iter 146/3560] 3.572262763977051, cross_entropy: 3.572262763977051
+ [2025-05-05 14:41:10,621][BaseTrainer][INFO] - [Epoch 1/10, Iter 147/3560] 4.141509056091309, cross_entropy: 4.141509056091309
+ [2025-05-05 14:41:10,889][BaseTrainer][INFO] - [Epoch 1/10, Iter 148/3560] 3.321254253387451, cross_entropy: 3.321254253387451
+ [2025-05-05 14:41:15,304][BaseTrainer][INFO] - [Epoch 1/10, Iter 149/3560] 3.8579628467559814, cross_entropy: 3.8579628467559814
+ [2025-05-05 14:41:15,571][BaseTrainer][INFO] - [Epoch 1/10, Iter 150/3560] 3.3113813400268555, cross_entropy: 3.3113813400268555
+ [2025-05-05 14:41:18,493][BaseTrainer][INFO] - [Epoch 1/10, Iter 151/3560] 3.0404224395751953, cross_entropy: 3.0404224395751953
+ [2025-05-05 14:41:18,769][BaseTrainer][INFO] - [Epoch 1/10, Iter 152/3560] 3.9235267639160156, cross_entropy: 3.9235267639160156
+ [2025-05-05 14:41:22,329][BaseTrainer][INFO] - [Epoch 1/10, Iter 153/3560] 4.493441581726074, cross_entropy: 4.493441581726074
+ [2025-05-05 14:41:22,598][BaseTrainer][INFO] - [Epoch 1/10, Iter 154/3560] 3.0068750381469727, cross_entropy: 3.0068750381469727
+ [2025-05-05 14:41:25,707][BaseTrainer][INFO] - [Epoch 1/10, Iter 155/3560] 3.323556423187256, cross_entropy: 3.323556423187256
+ [2025-05-05 14:41:25,979][BaseTrainer][INFO] - [Epoch 1/10, Iter 156/3560] 3.423142433166504, cross_entropy: 3.423142433166504
+ [2025-05-05 14:41:29,283][BaseTrainer][INFO] - [Epoch 1/10, Iter 157/3560] 3.444603681564331, cross_entropy: 3.444603681564331
+ [2025-05-05 14:41:29,551][BaseTrainer][INFO] - [Epoch 1/10, Iter 158/3560] 3.0718321800231934, cross_entropy: 3.0718321800231934
+ [2025-05-05 14:41:34,225][BaseTrainer][INFO] - [Epoch 1/10, Iter 159/3560] 3.3229565620422363, cross_entropy: 3.3229565620422363
+ [2025-05-05 14:41:34,500][BaseTrainer][INFO] - [Epoch 1/10, Iter 160/3560] 4.278200149536133, cross_entropy: 4.278200149536133
+ [2025-05-05 14:41:37,508][BaseTrainer][INFO] - [Epoch 1/10, Iter 161/3560] 2.9530930519104004, cross_entropy: 2.9530930519104004
+ [2025-05-05 14:41:37,776][BaseTrainer][INFO] - [Epoch 1/10, Iter 162/3560] 3.664855718612671, cross_entropy: 3.664855718612671
+ [2025-05-05 14:41:41,002][BaseTrainer][INFO] - [Epoch 1/10, Iter 163/3560] 4.009134769439697, cross_entropy: 4.009134769439697
+ [2025-05-05 14:41:41,270][BaseTrainer][INFO] - [Epoch 1/10, Iter 164/3560] 3.1355433464050293, cross_entropy: 3.1355433464050293
+ [2025-05-05 14:41:45,332][BaseTrainer][INFO] - [Epoch 1/10, Iter 165/3560] 3.9814882278442383, cross_entropy: 3.9814882278442383
+ [2025-05-05 14:41:45,610][BaseTrainer][INFO] - [Epoch 1/10, Iter 166/3560] 4.411584854125977, cross_entropy: 4.411584854125977
+ [2025-05-05 14:41:48,281][BaseTrainer][INFO] - [Epoch 1/10, Iter 167/3560] 3.896604537963867, cross_entropy: 3.896604537963867
+ [2025-05-05 14:41:48,555][BaseTrainer][INFO] - [Epoch 1/10, Iter 168/3560] 3.168945789337158, cross_entropy: 3.168945789337158
+ [2025-05-05 14:41:50,961][BaseTrainer][INFO] - [Epoch 1/10, Iter 169/3560] 3.1954164505004883, cross_entropy: 3.1954164505004883
+ [2025-05-05 14:41:51,230][BaseTrainer][INFO] - [Epoch 1/10, Iter 170/3560] 2.842662811279297, cross_entropy: 2.842662811279297
+ [2025-05-05 14:41:54,548][BaseTrainer][INFO] - [Epoch 1/10, Iter 171/3560] 3.2933473587036133, cross_entropy: 3.2933473587036133
+ [2025-05-05 14:41:54,820][BaseTrainer][INFO] - [Epoch 1/10, Iter 172/3560] 3.191836357116699, cross_entropy: 3.191836357116699
+ [2025-05-05 14:41:58,342][BaseTrainer][INFO] - [Epoch 1/10, Iter 173/3560] 3.993854522705078, cross_entropy: 3.993854522705078
+ [2025-05-05 14:41:58,621][BaseTrainer][INFO] - [Epoch 1/10, Iter 174/3560] 4.257905960083008, cross_entropy: 4.257905960083008
+ [2025-05-05 14:42:01,516][BaseTrainer][INFO] - [Epoch 1/10, Iter 175/3560] 4.172269821166992, cross_entropy: 4.172269821166992
+ [2025-05-05 14:42:01,784][BaseTrainer][INFO] - [Epoch 1/10, Iter 176/3560] 3.813190460205078, cross_entropy: 3.813190460205078
+ [2025-05-05 14:42:05,587][BaseTrainer][INFO] - [Epoch 1/10, Iter 177/3560] 3.288473606109619, cross_entropy: 3.288473606109619
+ [2025-05-05 14:42:05,866][BaseTrainer][INFO] - [Epoch 1/10, Iter 178/3560] 2.6537506580352783, cross_entropy: 2.6537506580352783
+ [2025-05-05 14:42:09,021][BaseTrainer][INFO] - [Epoch 1/10, Iter 179/3560] 2.9164233207702637, cross_entropy: 2.9164233207702637
+ [2025-05-05 14:42:09,291][BaseTrainer][INFO] - [Epoch 1/10, Iter 180/3560] 2.9781413078308105, cross_entropy: 2.9781413078308105
+ [2025-05-05 14:42:12,133][BaseTrainer][INFO] - [Epoch 1/10, Iter 181/3560] 3.925520181655884, cross_entropy: 3.925520181655884
+ [2025-05-05 14:42:12,400][BaseTrainer][INFO] - [Epoch 1/10, Iter 182/3560] 3.347870349884033, cross_entropy: 3.347870349884033
+ [2025-05-05 14:42:16,353][BaseTrainer][INFO] - [Epoch 1/10, Iter 183/3560] 3.3412108421325684, cross_entropy: 3.3412108421325684
+ [2025-05-05 14:42:16,628][BaseTrainer][INFO] - [Epoch 1/10, Iter 184/3560] 3.8552846908569336, cross_entropy: 3.8552846908569336
+ [2025-05-05 14:42:20,304][BaseTrainer][INFO] - [Epoch 1/10, Iter 185/3560] 4.064785003662109, cross_entropy: 4.064785003662109
+ [2025-05-05 14:42:20,572][BaseTrainer][INFO] - [Epoch 1/10, Iter 186/3560] 3.260899305343628, cross_entropy: 3.260899305343628
+ [2025-05-05 14:42:23,908][BaseTrainer][INFO] - [Epoch 1/10, Iter 187/3560] 3.544830799102783, cross_entropy: 3.544830799102783
+ [2025-05-05 14:42:24,176][BaseTrainer][INFO] - [Epoch 1/10, Iter 188/3560] 3.555997848510742, cross_entropy: 3.555997848510742
+ [2025-05-05 14:42:27,318][BaseTrainer][INFO] - [Epoch 1/10, Iter 189/3560] 2.7024357318878174, cross_entropy: 2.7024357318878174
+ [2025-05-05 14:42:27,586][BaseTrainer][INFO] - [Epoch 1/10, Iter 190/3560] 3.7619073390960693, cross_entropy: 3.7619073390960693
+ [2025-05-05 14:42:30,791][BaseTrainer][INFO] - [Epoch 1/10, Iter 191/3560] 3.934603452682495, cross_entropy: 3.934603452682495
+ [2025-05-05 14:42:31,059][BaseTrainer][INFO] - [Epoch 1/10, Iter 192/3560] 3.078949451446533, cross_entropy: 3.078949451446533
+ [2025-05-05 14:42:34,409][BaseTrainer][INFO] - [Epoch 1/10, Iter 193/3560] 3.9036507606506348, cross_entropy: 3.9036507606506348
+ [2025-05-05 14:42:34,679][BaseTrainer][INFO] - [Epoch 1/10, Iter 194/3560] 3.2395691871643066, cross_entropy: 3.2395691871643066
+ [2025-05-05 14:42:38,312][BaseTrainer][INFO] - [Epoch 1/10, Iter 195/3560] 3.0276219844818115, cross_entropy: 3.0276219844818115
+ [2025-05-05 14:42:38,590][BaseTrainer][INFO] - [Epoch 1/10, Iter 196/3560] 3.7110679149627686, cross_entropy: 3.7110679149627686
+ [2025-05-05 14:42:41,555][BaseTrainer][INFO] - [Epoch 1/10, Iter 197/3560] 3.9205877780914307, cross_entropy: 3.9205877780914307
+ [2025-05-05 14:42:41,824][BaseTrainer][INFO] - [Epoch 1/10, Iter 198/3560] 3.039644718170166, cross_entropy: 3.039644718170166
+ [2025-05-05 14:42:44,565][BaseTrainer][INFO] - [Epoch 1/10, Iter 199/3560] 3.5501019954681396, cross_entropy: 3.5501019954681396
+ [2025-05-05 14:42:44,841][BaseTrainer][INFO] - [Epoch 1/10, Iter 200/3560] 2.961530923843384, cross_entropy: 2.961530923843384
+ [2025-05-05 14:42:47,377][BaseTrainer][INFO] - [Epoch 1/10, Iter 201/3560] 3.4348673820495605, cross_entropy: 3.4348673820495605
+ [2025-05-05 14:42:47,646][BaseTrainer][INFO] - [Epoch 1/10, Iter 202/3560] 3.859506130218506, cross_entropy: 3.859506130218506
+ [2025-05-05 14:42:50,658][BaseTrainer][INFO] - [Epoch 1/10, Iter 203/3560] 3.7532031536102295, cross_entropy: 3.7532031536102295
+ [2025-05-05 14:42:50,930][BaseTrainer][INFO] - [Epoch 1/10, Iter 204/3560] 3.171391725540161, cross_entropy: 3.171391725540161
+ [2025-05-05 14:42:54,123][BaseTrainer][INFO] - [Epoch 1/10, Iter 205/3560] 2.8401284217834473, cross_entropy: 2.8401284217834473
+ [2025-05-05 14:42:54,397][BaseTrainer][INFO] - [Epoch 1/10, Iter 206/3560] 2.787358045578003, cross_entropy: 2.787358045578003
+ [2025-05-05 14:42:58,138][BaseTrainer][INFO] - [Epoch 1/10, Iter 207/3560] 4.295827865600586, cross_entropy: 4.295827865600586
+ [2025-05-05 14:42:58,405][BaseTrainer][INFO] - [Epoch 1/10, Iter 208/3560] 2.6732420921325684, cross_entropy: 2.6732420921325684
+ [2025-05-05 14:43:01,809][BaseTrainer][INFO] - [Epoch 1/10, Iter 209/3560] 2.6489176750183105, cross_entropy: 2.6489176750183105
+ [2025-05-05 14:43:02,078][BaseTrainer][INFO] - [Epoch 1/10, Iter 210/3560] 3.272085189819336, cross_entropy: 3.272085189819336
+ [2025-05-05 14:43:04,162][BaseTrainer][INFO] - [Epoch 1/10, Iter 211/3560] 4.393229961395264, cross_entropy: 4.393229961395264
+ [2025-05-05 14:43:04,439][BaseTrainer][INFO] - [Epoch 1/10, Iter 212/3560] 3.178574800491333, cross_entropy: 3.178574800491333
+ [2025-05-05 14:43:06,593][BaseTrainer][INFO] - [Epoch 1/10, Iter 213/3560] 3.4294769763946533, cross_entropy: 3.4294769763946533
+ [2025-05-05 14:43:06,870][BaseTrainer][INFO] - [Epoch 1/10, Iter 214/3560] 3.652698040008545, cross_entropy: 3.652698040008545
+ [2025-05-05 14:43:10,251][BaseTrainer][INFO] - [Epoch 1/10, Iter 215/3560] 2.9959189891815186, cross_entropy: 2.9959189891815186
+ [2025-05-05 14:43:10,519][BaseTrainer][INFO] - [Epoch 1/10, Iter 216/3560] 3.8080697059631348, cross_entropy: 3.8080697059631348
+ [2025-05-05 14:43:13,244][BaseTrainer][INFO] - [Epoch 1/10, Iter 217/3560] 3.7126407623291016, cross_entropy: 3.7126407623291016
+ [2025-05-05 14:43:13,609][BaseTrainer][INFO] - [Epoch 1/10, Iter 218/3560] 3.2892630100250244, cross_entropy: 3.2892630100250244
+ [2025-05-05 14:43:17,273][BaseTrainer][INFO] - [Epoch 1/10, Iter 219/3560] 3.822512149810791, cross_entropy: 3.822512149810791
+ [2025-05-05 14:43:17,715][BaseTrainer][INFO] - [Epoch 1/10, Iter 220/3560] 2.828888416290283, cross_entropy: 2.828888416290283
+ [2025-05-05 14:43:21,411][BaseTrainer][INFO] - [Epoch 1/10, Iter 221/3560] 3.719757556915283, cross_entropy: 3.719757556915283
+ [2025-05-05 14:43:21,914][BaseTrainer][INFO] - [Epoch 1/10, Iter 222/3560] 3.7287940979003906, cross_entropy: 3.7287940979003906
+ [2025-05-05 14:43:24,173][BaseTrainer][INFO] - [Epoch 1/10, Iter 223/3560] 3.2202911376953125, cross_entropy: 3.2202911376953125
+ [2025-05-05 14:43:24,448][BaseTrainer][INFO] - [Epoch 1/10, Iter 224/3560] 3.175903558731079, cross_entropy: 3.175903558731079
+ [2025-05-05 14:43:26,995][BaseTrainer][INFO] - [Epoch 1/10, Iter 225/3560] 3.807609796524048, cross_entropy: 3.807609796524048
+ [2025-05-05 14:43:27,314][BaseTrainer][INFO] - [Epoch 1/10, Iter 226/3560] 2.890665292739868, cross_entropy: 2.890665292739868
+ [2025-05-05 14:43:29,789][BaseTrainer][INFO] - [Epoch 1/10, Iter 227/3560] 3.0320630073547363, cross_entropy: 3.0320630073547363
+ [2025-05-05 14:43:30,070][BaseTrainer][INFO] - [Epoch 1/10, Iter 228/3560] 3.089958906173706, cross_entropy: 3.089958906173706
+ [2025-05-05 14:43:32,668][BaseTrainer][INFO] - [Epoch 1/10, Iter 229/3560] 2.730226755142212, cross_entropy: 2.730226755142212
+ [2025-05-05 14:43:32,935][BaseTrainer][INFO] - [Epoch 1/10, Iter 230/3560] 3.8954219818115234, cross_entropy: 3.8954219818115234
+ [2025-05-05 14:43:36,532][BaseTrainer][INFO] - [Epoch 1/10, Iter 231/3560] 2.5334842205047607, cross_entropy: 2.5334842205047607
+ [2025-05-05 14:43:36,808][BaseTrainer][INFO] - [Epoch 1/10, Iter 232/3560] 3.913090944290161, cross_entropy: 3.913090944290161
+ [2025-05-05 14:43:39,566][BaseTrainer][INFO] - [Epoch 1/10, Iter 233/3560] 2.8045315742492676, cross_entropy: 2.8045315742492676
+ [2025-05-05 14:43:39,842][BaseTrainer][INFO] - [Epoch 1/10, Iter 234/3560] 2.7281808853149414, cross_entropy: 2.7281808853149414
+ [2025-05-05 14:43:43,072][BaseTrainer][INFO] - [Epoch 1/10, Iter 235/3560] 3.4898900985717773, cross_entropy: 3.4898900985717773
+ [2025-05-05 14:43:43,347][BaseTrainer][INFO] - [Epoch 1/10, Iter 236/3560] 3.1868462562561035, cross_entropy: 3.1868462562561035
+ [2025-05-05 14:43:46,503][BaseTrainer][INFO] - [Epoch 1/10, Iter 237/3560] 3.310131072998047, cross_entropy: 3.310131072998047
+ [2025-05-05 14:43:46,771][BaseTrainer][INFO] - [Epoch 1/10, Iter 238/3560] 3.8356642723083496, cross_entropy: 3.8356642723083496
+ [2025-05-05 14:43:50,435][BaseTrainer][INFO] - [Epoch 1/10, Iter 239/3560] 2.8274199962615967, cross_entropy: 2.8274199962615967
+ [2025-05-05 14:43:50,705][BaseTrainer][INFO] - [Epoch 1/10, Iter 240/3560] 3.5835189819335938, cross_entropy: 3.5835189819335938
+ [2025-05-05 14:43:54,136][BaseTrainer][INFO] - [Epoch 1/10, Iter 241/3560] 3.222989797592163, cross_entropy: 3.222989797592163
+ [2025-05-05 14:43:54,404][BaseTrainer][INFO] - [Epoch 1/10, Iter 242/3560] 3.7342541217803955, cross_entropy: 3.7342541217803955
+ [2025-05-05 14:43:57,127][BaseTrainer][INFO] - [Epoch 1/10, Iter 243/3560] 3.485039710998535, cross_entropy: 3.485039710998535
+ [2025-05-05 14:43:57,395][BaseTrainer][INFO] - [Epoch 1/10, Iter 244/3560] 3.1860527992248535, cross_entropy: 3.1860527992248535
+ [2025-05-05 14:44:00,127][BaseTrainer][INFO] - [Epoch 1/10, Iter 245/3560] 3.391864538192749, cross_entropy: 3.391864538192749
+ [2025-05-05 14:44:00,395][BaseTrainer][INFO] - [Epoch 1/10, Iter 246/3560] 3.5632526874542236, cross_entropy: 3.5632526874542236
+ [2025-05-05 14:44:04,422][BaseTrainer][INFO] - [Epoch 1/10, Iter 247/3560] 3.9243483543395996, cross_entropy: 3.9243483543395996
+ [2025-05-05 14:44:04,690][BaseTrainer][INFO] - [Epoch 1/10, Iter 248/3560] 3.1842687129974365, cross_entropy: 3.1842687129974365
+ [2025-05-05 14:44:08,383][BaseTrainer][INFO] - [Epoch 1/10, Iter 249/3560] 3.0103933811187744, cross_entropy: 3.0103933811187744
+ [2025-05-05 14:44:08,658][BaseTrainer][INFO] - [Epoch 1/10, Iter 250/3560] 2.7349205017089844, cross_entropy: 2.7349205017089844
+ [2025-05-05 14:44:11,733][BaseTrainer][INFO] - [Epoch 1/10, Iter 251/3560] 2.6002695560455322, cross_entropy: 2.6002695560455322
+ [2025-05-05 14:44:12,010][BaseTrainer][INFO] - [Epoch 1/10, Iter 252/3560] 2.744290828704834, cross_entropy: 2.744290828704834
+ [2025-05-05 14:44:14,892][BaseTrainer][INFO] - [Epoch 1/10, Iter 253/3560] 2.8992106914520264, cross_entropy: 2.8992106914520264
+ [2025-05-05 14:44:15,172][BaseTrainer][INFO] - [Epoch 1/10, Iter 254/3560] 2.458922863006592, cross_entropy: 2.458922863006592
+ [2025-05-05 14:44:18,520][BaseTrainer][INFO] - [Epoch 1/10, Iter 255/3560] 3.0419998168945312, cross_entropy: 3.0419998168945312
+ [2025-05-05 14:44:18,800][BaseTrainer][INFO] - [Epoch 1/10, Iter 256/3560] 3.9023687839508057, cross_entropy: 3.9023687839508057
+ [2025-05-05 14:44:21,700][BaseTrainer][INFO] - [Epoch 1/10, Iter 257/3560] 3.862330436706543, cross_entropy: 3.862330436706543
+ [2025-05-05 14:44:21,973][BaseTrainer][INFO] - [Epoch 1/10, Iter 258/3560] 2.548741340637207, cross_entropy: 2.548741340637207
+ [2025-05-05 14:44:25,519][BaseTrainer][INFO] - [Epoch 1/10, Iter 259/3560] 4.23280668258667, cross_entropy: 4.23280668258667
+ [2025-05-05 14:44:25,788][BaseTrainer][INFO] - [Epoch 1/10, Iter 260/3560] 4.037736415863037, cross_entropy: 4.037736415863037
+ [2025-05-05 14:44:28,346][BaseTrainer][INFO] - [Epoch 1/10, Iter 261/3560] 2.491283893585205, cross_entropy: 2.491283893585205
+ [2025-05-05 14:44:28,630][BaseTrainer][INFO] - [Epoch 1/10, Iter 262/3560] 3.5204434394836426, cross_entropy: 3.5204434394836426
+ [2025-05-05 14:44:32,197][BaseTrainer][INFO] - [Epoch 1/10, Iter 263/3560] 4.078241348266602, cross_entropy: 4.078241348266602
+ [2025-05-05 14:44:32,464][BaseTrainer][INFO] - [Epoch 1/10, Iter 264/3560] 2.7961766719818115, cross_entropy: 2.7961766719818115
+ [2025-05-05 14:44:36,105][BaseTrainer][INFO] - [Epoch 1/10, Iter 265/3560] 3.5801329612731934, cross_entropy: 3.5801329612731934
+ [2025-05-05 14:44:36,380][BaseTrainer][INFO] - [Epoch 1/10, Iter 266/3560] 3.2456274032592773, cross_entropy: 3.2456274032592773
+ [2025-05-05 14:44:38,908][BaseTrainer][INFO] - [Epoch 1/10, Iter 267/3560] 4.04139518737793, cross_entropy: 4.04139518737793
+ [2025-05-05 14:44:39,175][BaseTrainer][INFO] - [Epoch 1/10, Iter 268/3560] 3.198580741882324, cross_entropy: 3.198580741882324
+ [2025-05-05 14:44:42,010][BaseTrainer][INFO] - [Epoch 1/10, Iter 269/3560] 3.9409687519073486, cross_entropy: 3.9409687519073486
+ [2025-05-05 14:44:42,278][BaseTrainer][INFO] - [Epoch 1/10, Iter 270/3560] 3.631314754486084, cross_entropy: 3.631314754486084
+ [2025-05-05 14:44:45,053][BaseTrainer][INFO] - [Epoch 1/10, Iter 271/3560] 3.3704605102539062, cross_entropy: 3.3704605102539062
+ [2025-05-05 14:44:45,334][BaseTrainer][INFO] - [Epoch 1/10, Iter 272/3560] 2.9592432975769043, cross_entropy: 2.9592432975769043
+ [2025-05-05 14:44:48,864][BaseTrainer][INFO] - [Epoch 1/10, Iter 273/3560] 2.7452077865600586, cross_entropy: 2.7452077865600586
+ [2025-05-05 14:44:49,133][BaseTrainer][INFO] - [Epoch 1/10, Iter 274/3560] 3.583657741546631, cross_entropy: 3.583657741546631
+ [2025-05-05 14:44:51,539][BaseTrainer][INFO] - [Epoch 1/10, Iter 275/3560] 2.5386781692504883, cross_entropy: 2.5386781692504883
+ [2025-05-05 14:44:51,809][BaseTrainer][INFO] - [Epoch 1/10, Iter 276/3560] 3.104559898376465, cross_entropy: 3.104559898376465
+ [2025-05-05 14:44:55,288][BaseTrainer][INFO] - [Epoch 1/10, Iter 277/3560] 3.6079154014587402, cross_entropy: 3.6079154014587402
+ [2025-05-05 14:44:55,558][BaseTrainer][INFO] - [Epoch 1/10, Iter 278/3560] 2.549783706665039, cross_entropy: 2.549783706665039
+ [2025-05-05 14:44:58,211][BaseTrainer][INFO] - [Epoch 1/10, Iter 279/3560] 4.0889482498168945, cross_entropy: 4.0889482498168945
+ [2025-05-05 14:44:58,479][BaseTrainer][INFO] - [Epoch 1/10, Iter 280/3560] 3.355772018432617, cross_entropy: 3.355772018432617
+ [2025-05-05 14:45:01,350][BaseTrainer][INFO] - [Epoch 1/10, Iter 281/3560] 2.6926355361938477, cross_entropy: 2.6926355361938477
+ [2025-05-05 14:45:01,631][BaseTrainer][INFO] - [Epoch 1/10, Iter 282/3560] 4.046462059020996, cross_entropy: 4.046462059020996
+ [2025-05-05 14:45:04,434][BaseTrainer][INFO] - [Epoch 1/10, Iter 283/3560] 3.140268325805664, cross_entropy: 3.140268325805664
+ [2025-05-05 14:45:04,706][BaseTrainer][INFO] - [Epoch 1/10, Iter 284/3560] 3.676936626434326, cross_entropy: 3.676936626434326
+ [2025-05-05 14:45:07,416][BaseTrainer][INFO] - [Epoch 1/10, Iter 285/3560] 2.5142154693603516, cross_entropy: 2.5142154693603516
+ [2025-05-05 14:45:07,699][BaseTrainer][INFO] - [Epoch 1/10, Iter 286/3560] 4.099308967590332, cross_entropy: 4.099308967590332
+ [2025-05-05 14:45:10,717][BaseTrainer][INFO] - [Epoch 1/10, Iter 287/3560] 3.406856060028076, cross_entropy: 3.406856060028076
+ [2025-05-05 14:45:10,998][BaseTrainer][INFO] - [Epoch 1/10, Iter 288/3560] 3.9146933555603027, cross_entropy: 3.9146933555603027
+ [2025-05-05 14:45:13,877][BaseTrainer][INFO] - [Epoch 1/10, Iter 289/3560] 3.004814386367798, cross_entropy: 3.004814386367798
+ [2025-05-05 14:45:14,154][BaseTrainer][INFO] - [Epoch 1/10, Iter 290/3560] 3.9426722526550293, cross_entropy: 3.9426722526550293
+ [2025-05-05 14:45:16,781][BaseTrainer][INFO] - [Epoch 1/10, Iter 291/3560] 2.401880979537964, cross_entropy: 2.401880979537964
+ [2025-05-05 14:45:17,056][BaseTrainer][INFO] - [Epoch 1/10, Iter 292/3560] 2.6597084999084473, cross_entropy: 2.6597084999084473
+ [2025-05-05 14:45:19,908][BaseTrainer][INFO] - [Epoch 1/10, Iter 293/3560] 3.556488513946533, cross_entropy: 3.556488513946533
+ [2025-05-05 14:45:20,188][BaseTrainer][INFO] - [Epoch 1/10, Iter 294/3560] 3.065614938735962, cross_entropy: 3.065614938735962
+ [2025-05-05 14:45:23,449][BaseTrainer][INFO] - [Epoch 1/10, Iter 295/3560] 3.2297940254211426, cross_entropy: 3.2297940254211426
+ [2025-05-05 14:45:23,727][BaseTrainer][INFO] - [Epoch 1/10, Iter 296/3560] 3.6723713874816895, cross_entropy: 3.6723713874816895
+ [2025-05-05 14:45:26,683][BaseTrainer][INFO] - [Epoch 1/10, Iter 297/3560] 2.7709693908691406, cross_entropy: 2.7709693908691406
+ [2025-05-05 14:45:26,952][BaseTrainer][INFO] - [Epoch 1/10, Iter 298/3560] 3.6901214122772217, cross_entropy: 3.6901214122772217
+ [2025-05-05 14:45:30,018][BaseTrainer][INFO] - [Epoch 1/10, Iter 299/3560] 3.0277936458587646, cross_entropy: 3.0277936458587646
+ [2025-05-05 14:45:30,287][BaseTrainer][INFO] - [Epoch 1/10, Iter 300/3560] 3.6409151554107666, cross_entropy: 3.6409151554107666
+ [2025-05-05 14:45:34,000][BaseTrainer][INFO] - [Epoch 1/10, Iter 301/3560] 4.046877861022949, cross_entropy: 4.046877861022949
+ [2025-05-05 14:45:34,268][BaseTrainer][INFO] - [Epoch 1/10, Iter 302/3560] 3.3299200534820557, cross_entropy: 3.3299200534820557
+ [2025-05-05 14:45:37,304][BaseTrainer][INFO] - [Epoch 1/10, Iter 303/3560] 4.047910213470459, cross_entropy: 4.047910213470459
+ [2025-05-05 14:45:37,582][BaseTrainer][INFO] - [Epoch 1/10, Iter 304/3560] 2.930318593978882, cross_entropy: 2.930318593978882
+ [2025-05-05 14:45:40,780][BaseTrainer][INFO] - [Epoch 1/10, Iter 305/3560] 2.7713170051574707, cross_entropy: 2.7713170051574707
+ [2025-05-05 14:45:41,056][BaseTrainer][INFO] - [Epoch 1/10, Iter 306/3560] 2.9806249141693115, cross_entropy: 2.9806249141693115
+ [2025-05-05 14:45:43,487][BaseTrainer][INFO] - [Epoch 1/10, Iter 307/3560] 4.055927753448486, cross_entropy: 4.055927753448486
+ [2025-05-05 14:45:43,760][BaseTrainer][INFO] - [Epoch 1/10, Iter 308/3560] 3.0572047233581543, cross_entropy: 3.0572047233581543
+ [2025-05-05 14:45:46,626][BaseTrainer][INFO] - [Epoch 1/10, Iter 309/3560] 3.593059778213501, cross_entropy: 3.593059778213501
+ [2025-05-05 14:45:46,899][BaseTrainer][INFO] - [Epoch 1/10, Iter 310/3560] 3.6059350967407227, cross_entropy: 3.6059350967407227
+ [2025-05-05 14:45:49,774][BaseTrainer][INFO] - [Epoch 1/10, Iter 311/3560] 2.608541965484619, cross_entropy: 2.608541965484619
+ [2025-05-05 14:45:50,052][BaseTrainer][INFO] - [Epoch 1/10, Iter 312/3560] 2.8141307830810547, cross_entropy: 2.8141307830810547
+ [2025-05-05 14:45:53,379][BaseTrainer][INFO] - [Epoch 1/10, Iter 313/3560] 2.9257259368896484, cross_entropy: 2.9257259368896484
+ [2025-05-05 14:45:53,651][BaseTrainer][INFO] - [Epoch 1/10, Iter 314/3560] 3.4982352256774902, cross_entropy: 3.4982352256774902
+ [2025-05-05 14:45:57,154][BaseTrainer][INFO] - [Epoch 1/10, Iter 315/3560] 2.9640259742736816, cross_entropy: 2.9640259742736816
+ [2025-05-05 14:45:57,427][BaseTrainer][INFO] - [Epoch 1/10, Iter 316/3560] 3.914764881134033, cross_entropy: 3.914764881134033
+ [2025-05-05 14:46:00,707][BaseTrainer][INFO] - [Epoch 1/10, Iter 317/3560] 2.9601752758026123, cross_entropy: 2.9601752758026123
+ [2025-05-05 14:46:00,989][BaseTrainer][INFO] - [Epoch 1/10, Iter 318/3560] 3.2947206497192383, cross_entropy: 3.2947206497192383
+ [2025-05-05 14:46:04,240][BaseTrainer][INFO] - [Epoch 1/10, Iter 319/3560] 3.5406155586242676, cross_entropy: 3.5406155586242676
+ [2025-05-05 14:46:04,514][BaseTrainer][INFO] - [Epoch 1/10, Iter 320/3560] 2.4789462089538574, cross_entropy: 2.4789462089538574
+ [2025-05-05 14:46:07,603][BaseTrainer][INFO] - [Epoch 1/10, Iter 321/3560] 2.6648499965667725, cross_entropy: 2.6648499965667725
+ [2025-05-05 14:46:07,881][BaseTrainer][INFO] - [Epoch 1/10, Iter 322/3560] 3.446200370788574, cross_entropy: 3.446200370788574
+ [2025-05-05 14:46:11,019][BaseTrainer][INFO] - [Epoch 1/10, Iter 323/3560] 3.7917072772979736, cross_entropy: 3.7917072772979736
+ [2025-05-05 14:46:11,287][BaseTrainer][INFO] - [Epoch 1/10, Iter 324/3560] 2.6264336109161377, cross_entropy: 2.6264336109161377
+ [2025-05-05 14:46:14,487][BaseTrainer][INFO] - [Epoch 1/10, Iter 325/3560] 2.577727794647217, cross_entropy: 2.577727794647217
+ [2025-05-05 14:46:14,756][BaseTrainer][INFO] - [Epoch 1/10, Iter 326/3560] 3.6849749088287354, cross_entropy: 3.6849749088287354
+ [2025-05-05 14:46:17,755][BaseTrainer][INFO] - [Epoch 1/10, Iter 327/3560] 3.766862392425537, cross_entropy: 3.766862392425537
+ [2025-05-05 14:46:18,034][BaseTrainer][INFO] - [Epoch 1/10, Iter 328/3560] 2.486441135406494, cross_entropy: 2.486441135406494
+ [2025-05-05 14:46:20,874][BaseTrainer][INFO] - [Epoch 1/10, Iter 329/3560] 3.5917320251464844, cross_entropy: 3.5917320251464844
+ [2025-05-05 14:46:21,159][BaseTrainer][INFO] - [Epoch 1/10, Iter 330/3560] 2.9070332050323486, cross_entropy: 2.9070332050323486
+ [2025-05-05 14:46:24,176][BaseTrainer][INFO] - [Epoch 1/10, Iter 331/3560] 3.270543098449707, cross_entropy: 3.270543098449707
+ [2025-05-05 14:46:24,448][BaseTrainer][INFO] - [Epoch 1/10, Iter 332/3560] 2.8748619556427, cross_entropy: 2.8748619556427
+ [2025-05-05 14:46:27,735][BaseTrainer][INFO] - [Epoch 1/10, Iter 333/3560] 2.606078863143921, cross_entropy: 2.606078863143921
+ [2025-05-05 14:46:28,013][BaseTrainer][INFO] - [Epoch 1/10, Iter 334/3560] 3.892674207687378, cross_entropy: 3.892674207687378
+ [2025-05-05 14:46:31,359][BaseTrainer][INFO] - [Epoch 1/10, Iter 335/3560] 2.9187989234924316, cross_entropy: 2.9187989234924316
+ [2025-05-05 14:46:31,629][BaseTrainer][INFO] - [Epoch 1/10, Iter 336/3560] 3.758178234100342, cross_entropy: 3.758178234100342
+ [2025-05-05 14:46:34,423][BaseTrainer][INFO] - [Epoch 1/10, Iter 337/3560] 3.5721333026885986, cross_entropy: 3.5721333026885986
+ [2025-05-05 14:46:34,704][BaseTrainer][INFO] - [Epoch 1/10, Iter 338/3560] 2.4708504676818848, cross_entropy: 2.4708504676818848
+ [2025-05-05 14:46:37,452][BaseTrainer][INFO] - [Epoch 1/10, Iter 339/3560] 2.6186976432800293, cross_entropy: 2.6186976432800293
+ [2025-05-05 14:46:37,720][BaseTrainer][INFO] - [Epoch 1/10, Iter 340/3560] 3.8197436332702637, cross_entropy: 3.8197436332702637
+ [2025-05-05 14:46:41,073][BaseTrainer][INFO] - [Epoch 1/10, Iter 341/3560] 2.6677775382995605, cross_entropy: 2.6677775382995605
+ [2025-05-05 14:46:41,340][BaseTrainer][INFO] - [Epoch 1/10, Iter 342/3560] 3.7246603965759277, cross_entropy: 3.7246603965759277
+ [2025-05-05 14:46:45,442][BaseTrainer][INFO] - [Epoch 1/10, Iter 343/3560] 3.1751248836517334, cross_entropy: 3.1751248836517334
+ [2025-05-05 14:46:45,717][BaseTrainer][INFO] - [Epoch 1/10, Iter 344/3560] 3.7615671157836914, cross_entropy: 3.7615671157836914
+ [2025-05-05 14:46:47,829][BaseTrainer][INFO] - [Epoch 1/10, Iter 345/3560] 3.357501983642578, cross_entropy: 3.357501983642578
+ [2025-05-05 14:46:48,106][BaseTrainer][INFO] - [Epoch 1/10, Iter 346/3560] 2.682670831680298, cross_entropy: 2.682670831680298
+ [2025-05-05 14:46:50,699][BaseTrainer][INFO] - [Epoch 1/10, Iter 347/3560] 3.1317172050476074, cross_entropy: 3.1317172050476074
+ [2025-05-05 14:46:50,969][BaseTrainer][INFO] - [Epoch 1/10, Iter 348/3560] 3.1673238277435303, cross_entropy: 3.1673238277435303
+ [2025-05-05 14:46:54,317][BaseTrainer][INFO] - [Epoch 1/10, Iter 349/3560] 2.39094614982605, cross_entropy: 2.39094614982605
+ [2025-05-05 14:46:54,606][BaseTrainer][INFO] - [Epoch 1/10, Iter 350/3560] 3.5036511421203613, cross_entropy: 3.5036511421203613
+ [2025-05-05 14:46:57,108][BaseTrainer][INFO] - [Epoch 1/10, Iter 351/3560] 2.332580804824829, cross_entropy: 2.332580804824829
+ [2025-05-05 14:46:57,378][BaseTrainer][INFO] - [Epoch 1/10, Iter 352/3560] 2.358558177947998, cross_entropy: 2.358558177947998
+ [2025-05-05 14:46:59,723][BaseTrainer][INFO] - [Epoch 1/10, Iter 353/3560] 2.5316014289855957, cross_entropy: 2.5316014289855957
+ [2025-05-05 14:47:00,001][BaseTrainer][INFO] - [Epoch 1/10, Iter 354/3560] 4.0149126052856445, cross_entropy: 4.0149126052856445
+ [2025-05-05 14:47:01,953][BaseTrainer][INFO] - [Epoch 1/10, Iter 355/3560] 3.642197608947754, cross_entropy: 3.642197608947754
+ [2025-05-05 14:47:04,897][BaseTrainer][INFO] - [Epoch 1/10, Iter 356/3560] 3.413801670074463, cross_entropy: 3.413801670074463
+ [2025-05-05 14:49:59,861][BaseTrainer][INFO] - [Epoch 1/10] (train) 3.6912786960601807, cross_entropy: 3.6912786960601807
+ [2025-05-05 14:49:59,861][BaseTrainer][INFO] - [Epoch 1/10] (validation) 2.487116575241089, cross_entropy: 2.487116575241089
+ [2025-05-05 14:49:59,862][BaseTrainer][INFO] - [Epoch 1/10] (metrics) roc_auc: 0.9239352345466614
+ [2025-05-05 14:50:00,032][BaseTrainer][INFO] - Save model: ./exp/20250505-143631/model/best_epoch.pth.
+ [2025-05-05 14:50:00,197][BaseTrainer][INFO] - Save model: ./exp/20250505-143631/model/last.pth.
+ [2025-05-05 14:50:03,548][BaseTrainer][INFO] - [Epoch 2/10, Iter 357/3560] 3.8065967559814453, cross_entropy: 3.8065967559814453
+ [2025-05-05 14:50:03,879][BaseTrainer][INFO] - [Epoch 2/10, Iter 358/3560] 2.3291449546813965, cross_entropy: 2.3291449546813965