tky823 committed on
Commit
d1e2518
·
verified ·
1 Parent(s): 45d7ca5

Upload folder using huggingface_hub

recipes/BirdCLEF2025/EfficientNetB0/exp/20250505-094415/log/20250505-094417/.hydra/config.yaml ADDED
@@ -0,0 +1,228 @@
+ system:
+   seed: 0
+   distributed:
+     enable: null
+     nodes: null
+     nproc_per_node: null
+     backend: null
+     init_method: null
+     rdzv_id: null
+     rdzv_backend: null
+     rdzv_endpoint: null
+     max_restarts: null
+   cudnn:
+     benchmark: true
+     deterministic: false
+   amp:
+     enable: false
+     dtype: null
+   accelerator: cuda
+   compile:
+     enable: null
+     kwargs: null
+ preprocess:
+   dump_format: birdclef2025
+   list_path: null
+   wav_dir: null
+   feature_dir: null
+   max_workers: null
+   max_shard_size: 1000000000
+   vad:
+     raw_root: null
+     trimmed_root: null
+     threshold: null
+     min_duration: 15
+   csv_path: ???
+   submission_path: ???
+   audio_root: ???
+   subset: ???
+   train_ratio: 0.8
+ data:
+   audio:
+     sample_rate: 32000
+     duration: 15
+   melspectrogram:
+     _target_: birdclef2025.transforms.birdclef.BirdCLEF2025BaselineMelSpectrogram
+     sample_rate: ${..audio.sample_rate}
+     hop_length: 1253
+     f_min: 20
+     f_max: 16000
+     pad: 0
+     n_mels: 128
+     window_fn:
+       _target_: torch.hann_window
+       _partial_: true
+     power: 1.0
+     normalized: false
+     wkwargs: null
+     center: true
+     pad_mode: constant
+     onesided: null
+     norm: slaney
+     mel_scale: slaney
+     take_log: true
+     freq_mask_param:
+     - 0.06
+     - 0.1
+     time_mask_param:
+     - 0.06
+     - 0.12
+     eps: null
+ train:
+   dataset:
+     train:
+       _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
+       list_path: dump/birdclef2025_15s/list/train.txt
+       feature_dir: /kaggle/input/birdclef-2025
+       audio_key: audio
+       sample_rate_key: sample_rate
+       label_name_key: primary_label
+       filename_key: filename
+     validation:
+       _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
+       list_path: dump/birdclef2025_15s/list/validation.txt
+       feature_dir: /kaggle/input/birdclef-2025
+       audio_key: ${..train.audio_key}
+       sample_rate_key: ${..train.sample_rate_key}
+       label_name_key: ${..train.label_name_key}
+       filename_key: ${..train.filename_key}
+   dataloader:
+     train:
+       _target_: torch.utils.data.DataLoader
+       batch_size: 64
+       shuffle: true
+       collate_fn:
+         _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineValidationCollator
+         composer:
+           _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelComposer
+           melspectrogram_transform: ${data.melspectrogram}
+           audio_key: audio
+           sample_rate_key: sample_rate
+           label_name_key: primary_label
+           filename_key: filename
+           waveform_key: waveform
+           melspectrogram_key: log_melspectrogram
+           label_index_key: label_index
+           sample_rate: ${data.audio.sample_rate}
+           duration: ${data.audio.duration}
+           decode_audio_as_waveform: true
+           decode_audio_as_monoral: true
+           training: false
+         melspectrogram_key: ${.composer.melspectrogram_key}
+         label_index_key: ${.composer.label_index_key}
+       num_workers: ${const:birdclef2025.utils.data.default_num_workers}
+     validation:
+       _target_: torch.utils.data.DataLoader
+       batch_size: 64
+       shuffle: false
+       collate_fn:
+         _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineValidationCollator
+         composer:
+           _target_: ${....train.collate_fn.composer._target_}
+           melspectrogram_transform: ${....train.collate_fn.composer.melspectrogram_transform}
+           audio_key: ${....train.collate_fn.composer.audio_key}
+           sample_rate_key: ${....train.collate_fn.composer.sample_rate_key}
+           label_name_key: ${....train.collate_fn.composer.label_name_key}
+           filename_key: ${....train.collate_fn.composer.filename_key}
+           waveform_key: ${....train.collate_fn.composer.waveform_key}
+           melspectrogram_key: ${....train.collate_fn.composer.melspectrogram_key}
+           label_index_key: ${....train.collate_fn.composer.label_index_key}
+           sample_rate: ${....train.collate_fn.composer.sample_rate}
+           duration: ${....train.collate_fn.composer.duration}
+           decode_audio_as_waveform: ${....train.collate_fn.composer.decode_audio_as_waveform}
+           decode_audio_as_monoral: ${....train.collate_fn.composer.decode_audio_as_monoral}
+           training: false
+         melspectrogram_key: ${...train.collate_fn.composer.melspectrogram_key}
+         label_index_key: ${...train.collate_fn.composer.label_index_key}
+       num_workers: ${const:birdclef2025.utils.data.default_num_workers}
+   clip_gradient: {}
+   record: {}
+   trainer:
+     _target_: birdclef2025.utils.driver.BaseTrainer
+   key_mapping:
+     train:
+       input:
+         input: ${....dataloader.train.collate_fn.composer.melspectrogram_key}
+       output: logit
+     validation: ${.train}
+     inference: ${.validation}
+   ddp_kwargs: null
+   resume:
+     continue_from: ''
+   output:
+     exp_dir: ./exp/20250505-094415
+     tensorboard_dir: ./tensorboard/20250505-094415
+     save_checkpoint:
+       iteration:
+         every: 10000
+         path: ${...exp_dir}/model/iteration{iteration}.pth
+       epoch:
+         every: 10
+         path: ${...exp_dir}/model/epoch{epoch}.pth
+       last:
+         path: ${...exp_dir}/model/last.pth
+       best_epoch:
+         path: ${...exp_dir}/model/best_epoch.pth
+   steps:
+     epochs: 10
+     iterations: null
+     lr_scheduler: epoch
+ test:
+   dataset:
+     test:
+       _target_: torch.utils.data.Dataset
+   dataloader:
+     test:
+       _target_: torch.utils.data.DataLoader
+       batch_size: 1
+       shuffle: false
+   key_mapping:
+     inference:
+       input: null
+       output: null
+       identifier: null
+   checkpoint: null
+   remove_weight_norm: null
+   output:
+     exp_dir: ./exp
+     inference_dir: ${.exp_dir}/inference
+     audio:
+       sample_rate: ${data.audio.sample_rate}
+       key_mapping:
+         inference:
+           output: null
+           reference: null
+       transforms:
+         inference:
+           output: null
+           reference: null
+ model:
+   _target_: birdclef2025.models.EfficientNetB0
+   weights: ${const:torchvision.models.EfficientNet_B0_Weights.IMAGENET1K_V1}
+   num_classes: ${const:birdclef2025.utils.data.birdclef.num_birdclef2025_primary_labels}
+ optimizer:
+   _target_: torch.optim.Adam
+ lr_scheduler: {}
+ criterion:
+   _target_: audyn.criterion.MultiCriteria
+   cross_entropy:
+     _target_: audyn.criterion.BaseCriterionWrapper
+     criterion:
+       _target_: torch.nn.CrossEntropyLoss
+       reduction: mean
+     weight: 1
+     key_mapping:
+       estimated:
+         input: logit
+       target:
+         target: ${train.dataloader.train.collate_fn.composer.label_index_key}
+ metrics:
+   roc_auc:
+     metric:
+       _target_: birdclef2025.metrics.ROCAUC
+       take_softmax: true
+     key_mapping:
+       estimated:
+         input: logit
+       target:
+         target: ${train.dataloader.train.collate_fn.composer.label_index_key}
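Note on the `data.audio` and `data.melspectrogram` values in this config: the hop length of 1253 looks chosen so that a 15 s clip at 32 kHz maps to a fixed number of time frames. The sketch below works through that arithmetic. It assumes the transform frames audio the way a centered STFT does (`center: true`, one frame per hop plus one); the helper name `num_centered_frames` is hypothetical and is not part of the repository.

```python
# Values copied from data.audio and data.melspectrogram in config.yaml.
SAMPLE_RATE = 32000
DURATION = 15  # seconds
HOP_LENGTH = 1253
N_MELS = 128


def num_centered_frames(num_samples: int, hop_length: int) -> int:
    """Frame count of a centered STFT (assumption: center=True padding)."""
    return 1 + num_samples // hop_length


num_samples = SAMPLE_RATE * DURATION
num_frames = num_centered_frames(num_samples, HOP_LENGTH)

# Expected spectrogram shape per clip: (mel bins, time frames).
print((N_MELS, num_frames))  # (128, 384)
```

Under these assumptions each clip becomes a 128 x 384 log-mel image, which is the input the `log_melspectrogram` key would carry into the model.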
recipes/BirdCLEF2025/EfficientNetB0/exp/20250505-094415/log/20250505-094417/.hydra/hydra.yaml ADDED
@@ -0,0 +1,191 @@
+ hydra:
+   run:
+     dir: ./exp/20250505-094415/log/20250505-094417
+   sweep:
+     dir: multirun/${now:%Y-%m-%d}/${now:%H-%M-%S}
+     subdir: ${hydra.job.num}
+   launcher:
+     _target_: hydra._internal.core_plugins.basic_launcher.BasicLauncher
+   sweeper:
+     _target_: hydra._internal.core_plugins.basic_sweeper.BasicSweeper
+     max_batch_size: null
+     params: null
+   help:
+     app_name: ${hydra.job.name}
+     header: '${hydra.help.app_name} is powered by Hydra.
+
+       '
+     footer: 'Powered by Hydra (https://hydra.cc)
+
+       Use --hydra-help to view Hydra specific help
+
+       '
+     template: '${hydra.help.header}
+
+       == Configuration groups ==
+
+       Compose your configuration from those groups (group=option)
+
+
+       $APP_CONFIG_GROUPS
+
+
+       == Config ==
+
+       Override anything in the config (foo.bar=value)
+
+
+       $CONFIG
+
+
+       ${hydra.help.footer}
+
+       '
+   hydra_help:
+     template: 'Hydra (${hydra.runtime.version})
+
+       See https://hydra.cc for more info.
+
+
+       == Flags ==
+
+       $FLAGS_HELP
+
+
+       == Configuration groups ==
+
+       Compose your configuration from those groups (For example, append hydra/job_logging=disabled
+       to command line)
+
+
+       $HYDRA_CONFIG_GROUPS
+
+
+       Use ''--cfg hydra'' to Show the Hydra config.
+
+       '
+     hydra_help: ???
+   hydra_logging:
+     version: 1
+     formatters:
+       simple:
+         format: '[%(asctime)s][HYDRA] %(message)s'
+     handlers:
+       console:
+         class: logging.StreamHandler
+         formatter: simple
+         stream: ext://sys.stdout
+     root:
+       level: INFO
+       handlers:
+       - console
+     loggers:
+       logging_example:
+         level: DEBUG
+     disable_existing_loggers: false
+   job_logging:
+     version: 1
+     formatters:
+       simple:
+         format: '[%(asctime)s][%(name)s][%(levelname)s] - %(message)s'
+     handlers:
+       console:
+         class: logging.StreamHandler
+         formatter: simple
+         stream: ext://sys.stdout
+       file:
+         class: logging.FileHandler
+         formatter: simple
+         filename: ${hydra.runtime.output_dir}/${hydra.job.name}.log
+     root:
+       level: INFO
+       handlers:
+       - console
+       - file
+     disable_existing_loggers: false
+   env: {}
+   mode: RUN
+   searchpath: []
+   callbacks: {}
+   output_subdir: .hydra
+   overrides:
+     hydra:
+     - hydra.run.dir=./exp/20250505-094415/log/20250505-094417
+     - hydra.mode=RUN
+     task:
+     - system=cuda
+     - preprocess=birdclef2025
+     - data=birdclef2025_15s
+     - train=birdclef2025_noaug_efficientnet_b0
+     - model=birdclef2025_efficientnet_b0
+     - optimizer=adam
+     - lr_scheduler=none
+     - criterion=birdclef2025_categorical_cross_entropy
+     - +metrics=birdclef2025_categorical_cross_entropy
+     - preprocess.dump_format=birdclef2025
+     - train.dataset.train.list_path=dump/birdclef2025_15s/list/train.txt
+     - train.dataset.train.feature_dir=/kaggle/input/birdclef-2025
+     - train.dataset.validation.list_path=dump/birdclef2025_15s/list/validation.txt
+     - train.dataset.validation.feature_dir=/kaggle/input/birdclef-2025
+     - train.resume.continue_from=
+     - train.output.exp_dir=./exp/20250505-094415
+     - train.output.tensorboard_dir=./tensorboard/20250505-094415
+   job:
+     name: train
+     chdir: false
+     override_dirname: +metrics=birdclef2025_categorical_cross_entropy,criterion=birdclef2025_categorical_cross_entropy,data=birdclef2025_15s,lr_scheduler=none,model=birdclef2025_efficientnet_b0,optimizer=adam,preprocess.dump_format=birdclef2025,preprocess=birdclef2025,system=cuda,train.dataset.train.feature_dir=/kaggle/input/birdclef-2025,train.dataset.train.list_path=dump/birdclef2025_15s/list/train.txt,train.dataset.validation.feature_dir=/kaggle/input/birdclef-2025,train.dataset.validation.list_path=dump/birdclef2025_15s/list/validation.txt,train.output.exp_dir=./exp/20250505-094415,train.output.tensorboard_dir=./tensorboard/20250505-094415,train.resume.continue_from=,train=birdclef2025_noaug_efficientnet_b0
+     id: ???
+     num: ???
+     config_name: config
+     env_set: {}
+     env_copy: []
+     config:
+       override_dirname:
+         kv_sep: '='
+         item_sep: ','
+         exclude_keys: []
+   runtime:
+     version: 1.3.2
+     version_base: '1.2'
+     cwd: /kaggle/working/BirdCLEF2025/recipes/BirdCLEF2025/EfficientNetB0
+     config_sources:
+     - path: hydra.conf
+       schema: pkg
+       provider: hydra
+     - path: /usr/local/lib/python3.10/dist-packages/audyn/configs
+       schema: file
+       provider: main
+     - path: /kaggle/working/BirdCLEF2025/recipes/BirdCLEF2025/EfficientNetB0/conf
+       schema: file
+       provider: command-line
+     - path: ''
+       schema: structured
+       provider: schema
+     output_dir: /kaggle/working/BirdCLEF2025/recipes/BirdCLEF2025/EfficientNetB0/exp/20250505-094415/log/20250505-094417
+     choices:
+       metrics: birdclef2025_categorical_cross_entropy
+       criterion: birdclef2025_categorical_cross_entropy
+       lr_scheduler: none
+       optimizer: adam
+       model: birdclef2025_efficientnet_b0
+       test: default
+       test/dataloader: default
+       test/dataset: default
+       train: birdclef2025_noaug_efficientnet_b0
+       train/record: default
+       train/clip_gradient: default
+       train/dataloader: default
+       train/dataset: birdclef2025_primary-label
+       data: birdclef2025_15s
+       preprocess: birdclef2025
+       system: cuda
+       hydra/env: default
+       hydra/callbacks: null
+       hydra/job_logging: default
+       hydra/hydra_logging: default
+       hydra/hydra_help: default
+       hydra/help: default
+       hydra/sweeper: basic
+       hydra/launcher: basic
+       hydra/output: default
+   verbose: false
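Note on `hydra.job_logging`: the `simple` formatter above is an ordinary standard-library `logging` format string, so the shape of the lines in `train.log` can be reproduced without Hydra. The sketch below is a minimal illustration using only the standard library; the in-memory stream and the logger setup are assumptions made for the example, not how the trainer itself configures logging.

```python
import io
import logging

# Format string copied from hydra.job_logging.formatters.simple.format.
JOB_FORMAT = "[%(asctime)s][%(name)s][%(levelname)s] - %(message)s"

# Write to an in-memory buffer so the formatted line can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(JOB_FORMAT))

logger = logging.getLogger("BaseTrainer")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example's output out of the root logger

logger.info("system:")
line = stream.getvalue().strip()
print(line)  # e.g. [<date> <time>,<msec>][BaseTrainer][INFO] - system:
```

The logger name fills the `%(name)s` slot, which is why the log entries in this commit carry the `[BaseTrainer]` tag.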
recipes/BirdCLEF2025/EfficientNetB0/exp/20250505-094415/log/20250505-094417/.hydra/overrides.yaml ADDED
@@ -0,0 +1,17 @@
+ - system=cuda
+ - preprocess=birdclef2025
+ - data=birdclef2025_15s
+ - train=birdclef2025_noaug_efficientnet_b0
+ - model=birdclef2025_efficientnet_b0
+ - optimizer=adam
+ - lr_scheduler=none
+ - criterion=birdclef2025_categorical_cross_entropy
+ - +metrics=birdclef2025_categorical_cross_entropy
+ - preprocess.dump_format=birdclef2025
+ - train.dataset.train.list_path=dump/birdclef2025_15s/list/train.txt
+ - train.dataset.train.feature_dir=/kaggle/input/birdclef-2025
+ - train.dataset.validation.list_path=dump/birdclef2025_15s/list/validation.txt
+ - train.dataset.validation.feature_dir=/kaggle/input/birdclef-2025
+ - train.resume.continue_from=
+ - train.output.exp_dir=./exp/20250505-094415
+ - train.output.tensorboard_dir=./tensorboard/20250505-094415
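Note on `overrides.yaml`: the entries mix config-group choices (`system=cuda`), an appended group (`+metrics=...`), and dotted value overrides, one of which is empty (`train.resume.continue_from=`). The sketch below splits a few of them apart for inspection. It is not Hydra's own parser: `parse_override` is a hypothetical helper, and treating every dotted key as a value override is a simplifying assumption that happens to fit this list.

```python
from typing import NamedTuple


class Override(NamedTuple):
    key: str
    value: str
    append: bool  # True when the entry was written with a leading "+"


def parse_override(entry: str) -> Override:
    """Split 'key=value' on the first '=' and record a leading '+'."""
    key, _, value = entry.partition("=")
    return Override(key.lstrip("+"), value, key.startswith("+"))


# A few entries copied from overrides.yaml.
OVERRIDES = [
    "system=cuda",
    "+metrics=birdclef2025_categorical_cross_entropy",
    "train.resume.continue_from=",
    "train.output.exp_dir=./exp/20250505-094415",
]

parsed = [parse_override(entry) for entry in OVERRIDES]

# Heuristic (assumption): undotted keys select a config group option,
# dotted keys set a single value inside the composed config.
group_choices = {o.key: o.value for o in parsed if "." not in o.key}
value_overrides = {o.key: o.value for o in parsed if "." in o.key}

print(group_choices)
print(value_overrides)
```

The empty value for `train.resume.continue_from` is consistent with `continue_from: ''` in the config, meaning no checkpoint path was supplied for resuming.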
recipes/BirdCLEF2025/EfficientNetB0/exp/20250505-094415/log/20250505-094417/.hydra/resolved_config.yaml ADDED
@@ -0,0 +1,287 @@
+ system:
+   seed: 0
+   distributed:
+     enable: null
+     nodes: null
+     nproc_per_node: null
+     backend: null
+     init_method: null
+     rdzv_id: null
+     rdzv_backend: null
+     rdzv_endpoint: null
+     max_restarts: null
+   cudnn:
+     benchmark: true
+     deterministic: false
+   amp:
+     enable: false
+     dtype: null
+   accelerator: cuda
+   compile:
+     enable: false
+     kwargs: null
+ preprocess:
+   dump_format: birdclef2025
+   list_path: null
+   wav_dir: null
+   feature_dir: null
+   max_workers: 2
+   max_shard_size: 1000000000
+   vad:
+     raw_root: null
+     trimmed_root: null
+     threshold: null
+     min_duration: 15
+   csv_path: ???
+   submission_path: ???
+   audio_root: ???
+   subset: ???
+   train_ratio: 0.8
+ data:
+   audio:
+     sample_rate: 32000
+     duration: 15
+   melspectrogram:
+     _target_: birdclef2025.transforms.birdclef.BirdCLEF2025BaselineMelSpectrogram
+     sample_rate: 32000
+     hop_length: 1253
+     f_min: 20
+     f_max: 16000
+     pad: 0
+     n_mels: 128
+     window_fn:
+       _target_: torch.hann_window
+       _partial_: true
+     power: 1.0
+     normalized: false
+     wkwargs: null
+     center: true
+     pad_mode: constant
+     onesided: null
+     norm: slaney
+     mel_scale: slaney
+     take_log: true
+     freq_mask_param:
+     - 0.06
+     - 0.1
+     time_mask_param:
+     - 0.06
+     - 0.12
+     eps: null
+ train:
+   dataset:
+     train:
+       _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
+       list_path: dump/birdclef2025_15s/list/train.txt
+       feature_dir: /kaggle/input/birdclef-2025
+       audio_key: audio
+       sample_rate_key: sample_rate
+       label_name_key: primary_label
+       filename_key: filename
+     validation:
+       _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
+       list_path: dump/birdclef2025_15s/list/validation.txt
+       feature_dir: /kaggle/input/birdclef-2025
+       audio_key: audio
+       sample_rate_key: sample_rate
+       label_name_key: primary_label
+       filename_key: filename
+   dataloader:
+     train:
+       _target_: torch.utils.data.DataLoader
+       batch_size: 64
+       shuffle: true
+       collate_fn:
+         _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineValidationCollator
+         composer:
+           _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelComposer
+           melspectrogram_transform:
+             _target_: birdclef2025.transforms.birdclef.BirdCLEF2025BaselineMelSpectrogram
+             sample_rate: 32000
+             hop_length: 1253
+             f_min: 20
+             f_max: 16000
+             pad: 0
+             n_mels: 128
+             window_fn:
+               _target_: torch.hann_window
+               _partial_: true
+             power: 1.0
+             normalized: false
+             wkwargs: null
+             center: true
+             pad_mode: constant
+             onesided: null
+             norm: slaney
+             mel_scale: slaney
+             take_log: true
+             freq_mask_param:
+             - 0.06
+             - 0.1
+             time_mask_param:
+             - 0.06
+             - 0.12
+             eps: null
+           audio_key: audio
+           sample_rate_key: sample_rate
+           label_name_key: primary_label
+           filename_key: filename
+           waveform_key: waveform
+           melspectrogram_key: log_melspectrogram
+           label_index_key: label_index
+           sample_rate: 32000
+           duration: 15
+           decode_audio_as_waveform: true
+           decode_audio_as_monoral: true
+           training: false
+         melspectrogram_key: log_melspectrogram
+         label_index_key: label_index
+       num_workers: 2
+     validation:
+       _target_: torch.utils.data.DataLoader
+       batch_size: 64
+       shuffle: false
+       collate_fn:
+         _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineValidationCollator
+         composer:
+           _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelComposer
+           melspectrogram_transform:
+             _target_: birdclef2025.transforms.birdclef.BirdCLEF2025BaselineMelSpectrogram
+             sample_rate: 32000
+             hop_length: 1253
+             f_min: 20
+             f_max: 16000
+             pad: 0
+             n_mels: 128
+             window_fn:
+               _target_: torch.hann_window
+               _partial_: true
+             power: 1.0
+             normalized: false
+             wkwargs: null
+             center: true
+             pad_mode: constant
+             onesided: null
+             norm: slaney
+             mel_scale: slaney
+             take_log: true
+             freq_mask_param:
+             - 0.06
+             - 0.1
+             time_mask_param:
+             - 0.06
+             - 0.12
+             eps: null
+           audio_key: audio
+           sample_rate_key: sample_rate
+           label_name_key: primary_label
+           filename_key: filename
+           waveform_key: waveform
+           melspectrogram_key: log_melspectrogram
+           label_index_key: label_index
+           sample_rate: 32000
+           duration: 15
+           decode_audio_as_waveform: true
+           decode_audio_as_monoral: true
+           training: false
+         melspectrogram_key: log_melspectrogram
+         label_index_key: label_index
+       num_workers: 2
+   clip_gradient: {}
+   record: {}
+   trainer:
+     _target_: birdclef2025.utils.driver.BaseTrainer
+   key_mapping:
+     train:
+       input:
+         input: log_melspectrogram
+       output: logit
+     validation:
+       input:
+         input: log_melspectrogram
+       output: logit
+     inference:
+       input:
+         input: log_melspectrogram
+       output: logit
+   ddp_kwargs: null
+   resume:
+     continue_from: ''
+   output:
+     exp_dir: ./exp/20250505-094415
+     tensorboard_dir: ./tensorboard/20250505-094415
+     save_checkpoint:
+       iteration:
+         every: 10000
+         path: ./exp/20250505-094415/model/iteration{iteration}.pth
+       epoch:
+         every: 10
+         path: ./exp/20250505-094415/model/epoch{epoch}.pth
+       last:
+         path: ./exp/20250505-094415/model/last.pth
+       best_epoch:
+         path: ./exp/20250505-094415/model/best_epoch.pth
+   steps:
+     epochs: 10
+     iterations: null
+     lr_scheduler: epoch
+ test:
+   dataset:
+     test:
+       _target_: torch.utils.data.Dataset
+   dataloader:
+     test:
+       _target_: torch.utils.data.DataLoader
+       batch_size: 1
+       shuffle: false
+   key_mapping:
+     inference:
+       input: null
+       output: null
+       identifier: null
+   checkpoint: null
+   remove_weight_norm: null
+   output:
+     exp_dir: ./exp
+     inference_dir: ./exp/inference
+     audio:
+       sample_rate: 32000
+       key_mapping:
+         inference:
+           output: null
+           reference: null
+       transforms:
+         inference:
+           output: null
+           reference: null
+   ddp_kwargs: null
+ model:
+   _target_: birdclef2025.models.EfficientNetB0
+   weights: IMAGENET1K_V1
+   num_classes: 206
+ optimizer:
+   _target_: torch.optim.Adam
+ lr_scheduler: {}
+ criterion:
+   _target_: audyn.criterion.MultiCriteria
+   cross_entropy:
+     _target_: audyn.criterion.BaseCriterionWrapper
+     criterion:
+       _target_: torch.nn.CrossEntropyLoss
+       reduction: mean
+     weight: 1
+     key_mapping:
+       estimated:
+         input: logit
+       target:
+         target: label_index
+ metrics:
+   roc_auc:
+     metric:
+       _target_: birdclef2025.metrics.ROCAUC
+       take_softmax: true
+     key_mapping:
+       estimated:
+         input: logit
+       target:
+         target: label_index
recipes/BirdCLEF2025/EfficientNetB0/exp/20250505-094415/log/20250505-094417/train.log ADDED
@@ -0,0 +1,1045 @@
+ [2025-05-05 09:44:45,181][BaseTrainer][INFO] - system:
+   seed: 0
+   distributed:
+     enable: null
+     nodes: null
+     nproc_per_node: null
+     backend: null
+     init_method: null
+     rdzv_id: null
+     rdzv_backend: null
+     rdzv_endpoint: null
+     max_restarts: null
+   cudnn:
+     benchmark: true
+     deterministic: false
+   amp:
+     enable: false
+     dtype: null
+   accelerator: cuda
+   compile:
+     enable: false
+     kwargs: null
+ preprocess:
+   dump_format: birdclef2025
+   list_path: null
+   wav_dir: null
+   feature_dir: null
+   max_workers: 2
+   max_shard_size: 1000000000
+   vad:
+     raw_root: null
+     trimmed_root: null
+     threshold: null
+     min_duration: 15
+   csv_path: ???
+   submission_path: ???
+   audio_root: ???
+   subset: ???
+   train_ratio: 0.8
+ data:
+   audio:
+     sample_rate: 32000
+     duration: 15
+   melspectrogram:
+     _target_: birdclef2025.transforms.birdclef.BirdCLEF2025BaselineMelSpectrogram
+     sample_rate: 32000
+     hop_length: 1253
+     f_min: 20
+     f_max: 16000
+     pad: 0
+     n_mels: 128
+     window_fn:
+       _target_: torch.hann_window
+       _partial_: true
+     power: 1.0
+     normalized: false
+     wkwargs: null
+     center: true
+     pad_mode: constant
+     onesided: null
+     norm: slaney
+     mel_scale: slaney
+     take_log: true
+     freq_mask_param:
+     - 0.06
+     - 0.1
+     time_mask_param:
+     - 0.06
+     - 0.12
+     eps: null
+ train:
+   dataset:
+     train:
+       _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
+       list_path: dump/birdclef2025_15s/list/train.txt
+       feature_dir: /kaggle/input/birdclef-2025
+       audio_key: audio
+       sample_rate_key: sample_rate
+       label_name_key: primary_label
+       filename_key: filename
+     validation:
+       _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
+       list_path: dump/birdclef2025_15s/list/validation.txt
+       feature_dir: /kaggle/input/birdclef-2025
+       audio_key: ${..train.audio_key}
+       sample_rate_key: ${..train.sample_rate_key}
+       label_name_key: ${..train.label_name_key}
+       filename_key: ${..train.filename_key}
+   dataloader:
+     train:
+       _target_: torch.utils.data.DataLoader
+       batch_size: 64
+       shuffle: true
+       collate_fn:
+         _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineValidationCollator
+         composer:
+           _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelComposer
+           melspectrogram_transform: ${data.melspectrogram}
+           audio_key: audio
+           sample_rate_key: sample_rate
+           label_name_key: primary_label
+           filename_key: filename
+           waveform_key: waveform
+           melspectrogram_key: log_melspectrogram
+           label_index_key: label_index
+           sample_rate: ${data.audio.sample_rate}
+           duration: ${data.audio.duration}
+           decode_audio_as_waveform: true
+           decode_audio_as_monoral: true
+           training: false
+         melspectrogram_key: ${.composer.melspectrogram_key}
+         label_index_key: ${.composer.label_index_key}
+       num_workers: ${const:birdclef2025.utils.data.default_num_workers}
+     validation:
+       _target_: torch.utils.data.DataLoader
+       batch_size: 64
+       shuffle: false
+       collate_fn:
+         _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineValidationCollator
+         composer:
+           _target_: ${....train.collate_fn.composer._target_}
+           melspectrogram_transform: ${....train.collate_fn.composer.melspectrogram_transform}
+           audio_key: ${....train.collate_fn.composer.audio_key}
+           sample_rate_key: ${....train.collate_fn.composer.sample_rate_key}
+           label_name_key: ${....train.collate_fn.composer.label_name_key}
+           filename_key: ${....train.collate_fn.composer.filename_key}
+           waveform_key: ${....train.collate_fn.composer.waveform_key}
+           melspectrogram_key: ${....train.collate_fn.composer.melspectrogram_key}
+           label_index_key: ${....train.collate_fn.composer.label_index_key}
+           sample_rate: ${....train.collate_fn.composer.sample_rate}
+           duration: ${....train.collate_fn.composer.duration}
+           decode_audio_as_waveform: ${....train.collate_fn.composer.decode_audio_as_waveform}
+           decode_audio_as_monoral: ${....train.collate_fn.composer.decode_audio_as_monoral}
+           training: false
+         melspectrogram_key: ${...train.collate_fn.composer.melspectrogram_key}
+         label_index_key: ${...train.collate_fn.composer.label_index_key}
+       num_workers: ${const:birdclef2025.utils.data.default_num_workers}
+   clip_gradient: {}
+   record: {}
+   trainer:
+     _target_: birdclef2025.utils.driver.BaseTrainer
+     _partial_: true
+   key_mapping:
+     train:
+       input:
+         input: ${....dataloader.train.collate_fn.composer.melspectrogram_key}
+       output: logit
+     validation: ${.train}
+     inference: ${.validation}
+   ddp_kwargs: null
+   resume:
+     continue_from: ''
+   output:
+     exp_dir: ./exp/20250505-094415
+     tensorboard_dir: ./tensorboard/20250505-094415
+     save_checkpoint:
+       iteration:
+         every: 10000
+         path: ${...exp_dir}/model/iteration{iteration}.pth
+       epoch:
+         every: 10
+         path: ${...exp_dir}/model/epoch{epoch}.pth
+       last:
+         path: ${...exp_dir}/model/last.pth
+       best_epoch:
+         path: ${...exp_dir}/model/best_epoch.pth
+   steps:
+     epochs: 10
+     iterations: null
+     lr_scheduler: epoch
+ test:
+   dataset:
+     test:
+       _target_: torch.utils.data.Dataset
+   dataloader:
+     test:
+       _target_: torch.utils.data.DataLoader
+       batch_size: 1
+       shuffle: false
+   key_mapping:
+     inference:
+       input: null
+       output: null
+       identifier: null
+   checkpoint: null
+   remove_weight_norm: null
+   output:
+     exp_dir: ./exp
+     inference_dir: ${.exp_dir}/inference
+     audio:
+       sample_rate: ${data.audio.sample_rate}
+       key_mapping:
+         inference:
+           output: null
+           reference: null
+       transforms:
+         inference:
+           output: null
+           reference: null
+   ddp_kwargs: null
+ model:
+   _target_: birdclef2025.models.EfficientNetB0
+   weights: ${const:torchvision.models.EfficientNet_B0_Weights.IMAGENET1K_V1}
+   num_classes: ${const:birdclef2025.utils.data.birdclef.num_birdclef2025_primary_labels}
+ optimizer:
+   _target_: torch.optim.Adam
+ lr_scheduler: {}
+ criterion:
+   _target_: audyn.criterion.MultiCriteria
+   cross_entropy:
+     _target_: audyn.criterion.BaseCriterionWrapper
+     criterion:
+       _target_: torch.nn.CrossEntropyLoss
+       reduction: mean
+     weight: 1
+     key_mapping:
+       estimated:
+         input: logit
+       target:
+         target: ${train.dataloader.train.collate_fn.composer.label_index_key}
+ metrics:
+   roc_auc:
+     metric:
+       _target_: birdclef2025.metrics.ROCAUC
+       take_softmax: true
+     key_mapping:
+       estimated:
+         input: logit
+       target:
+         target: ${train.dataloader.train.collate_fn.composer.label_index_key}
+
+ [2025-05-05 09:44:45,181][BaseTrainer][INFO] - EfficientNetB0(
+   (backbone): Sequential(
+     (0): Conv2dNormActivation(
+       (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
+       (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+       (2): SiLU(inplace=True)
+     )
+     (1): Sequential(
+       (0): MBConv(
+         (block): Sequential(
+           (0): Conv2dNormActivation(
+             (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
+             (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+             (2): SiLU(inplace=True)
+           )
+           (1): SqueezeExcitation(
+             (avgpool): AdaptiveAvgPool2d(output_size=1)
+             (fc1): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1))
+             (fc2): Conv2d(8, 32, kernel_size=(1, 1), stride=(1, 1))
+             (activation): SiLU(inplace=True)
+             (scale_activation): Sigmoid()
+           )
+           (2): Conv2dNormActivation(
+             (0): Conv2d(32, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
+             (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+           )
+         )
+         (stochastic_depth): StochasticDepth(p=0.0, mode=row)
+       )
+     )
+     (2): Sequential(
+       (0): MBConv(
+         (block): Sequential(
+           (0): Conv2dNormActivation(
+             (0): Conv2d(16, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
+             (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+             (2): SiLU(inplace=True)
+           )
+           (1): Conv2dNormActivation(
+             (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=96, bias=False)
+             (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+             (2): SiLU(inplace=True)
+           )
+           (2): SqueezeExcitation(
+             (avgpool): AdaptiveAvgPool2d(output_size=1)
+             (fc1): Conv2d(96, 4, kernel_size=(1, 1), stride=(1, 1))
+             (fc2): Conv2d(4, 96, kernel_size=(1, 1), stride=(1, 1))
+             (activation): SiLU(inplace=True)
+             (scale_activation): Sigmoid()
281
+ )
282
+ (3): Conv2dNormActivation(
283
+ (0): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
284
+ (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
285
+ )
286
+ )
287
+ (stochastic_depth): StochasticDepth(p=0.0125, mode=row)
288
+ )
289
+ (1): MBConv(
290
+ (block): Sequential(
291
+ (0): Conv2dNormActivation(
292
+ (0): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)
293
+ (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
294
+ (2): SiLU(inplace=True)
295
+ )
296
+ (1): Conv2dNormActivation(
297
+ (0): Conv2d(144, 144, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=144, bias=False)
298
+ (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
299
+ (2): SiLU(inplace=True)
300
+ )
301
+ (2): SqueezeExcitation(
302
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
303
+ (fc1): Conv2d(144, 6, kernel_size=(1, 1), stride=(1, 1))
304
+ (fc2): Conv2d(6, 144, kernel_size=(1, 1), stride=(1, 1))
305
+ (activation): SiLU(inplace=True)
306
+ (scale_activation): Sigmoid()
307
+ )
308
+ (3): Conv2dNormActivation(
309
+ (0): Conv2d(144, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
310
+ (1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
311
+ )
312
+ )
313
+ (stochastic_depth): StochasticDepth(p=0.025, mode=row)
314
+ )
315
+ )
316
+ (3): Sequential(
317
+ (0): MBConv(
318
+ (block): Sequential(
319
+ (0): Conv2dNormActivation(
320
+ (0): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)
321
+ (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
322
+ (2): SiLU(inplace=True)
323
+ )
324
+ (1): Conv2dNormActivation(
325
+ (0): Conv2d(144, 144, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=144, bias=False)
326
+ (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
327
+ (2): SiLU(inplace=True)
328
+ )
329
+ (2): SqueezeExcitation(
330
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
331
+ (fc1): Conv2d(144, 6, kernel_size=(1, 1), stride=(1, 1))
332
+ (fc2): Conv2d(6, 144, kernel_size=(1, 1), stride=(1, 1))
333
+ (activation): SiLU(inplace=True)
334
+ (scale_activation): Sigmoid()
335
+ )
336
+ (3): Conv2dNormActivation(
337
+ (0): Conv2d(144, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
338
+ (1): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
339
+ )
340
+ )
341
+ (stochastic_depth): StochasticDepth(p=0.037500000000000006, mode=row)
342
+ )
343
+ (1): MBConv(
344
+ (block): Sequential(
345
+ (0): Conv2dNormActivation(
346
+ (0): Conv2d(40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
347
+ (1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
348
+ (2): SiLU(inplace=True)
349
+ )
350
+ (1): Conv2dNormActivation(
351
+ (0): Conv2d(240, 240, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=240, bias=False)
352
+ (1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
353
+ (2): SiLU(inplace=True)
354
+ )
355
+ (2): SqueezeExcitation(
356
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
357
+ (fc1): Conv2d(240, 10, kernel_size=(1, 1), stride=(1, 1))
358
+ (fc2): Conv2d(10, 240, kernel_size=(1, 1), stride=(1, 1))
359
+ (activation): SiLU(inplace=True)
360
+ (scale_activation): Sigmoid()
361
+ )
362
+ (3): Conv2dNormActivation(
363
+ (0): Conv2d(240, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
364
+ (1): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
365
+ )
366
+ )
367
+ (stochastic_depth): StochasticDepth(p=0.05, mode=row)
368
+ )
369
+ )
370
+ (4): Sequential(
371
+ (0): MBConv(
372
+ (block): Sequential(
373
+ (0): Conv2dNormActivation(
374
+ (0): Conv2d(40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
375
+ (1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
376
+ (2): SiLU(inplace=True)
377
+ )
378
+ (1): Conv2dNormActivation(
379
+ (0): Conv2d(240, 240, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=240, bias=False)
380
+ (1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
381
+ (2): SiLU(inplace=True)
382
+ )
383
+ (2): SqueezeExcitation(
384
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
385
+ (fc1): Conv2d(240, 10, kernel_size=(1, 1), stride=(1, 1))
386
+ (fc2): Conv2d(10, 240, kernel_size=(1, 1), stride=(1, 1))
387
+ (activation): SiLU(inplace=True)
388
+ (scale_activation): Sigmoid()
389
+ )
390
+ (3): Conv2dNormActivation(
391
+ (0): Conv2d(240, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
392
+ (1): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
393
+ )
394
+ )
395
+ (stochastic_depth): StochasticDepth(p=0.0625, mode=row)
396
+ )
397
+ (1): MBConv(
398
+ (block): Sequential(
399
+ (0): Conv2dNormActivation(
400
+ (0): Conv2d(80, 480, kernel_size=(1, 1), stride=(1, 1), bias=False)
401
+ (1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
402
+ (2): SiLU(inplace=True)
403
+ )
404
+ (1): Conv2dNormActivation(
405
+ (0): Conv2d(480, 480, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=480, bias=False)
406
+ (1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
407
+ (2): SiLU(inplace=True)
408
+ )
409
+ (2): SqueezeExcitation(
410
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
411
+ (fc1): Conv2d(480, 20, kernel_size=(1, 1), stride=(1, 1))
412
+ (fc2): Conv2d(20, 480, kernel_size=(1, 1), stride=(1, 1))
413
+ (activation): SiLU(inplace=True)
414
+ (scale_activation): Sigmoid()
415
+ )
416
+ (3): Conv2dNormActivation(
417
+ (0): Conv2d(480, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
418
+ (1): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
419
+ )
420
+ )
421
+ (stochastic_depth): StochasticDepth(p=0.07500000000000001, mode=row)
422
+ )
423
+ (2): MBConv(
424
+ (block): Sequential(
425
+ (0): Conv2dNormActivation(
426
+ (0): Conv2d(80, 480, kernel_size=(1, 1), stride=(1, 1), bias=False)
427
+ (1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
428
+ (2): SiLU(inplace=True)
429
+ )
430
+ (1): Conv2dNormActivation(
431
+ (0): Conv2d(480, 480, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=480, bias=False)
432
+ (1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
433
+ (2): SiLU(inplace=True)
434
+ )
435
+ (2): SqueezeExcitation(
436
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
437
+ (fc1): Conv2d(480, 20, kernel_size=(1, 1), stride=(1, 1))
438
+ (fc2): Conv2d(20, 480, kernel_size=(1, 1), stride=(1, 1))
439
+ (activation): SiLU(inplace=True)
440
+ (scale_activation): Sigmoid()
441
+ )
442
+ (3): Conv2dNormActivation(
443
+ (0): Conv2d(480, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
444
+ (1): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
445
+ )
446
+ )
447
+ (stochastic_depth): StochasticDepth(p=0.08750000000000001, mode=row)
448
+ )
449
+ )
450
+ (5): Sequential(
451
+ (0): MBConv(
452
+ (block): Sequential(
453
+ (0): Conv2dNormActivation(
454
+ (0): Conv2d(80, 480, kernel_size=(1, 1), stride=(1, 1), bias=False)
455
+ (1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
456
+ (2): SiLU(inplace=True)
457
+ )
458
+ (1): Conv2dNormActivation(
459
+ (0): Conv2d(480, 480, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=480, bias=False)
460
+ (1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
461
+ (2): SiLU(inplace=True)
462
+ )
463
+ (2): SqueezeExcitation(
464
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
465
+ (fc1): Conv2d(480, 20, kernel_size=(1, 1), stride=(1, 1))
466
+ (fc2): Conv2d(20, 480, kernel_size=(1, 1), stride=(1, 1))
467
+ (activation): SiLU(inplace=True)
468
+ (scale_activation): Sigmoid()
469
+ )
470
+ (3): Conv2dNormActivation(
471
+ (0): Conv2d(480, 112, kernel_size=(1, 1), stride=(1, 1), bias=False)
472
+ (1): BatchNorm2d(112, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
473
+ )
474
+ )
475
+ (stochastic_depth): StochasticDepth(p=0.1, mode=row)
476
+ )
477
+ (1): MBConv(
478
+ (block): Sequential(
479
+ (0): Conv2dNormActivation(
480
+ (0): Conv2d(112, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
481
+ (1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
482
+ (2): SiLU(inplace=True)
483
+ )
484
+ (1): Conv2dNormActivation(
485
+ (0): Conv2d(672, 672, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=672, bias=False)
486
+ (1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
487
+ (2): SiLU(inplace=True)
488
+ )
489
+ (2): SqueezeExcitation(
490
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
491
+ (fc1): Conv2d(672, 28, kernel_size=(1, 1), stride=(1, 1))
492
+ (fc2): Conv2d(28, 672, kernel_size=(1, 1), stride=(1, 1))
493
+ (activation): SiLU(inplace=True)
494
+ (scale_activation): Sigmoid()
495
+ )
496
+ (3): Conv2dNormActivation(
497
+ (0): Conv2d(672, 112, kernel_size=(1, 1), stride=(1, 1), bias=False)
498
+ (1): BatchNorm2d(112, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
499
+ )
500
+ )
501
+ (stochastic_depth): StochasticDepth(p=0.1125, mode=row)
502
+ )
503
+ (2): MBConv(
504
+ (block): Sequential(
505
+ (0): Conv2dNormActivation(
506
+ (0): Conv2d(112, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
507
+ (1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
508
+ (2): SiLU(inplace=True)
509
+ )
510
+ (1): Conv2dNormActivation(
511
+ (0): Conv2d(672, 672, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=672, bias=False)
512
+ (1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
513
+ (2): SiLU(inplace=True)
514
+ )
515
+ (2): SqueezeExcitation(
516
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
517
+ (fc1): Conv2d(672, 28, kernel_size=(1, 1), stride=(1, 1))
518
+ (fc2): Conv2d(28, 672, kernel_size=(1, 1), stride=(1, 1))
519
+ (activation): SiLU(inplace=True)
520
+ (scale_activation): Sigmoid()
521
+ )
522
+ (3): Conv2dNormActivation(
523
+ (0): Conv2d(672, 112, kernel_size=(1, 1), stride=(1, 1), bias=False)
524
+ (1): BatchNorm2d(112, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
525
+ )
526
+ )
527
+ (stochastic_depth): StochasticDepth(p=0.125, mode=row)
528
+ )
529
+ )
530
+ (6): Sequential(
531
+ (0): MBConv(
532
+ (block): Sequential(
533
+ (0): Conv2dNormActivation(
534
+ (0): Conv2d(112, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
535
+ (1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
536
+ (2): SiLU(inplace=True)
537
+ )
538
+ (1): Conv2dNormActivation(
539
+ (0): Conv2d(672, 672, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=672, bias=False)
540
+ (1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
541
+ (2): SiLU(inplace=True)
542
+ )
543
+ (2): SqueezeExcitation(
544
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
545
+ (fc1): Conv2d(672, 28, kernel_size=(1, 1), stride=(1, 1))
546
+ (fc2): Conv2d(28, 672, kernel_size=(1, 1), stride=(1, 1))
547
+ (activation): SiLU(inplace=True)
548
+ (scale_activation): Sigmoid()
549
+ )
550
+ (3): Conv2dNormActivation(
551
+ (0): Conv2d(672, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
552
+ (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
553
+ )
554
+ )
555
+ (stochastic_depth): StochasticDepth(p=0.1375, mode=row)
556
+ )
557
+ (1): MBConv(
558
+ (block): Sequential(
559
+ (0): Conv2dNormActivation(
560
+ (0): Conv2d(192, 1152, kernel_size=(1, 1), stride=(1, 1), bias=False)
561
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
562
+ (2): SiLU(inplace=True)
563
+ )
564
+ (1): Conv2dNormActivation(
565
+ (0): Conv2d(1152, 1152, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=1152, bias=False)
566
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
567
+ (2): SiLU(inplace=True)
568
+ )
569
+ (2): SqueezeExcitation(
570
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
571
+ (fc1): Conv2d(1152, 48, kernel_size=(1, 1), stride=(1, 1))
572
+ (fc2): Conv2d(48, 1152, kernel_size=(1, 1), stride=(1, 1))
573
+ (activation): SiLU(inplace=True)
574
+ (scale_activation): Sigmoid()
575
+ )
576
+ (3): Conv2dNormActivation(
577
+ (0): Conv2d(1152, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
578
+ (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
579
+ )
580
+ )
581
+ (stochastic_depth): StochasticDepth(p=0.15000000000000002, mode=row)
582
+ )
583
+ (2): MBConv(
584
+ (block): Sequential(
585
+ (0): Conv2dNormActivation(
586
+ (0): Conv2d(192, 1152, kernel_size=(1, 1), stride=(1, 1), bias=False)
587
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
588
+ (2): SiLU(inplace=True)
589
+ )
590
+ (1): Conv2dNormActivation(
591
+ (0): Conv2d(1152, 1152, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=1152, bias=False)
592
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
593
+ (2): SiLU(inplace=True)
594
+ )
595
+ (2): SqueezeExcitation(
596
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
597
+ (fc1): Conv2d(1152, 48, kernel_size=(1, 1), stride=(1, 1))
598
+ (fc2): Conv2d(48, 1152, kernel_size=(1, 1), stride=(1, 1))
599
+ (activation): SiLU(inplace=True)
600
+ (scale_activation): Sigmoid()
601
+ )
602
+ (3): Conv2dNormActivation(
603
+ (0): Conv2d(1152, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
604
+ (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
605
+ )
606
+ )
607
+ (stochastic_depth): StochasticDepth(p=0.1625, mode=row)
608
+ )
609
+ (3): MBConv(
610
+ (block): Sequential(
611
+ (0): Conv2dNormActivation(
612
+ (0): Conv2d(192, 1152, kernel_size=(1, 1), stride=(1, 1), bias=False)
613
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
614
+ (2): SiLU(inplace=True)
615
+ )
616
+ (1): Conv2dNormActivation(
617
+ (0): Conv2d(1152, 1152, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=1152, bias=False)
618
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
619
+ (2): SiLU(inplace=True)
620
+ )
621
+ (2): SqueezeExcitation(
622
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
623
+ (fc1): Conv2d(1152, 48, kernel_size=(1, 1), stride=(1, 1))
624
+ (fc2): Conv2d(48, 1152, kernel_size=(1, 1), stride=(1, 1))
625
+ (activation): SiLU(inplace=True)
626
+ (scale_activation): Sigmoid()
627
+ )
628
+ (3): Conv2dNormActivation(
629
+ (0): Conv2d(1152, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
630
+ (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
631
+ )
632
+ )
633
+ (stochastic_depth): StochasticDepth(p=0.17500000000000002, mode=row)
634
+ )
635
+ )
636
+ (7): Sequential(
637
+ (0): MBConv(
638
+ (block): Sequential(
639
+ (0): Conv2dNormActivation(
640
+ (0): Conv2d(192, 1152, kernel_size=(1, 1), stride=(1, 1), bias=False)
641
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
642
+ (2): SiLU(inplace=True)
643
+ )
644
+ (1): Conv2dNormActivation(
645
+ (0): Conv2d(1152, 1152, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1152, bias=False)
646
+ (1): BatchNorm2d(1152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
647
+ (2): SiLU(inplace=True)
648
+ )
649
+ (2): SqueezeExcitation(
650
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
651
+ (fc1): Conv2d(1152, 48, kernel_size=(1, 1), stride=(1, 1))
652
+ (fc2): Conv2d(48, 1152, kernel_size=(1, 1), stride=(1, 1))
653
+ (activation): SiLU(inplace=True)
654
+ (scale_activation): Sigmoid()
655
+ )
656
+ (3): Conv2dNormActivation(
657
+ (0): Conv2d(1152, 320, kernel_size=(1, 1), stride=(1, 1), bias=False)
658
+ (1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
659
+ )
660
+ )
661
+ (stochastic_depth): StochasticDepth(p=0.1875, mode=row)
662
+ )
663
+ )
664
+ (8): Conv2dNormActivation(
665
+ (0): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1), bias=False)
666
+ (1): BatchNorm2d(1280, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
667
+ (2): SiLU(inplace=True)
668
+ )
669
+ )
670
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
671
+ (classifier): Sequential(
672
+ (0): Dropout(p=0.2, inplace=False)
673
+ (1): Linear(in_features=1280, out_features=206, bias=True)
674
+ )
675
+ )
676
+ [2025-05-05 09:44:45,185][BaseTrainer][INFO] - # of parameters: 4271434.
677
+ [2025-05-05 09:44:53,812][BaseTrainer][INFO] - [Epoch 1/10, Iter 1/3560] 5.3044819831848145, cross_entropy: 5.3044819831848145
678
+ [2025-05-05 09:44:54,104][BaseTrainer][INFO] - [Epoch 1/10, Iter 2/3560] 5.359292030334473, cross_entropy: 5.359292030334473
679
+ [2025-05-05 09:44:54,391][BaseTrainer][INFO] - [Epoch 1/10, Iter 3/3560] 5.181077480316162, cross_entropy: 5.181077480316162
680
+ [2025-05-05 09:44:54,709][BaseTrainer][INFO] - [Epoch 1/10, Iter 4/3560] 5.193525314331055, cross_entropy: 5.193525314331055
681
+ [2025-05-05 09:44:55,268][BaseTrainer][INFO] - [Epoch 1/10, Iter 5/3560] 5.035099983215332, cross_entropy: 5.035099983215332
682
+ [2025-05-05 09:44:57,704][BaseTrainer][INFO] - [Epoch 1/10, Iter 6/3560] 5.028683662414551, cross_entropy: 5.028683662414551
683
+ [2025-05-05 09:44:58,286][BaseTrainer][INFO] - [Epoch 1/10, Iter 7/3560] 4.94443416595459, cross_entropy: 4.94443416595459
684
+ [2025-05-05 09:45:00,754][BaseTrainer][INFO] - [Epoch 1/10, Iter 8/3560] 5.00942850112915, cross_entropy: 5.00942850112915
685
+ [2025-05-05 09:45:01,075][BaseTrainer][INFO] - [Epoch 1/10, Iter 9/3560] 4.736028671264648, cross_entropy: 4.736028671264648
686
+ [2025-05-05 09:45:03,926][BaseTrainer][INFO] - [Epoch 1/10, Iter 10/3560] 4.761155128479004, cross_entropy: 4.761155128479004
687
+ [2025-05-05 09:45:04,257][BaseTrainer][INFO] - [Epoch 1/10, Iter 11/3560] 4.883791923522949, cross_entropy: 4.883791923522949
688
+ [2025-05-05 09:45:06,691][BaseTrainer][INFO] - [Epoch 1/10, Iter 12/3560] 4.7488603591918945, cross_entropy: 4.7488603591918945
689
+ [2025-05-05 09:45:06,999][BaseTrainer][INFO] - [Epoch 1/10, Iter 13/3560] 4.90508508682251, cross_entropy: 4.90508508682251
690
+ [2025-05-05 09:45:10,896][BaseTrainer][INFO] - [Epoch 1/10, Iter 14/3560] 4.743208885192871, cross_entropy: 4.743208885192871
691
+ [2025-05-05 09:45:11,165][BaseTrainer][INFO] - [Epoch 1/10, Iter 15/3560] 4.541562080383301, cross_entropy: 4.541562080383301
692
+ [2025-05-05 09:45:13,644][BaseTrainer][INFO] - [Epoch 1/10, Iter 16/3560] 4.653104305267334, cross_entropy: 4.653104305267334
693
+ [2025-05-05 09:45:13,947][BaseTrainer][INFO] - [Epoch 1/10, Iter 17/3560] 4.9948906898498535, cross_entropy: 4.9948906898498535
694
+ [2025-05-05 09:45:16,796][BaseTrainer][INFO] - [Epoch 1/10, Iter 18/3560] 4.959690093994141, cross_entropy: 4.959690093994141
695
+ [2025-05-05 09:45:17,417][BaseTrainer][INFO] - [Epoch 1/10, Iter 19/3560] 4.700274467468262, cross_entropy: 4.700274467468262
696
+ [2025-05-05 09:45:19,582][BaseTrainer][INFO] - [Epoch 1/10, Iter 20/3560] 4.301037788391113, cross_entropy: 4.301037788391113
697
+ [2025-05-05 09:45:20,887][BaseTrainer][INFO] - [Epoch 1/10, Iter 21/3560] 4.551095008850098, cross_entropy: 4.551095008850098
698
+ [2025-05-05 09:45:22,143][BaseTrainer][INFO] - [Epoch 1/10, Iter 22/3560] 4.5659990310668945, cross_entropy: 4.5659990310668945
699
+ [2025-05-05 09:45:23,886][BaseTrainer][INFO] - [Epoch 1/10, Iter 23/3560] 4.437910079956055, cross_entropy: 4.437910079956055
700
+ [2025-05-05 09:45:25,221][BaseTrainer][INFO] - [Epoch 1/10, Iter 24/3560] 4.478883743286133, cross_entropy: 4.478883743286133
701
+ [2025-05-05 09:45:26,893][BaseTrainer][INFO] - [Epoch 1/10, Iter 25/3560] 4.3158464431762695, cross_entropy: 4.3158464431762695
702
+ [2025-05-05 09:45:29,233][BaseTrainer][INFO] - [Epoch 1/10, Iter 26/3560] 4.4522504806518555, cross_entropy: 4.4522504806518555
703
+ [2025-05-05 09:45:29,552][BaseTrainer][INFO] - [Epoch 1/10, Iter 27/3560] 4.317988395690918, cross_entropy: 4.317988395690918
704
+ [2025-05-05 09:45:32,527][BaseTrainer][INFO] - [Epoch 1/10, Iter 28/3560] 4.0444722175598145, cross_entropy: 4.0444722175598145
705
+ [2025-05-05 09:45:32,815][BaseTrainer][INFO] - [Epoch 1/10, Iter 29/3560] 4.207454204559326, cross_entropy: 4.207454204559326
706
+ [2025-05-05 09:45:35,800][BaseTrainer][INFO] - [Epoch 1/10, Iter 30/3560] 4.256237983703613, cross_entropy: 4.256237983703613
707
+ [2025-05-05 09:45:36,087][BaseTrainer][INFO] - [Epoch 1/10, Iter 31/3560] 4.132406234741211, cross_entropy: 4.132406234741211
708
+ [2025-05-05 09:45:38,886][BaseTrainer][INFO] - [Epoch 1/10, Iter 32/3560] 4.214587211608887, cross_entropy: 4.214587211608887
709
+ [2025-05-05 09:45:39,152][BaseTrainer][INFO] - [Epoch 1/10, Iter 33/3560] 4.3889899253845215, cross_entropy: 4.3889899253845215
710
+ [2025-05-05 09:45:42,643][BaseTrainer][INFO] - [Epoch 1/10, Iter 34/3560] 4.112233638763428, cross_entropy: 4.112233638763428
711
+ [2025-05-05 09:45:42,927][BaseTrainer][INFO] - [Epoch 1/10, Iter 35/3560] 4.208951473236084, cross_entropy: 4.208951473236084
712
+ [2025-05-05 09:45:45,674][BaseTrainer][INFO] - [Epoch 1/10, Iter 36/3560] 3.9804739952087402, cross_entropy: 3.9804739952087402
713
+ [2025-05-05 09:45:45,952][BaseTrainer][INFO] - [Epoch 1/10, Iter 37/3560] 4.045773029327393, cross_entropy: 4.045773029327393
714
+ [2025-05-05 09:45:48,380][BaseTrainer][INFO] - [Epoch 1/10, Iter 38/3560] 3.831357479095459, cross_entropy: 3.831357479095459
715
+ [2025-05-05 09:45:48,646][BaseTrainer][INFO] - [Epoch 1/10, Iter 39/3560] 4.062015056610107, cross_entropy: 4.062015056610107
716
+ [2025-05-05 09:45:52,395][BaseTrainer][INFO] - [Epoch 1/10, Iter 40/3560] 4.1217498779296875, cross_entropy: 4.1217498779296875
717
+ [2025-05-05 09:45:52,663][BaseTrainer][INFO] - [Epoch 1/10, Iter 41/3560] 3.667060613632202, cross_entropy: 3.667060613632202
718
+ [2025-05-05 09:45:55,796][BaseTrainer][INFO] - [Epoch 1/10, Iter 42/3560] 3.9150490760803223, cross_entropy: 3.9150490760803223
719
+ [2025-05-05 09:45:56,083][BaseTrainer][INFO] - [Epoch 1/10, Iter 43/3560] 3.4254770278930664, cross_entropy: 3.4254770278930664
720
+ [2025-05-05 09:45:58,235][BaseTrainer][INFO] - [Epoch 1/10, Iter 44/3560] 4.047301769256592, cross_entropy: 4.047301769256592
721
+ [2025-05-05 09:45:58,513][BaseTrainer][INFO] - [Epoch 1/10, Iter 45/3560] 3.868840217590332, cross_entropy: 3.868840217590332
722
+ [2025-05-05 09:46:01,552][BaseTrainer][INFO] - [Epoch 1/10, Iter 46/3560] 3.602304458618164, cross_entropy: 3.602304458618164
723
+ [2025-05-05 09:46:01,830][BaseTrainer][INFO] - [Epoch 1/10, Iter 47/3560] 4.087170600891113, cross_entropy: 4.087170600891113
724
+ [2025-05-05 09:46:04,620][BaseTrainer][INFO] - [Epoch 1/10, Iter 48/3560] 3.497514247894287, cross_entropy: 3.497514247894287
725
+ [2025-05-05 09:46:04,953][BaseTrainer][INFO] - [Epoch 1/10, Iter 49/3560] 3.723015069961548, cross_entropy: 3.723015069961548
726
+ [2025-05-05 09:46:07,411][BaseTrainer][INFO] - [Epoch 1/10, Iter 50/3560] 3.4713640213012695, cross_entropy: 3.4713640213012695
727
+ [2025-05-05 09:46:07,951][BaseTrainer][INFO] - [Epoch 1/10, Iter 51/3560] 3.3803837299346924, cross_entropy: 3.3803837299346924
728
+ [2025-05-05 09:46:10,538][BaseTrainer][INFO] - [Epoch 1/10, Iter 52/3560] 3.4266440868377686, cross_entropy: 3.4266440868377686
729
+ [2025-05-05 09:46:10,901][BaseTrainer][INFO] - [Epoch 1/10, Iter 53/3560] 3.4024219512939453, cross_entropy: 3.4024219512939453
730
+ [2025-05-05 09:46:14,562][BaseTrainer][INFO] - [Epoch 1/10, Iter 54/3560] 3.674903392791748, cross_entropy: 3.674903392791748
731
+ [2025-05-05 09:46:14,829][BaseTrainer][INFO] - [Epoch 1/10, Iter 55/3560] 3.7284488677978516, cross_entropy: 3.7284488677978516
732
+ [2025-05-05 09:46:17,464][BaseTrainer][INFO] - [Epoch 1/10, Iter 56/3560] 3.896651268005371, cross_entropy: 3.896651268005371
733
+ [2025-05-05 09:46:17,730][BaseTrainer][INFO] - [Epoch 1/10, Iter 57/3560] 3.543196678161621, cross_entropy: 3.543196678161621
734
+ [2025-05-05 09:46:20,605][BaseTrainer][INFO] - [Epoch 1/10, Iter 58/3560] 3.4224328994750977, cross_entropy: 3.4224328994750977
735
+ [2025-05-05 09:46:20,885][BaseTrainer][INFO] - [Epoch 1/10, Iter 59/3560] 3.528635025024414, cross_entropy: 3.528635025024414
736
+ [2025-05-05 09:46:24,066][BaseTrainer][INFO] - [Epoch 1/10, Iter 60/3560] 3.2219009399414062, cross_entropy: 3.2219009399414062
737
+ [2025-05-05 09:46:24,355][BaseTrainer][INFO] - [Epoch 1/10, Iter 61/3560] 3.474341869354248, cross_entropy: 3.474341869354248
738
+ [2025-05-05 09:46:27,570][BaseTrainer][INFO] - [Epoch 1/10, Iter 62/3560] 3.117574453353882, cross_entropy: 3.117574453353882
739
+ [2025-05-05 09:46:27,837][BaseTrainer][INFO] - [Epoch 1/10, Iter 63/3560] 3.488274097442627, cross_entropy: 3.488274097442627
740
+ [2025-05-05 09:46:30,446][BaseTrainer][INFO] - [Epoch 1/10, Iter 64/3560] 2.8341617584228516, cross_entropy: 2.8341617584228516
741
+ [2025-05-05 09:46:30,727][BaseTrainer][INFO] - [Epoch 1/10, Iter 65/3560] 3.3542423248291016, cross_entropy: 3.3542423248291016
742
+ [2025-05-05 09:46:33,560][BaseTrainer][INFO] - [Epoch 1/10, Iter 66/3560] 3.385037899017334, cross_entropy: 3.385037899017334
743
+ [2025-05-05 09:46:33,845][BaseTrainer][INFO] - [Epoch 1/10, Iter 67/3560] 3.3171205520629883, cross_entropy: 3.3171205520629883
744
+ [2025-05-05 09:46:36,949][BaseTrainer][INFO] - [Epoch 1/10, Iter 68/3560] 3.467435121536255, cross_entropy: 3.467435121536255
745
+ [2025-05-05 09:46:37,226][BaseTrainer][INFO] - [Epoch 1/10, Iter 69/3560] 3.5573220252990723, cross_entropy: 3.5573220252990723
746
+ [2025-05-05 09:46:39,503][BaseTrainer][INFO] - [Epoch 1/10, Iter 70/3560] 3.022134304046631, cross_entropy: 3.022134304046631
747
+ [2025-05-05 09:46:39,782][BaseTrainer][INFO] - [Epoch 1/10, Iter 71/3560] 3.5358519554138184, cross_entropy: 3.5358519554138184
748
+ [2025-05-05 09:46:42,825][BaseTrainer][INFO] - [Epoch 1/10, Iter 72/3560] 3.5402398109436035, cross_entropy: 3.5402398109436035
749
+ [2025-05-05 09:46:43,117][BaseTrainer][INFO] - [Epoch 1/10, Iter 73/3560] 2.891481399536133, cross_entropy: 2.891481399536133
750
+ [2025-05-05 09:46:46,312][BaseTrainer][INFO] - [Epoch 1/10, Iter 74/3560] 2.97945499420166, cross_entropy: 2.97945499420166
751
+ [2025-05-05 09:46:46,578][BaseTrainer][INFO] - [Epoch 1/10, Iter 75/3560] 3.1224918365478516, cross_entropy: 3.1224918365478516
752
+ [2025-05-05 09:46:49,781][BaseTrainer][INFO] - [Epoch 1/10, Iter 76/3560] 2.911125898361206, cross_entropy: 2.911125898361206
753
+ [2025-05-05 09:46:50,066][BaseTrainer][INFO] - [Epoch 1/10, Iter 77/3560] 2.999051094055176, cross_entropy: 2.999051094055176
754
+ [2025-05-05 09:46:52,794][BaseTrainer][INFO] - [Epoch 1/10, Iter 78/3560] 3.449187755584717, cross_entropy: 3.449187755584717
755
+ [2025-05-05 09:46:53,067][BaseTrainer][INFO] - [Epoch 1/10, Iter 79/3560] 2.743861675262451, cross_entropy: 2.743861675262451
756
+ [2025-05-05 09:46:55,680][BaseTrainer][INFO] - [Epoch 1/10, Iter 80/3560] 2.905088424682617, cross_entropy: 2.905088424682617
757
+ [2025-05-05 09:46:55,953][BaseTrainer][INFO] - [Epoch 1/10, Iter 81/3560] 3.1474924087524414, cross_entropy: 3.1474924087524414
758
+ [2025-05-05 09:46:58,936][BaseTrainer][INFO] - [Epoch 1/10, Iter 82/3560] 3.0723185539245605, cross_entropy: 3.0723185539245605
759
+ [2025-05-05 09:46:59,290][BaseTrainer][INFO] - [Epoch 1/10, Iter 83/3560] 3.306333541870117, cross_entropy: 3.306333541870117
760
+ [2025-05-05 09:47:01,869][BaseTrainer][INFO] - [Epoch 1/10, Iter 84/3560] 3.0679280757904053, cross_entropy: 3.0679280757904053
761
+ [2025-05-05 09:47:02,577][BaseTrainer][INFO] - [Epoch 1/10, Iter 85/3560] 3.4102044105529785, cross_entropy: 3.4102044105529785
762
+ [2025-05-05 09:47:05,249][BaseTrainer][INFO] - [Epoch 1/10, Iter 86/3560] 3.0075879096984863, cross_entropy: 3.0075879096984863
763
+ [2025-05-05 09:47:06,137][BaseTrainer][INFO] - [Epoch 1/10, Iter 87/3560] 2.9128518104553223, cross_entropy: 2.9128518104553223
764
+ [2025-05-05 09:47:08,479][BaseTrainer][INFO] - [Epoch 1/10, Iter 88/3560] 3.0350518226623535, cross_entropy: 3.0350518226623535
765
+ [2025-05-05 09:47:08,923][BaseTrainer][INFO] - [Epoch 1/10, Iter 89/3560] 2.9499733448028564, cross_entropy: 2.9499733448028564
766
+ [2025-05-05 09:47:12,073][BaseTrainer][INFO] - [Epoch 1/10, Iter 90/3560] 3.022963047027588, cross_entropy: 3.022963047027588
767
+ [2025-05-05 09:47:12,340][BaseTrainer][INFO] - [Epoch 1/10, Iter 91/3560] 2.994563579559326, cross_entropy: 2.994563579559326
768
+ [2025-05-05 09:47:14,549][BaseTrainer][INFO] - [Epoch 1/10, Iter 92/3560] 2.632516622543335, cross_entropy: 2.632516622543335
769
+ [2025-05-05 09:47:15,470][BaseTrainer][INFO] - [Epoch 1/10, Iter 93/3560] 3.2041497230529785, cross_entropy: 3.2041497230529785
770
+ [2025-05-05 09:47:18,072][BaseTrainer][INFO] - [Epoch 1/10, Iter 94/3560] 2.9007208347320557, cross_entropy: 2.9007208347320557
771
+ [2025-05-05 09:47:19,009][BaseTrainer][INFO] - [Epoch 1/10, Iter 95/3560] 3.005298137664795, cross_entropy: 3.005298137664795
772
+ [2025-05-05 09:47:21,461][BaseTrainer][INFO] - [Epoch 1/10, Iter 96/3560] 2.8202903270721436, cross_entropy: 2.8202903270721436
773
+ [2025-05-05 09:47:21,843][BaseTrainer][INFO] - [Epoch 1/10, Iter 97/3560] 3.1353464126586914, cross_entropy: 3.1353464126586914
774
+ [2025-05-05 09:47:24,892][BaseTrainer][INFO] - [Epoch 1/10, Iter 98/3560] 2.6882143020629883, cross_entropy: 2.6882143020629883
775
+ [2025-05-05 09:47:25,255][BaseTrainer][INFO] - [Epoch 1/10, Iter 99/3560] 2.542910099029541, cross_entropy: 2.542910099029541
776
+ [2025-05-05 09:47:28,739][BaseTrainer][INFO] - [Epoch 1/10, Iter 100/3560] 2.6153109073638916, cross_entropy: 2.6153109073638916
777
+ [2025-05-05 09:47:29,029][BaseTrainer][INFO] - [Epoch 1/10, Iter 101/3560] 3.251145362854004, cross_entropy: 3.251145362854004
778
+ [2025-05-05 09:47:31,835][BaseTrainer][INFO] - [Epoch 1/10, Iter 102/3560] 2.772571563720703, cross_entropy: 2.772571563720703
779
+ [2025-05-05 09:47:32,102][BaseTrainer][INFO] - [Epoch 1/10, Iter 103/3560] 3.0203089714050293, cross_entropy: 3.0203089714050293
780
+ [2025-05-05 09:47:35,187][BaseTrainer][INFO] - [Epoch 1/10, Iter 104/3560] 2.638249397277832, cross_entropy: 2.638249397277832
781
+ [2025-05-05 09:47:35,476][BaseTrainer][INFO] - [Epoch 1/10, Iter 105/3560] 2.763543128967285, cross_entropy: 2.763543128967285
782
+ [2025-05-05 09:47:38,050][BaseTrainer][INFO] - [Epoch 1/10, Iter 106/3560] 3.3073742389678955, cross_entropy: 3.3073742389678955
783
+ [2025-05-05 09:47:38,321][BaseTrainer][INFO] - [Epoch 1/10, Iter 107/3560] 2.858224868774414, cross_entropy: 2.858224868774414
784
+ [2025-05-05 09:47:41,506][BaseTrainer][INFO] - [Epoch 1/10, Iter 108/3560] 2.8307814598083496, cross_entropy: 2.8307814598083496
785
+ [2025-05-05 09:47:41,799][BaseTrainer][INFO] - [Epoch 1/10, Iter 109/3560] 2.6661691665649414, cross_entropy: 2.6661691665649414
786
+ [2025-05-05 09:47:44,763][BaseTrainer][INFO] - [Epoch 1/10, Iter 110/3560] 2.993783712387085, cross_entropy: 2.993783712387085
787
+ [2025-05-05 09:47:45,055][BaseTrainer][INFO] - [Epoch 1/10, Iter 111/3560] 2.4647257328033447, cross_entropy: 2.4647257328033447
788
+ [2025-05-05 09:47:48,092][BaseTrainer][INFO] - [Epoch 1/10, Iter 112/3560] 3.0236153602600098, cross_entropy: 3.0236153602600098
789
+ [2025-05-05 09:47:48,383][BaseTrainer][INFO] - [Epoch 1/10, Iter 113/3560] 2.8135550022125244, cross_entropy: 2.8135550022125244
790
+ [2025-05-05 09:47:50,874][BaseTrainer][INFO] - [Epoch 1/10, Iter 114/3560] 3.0279812812805176, cross_entropy: 3.0279812812805176
791
+ [2025-05-05 09:47:51,142][BaseTrainer][INFO] - [Epoch 1/10, Iter 115/3560] 2.353130340576172, cross_entropy: 2.353130340576172
792
+ [2025-05-05 09:47:53,991][BaseTrainer][INFO] - [Epoch 1/10, Iter 116/3560] 2.427293300628662, cross_entropy: 2.427293300628662
793
+ [2025-05-05 09:47:54,287][BaseTrainer][INFO] - [Epoch 1/10, Iter 117/3560] 2.7655386924743652, cross_entropy: 2.7655386924743652
794
+ [2025-05-05 09:47:57,082][BaseTrainer][INFO] - [Epoch 1/10, Iter 118/3560] 2.973236560821533, cross_entropy: 2.973236560821533
795
+ [2025-05-05 09:47:57,375][BaseTrainer][INFO] - [Epoch 1/10, Iter 119/3560] 3.0423779487609863, cross_entropy: 3.0423779487609863
796
+ [2025-05-05 09:48:00,019][BaseTrainer][INFO] - [Epoch 1/10, Iter 120/3560] 2.8837733268737793, cross_entropy: 2.8837733268737793
797
+ [2025-05-05 09:48:00,292][BaseTrainer][INFO] - [Epoch 1/10, Iter 121/3560] 3.0309014320373535, cross_entropy: 3.0309014320373535
798
+ [2025-05-05 09:48:03,200][BaseTrainer][INFO] - [Epoch 1/10, Iter 122/3560] 2.4118125438690186, cross_entropy: 2.4118125438690186
799
+ [2025-05-05 09:48:03,485][BaseTrainer][INFO] - [Epoch 1/10, Iter 123/3560] 2.163151741027832, cross_entropy: 2.163151741027832
800
+ [2025-05-05 09:48:05,812][BaseTrainer][INFO] - [Epoch 1/10, Iter 124/3560] 2.070911169052124, cross_entropy: 2.070911169052124
801
+ [2025-05-05 09:48:06,096][BaseTrainer][INFO] - [Epoch 1/10, Iter 125/3560] 2.8572659492492676, cross_entropy: 2.8572659492492676
802
+ [2025-05-05 09:48:08,406][BaseTrainer][INFO] - [Epoch 1/10, Iter 126/3560] 2.412656307220459, cross_entropy: 2.412656307220459
803
+ [2025-05-05 09:48:08,830][BaseTrainer][INFO] - [Epoch 1/10, Iter 127/3560] 2.7920114994049072, cross_entropy: 2.7920114994049072
804
+ [2025-05-05 09:48:10,967][BaseTrainer][INFO] - [Epoch 1/10, Iter 128/3560] 2.2450432777404785, cross_entropy: 2.2450432777404785
805
+ [2025-05-05 09:48:12,887][BaseTrainer][INFO] - [Epoch 1/10, Iter 129/3560] 2.7679288387298584, cross_entropy: 2.7679288387298584
806
+ [2025-05-05 09:48:13,773][BaseTrainer][INFO] - [Epoch 1/10, Iter 130/3560] 2.589447021484375, cross_entropy: 2.589447021484375
807
+ [2025-05-05 09:48:15,599][BaseTrainer][INFO] - [Epoch 1/10, Iter 131/3560] 2.4124765396118164, cross_entropy: 2.4124765396118164
808
+ [2025-05-05 09:48:16,777][BaseTrainer][INFO] - [Epoch 1/10, Iter 132/3560] 2.4687347412109375, cross_entropy: 2.4687347412109375
809
+ [2025-05-05 09:48:18,770][BaseTrainer][INFO] - [Epoch 1/10, Iter 133/3560] 2.8234610557556152, cross_entropy: 2.8234610557556152
810
+ [2025-05-05 09:48:19,147][BaseTrainer][INFO] - [Epoch 1/10, Iter 134/3560] 2.389613151550293, cross_entropy: 2.389613151550293
811
+ [2025-05-05 09:48:22,089][BaseTrainer][INFO] - [Epoch 1/10, Iter 135/3560] 2.3148984909057617, cross_entropy: 2.3148984909057617
812
+ [2025-05-05 09:48:22,385][BaseTrainer][INFO] - [Epoch 1/10, Iter 136/3560] 2.8420419692993164, cross_entropy: 2.8420419692993164
813
+ [2025-05-05 09:48:24,814][BaseTrainer][INFO] - [Epoch 1/10, Iter 137/3560] 2.29653263092041, cross_entropy: 2.29653263092041
814
+ [2025-05-05 09:48:25,130][BaseTrainer][INFO] - [Epoch 1/10, Iter 138/3560] 2.443634510040283, cross_entropy: 2.443634510040283
815
+ [2025-05-05 09:48:27,786][BaseTrainer][INFO] - [Epoch 1/10, Iter 139/3560] 2.458268642425537, cross_entropy: 2.458268642425537
816
+ [2025-05-05 09:48:28,237][BaseTrainer][INFO] - [Epoch 1/10, Iter 140/3560] 2.3709287643432617, cross_entropy: 2.3709287643432617
817
+ [2025-05-05 09:48:30,507][BaseTrainer][INFO] - [Epoch 1/10, Iter 141/3560] 2.7334578037261963, cross_entropy: 2.7334578037261963
818
+ [2025-05-05 09:48:30,817][BaseTrainer][INFO] - [Epoch 1/10, Iter 142/3560] 2.576626777648926, cross_entropy: 2.576626777648926
819
+ [2025-05-05 09:48:33,333][BaseTrainer][INFO] - [Epoch 1/10, Iter 143/3560] 2.551196336746216, cross_entropy: 2.551196336746216
820
+ [2025-05-05 09:48:33,746][BaseTrainer][INFO] - [Epoch 1/10, Iter 144/3560] 2.6161606311798096, cross_entropy: 2.6161606311798096
821
+ [2025-05-05 09:48:36,572][BaseTrainer][INFO] - [Epoch 1/10, Iter 145/3560] 2.496171712875366, cross_entropy: 2.496171712875366
822
+ [2025-05-05 09:48:36,882][BaseTrainer][INFO] - [Epoch 1/10, Iter 146/3560] 2.6214725971221924, cross_entropy: 2.6214725971221924
823
+ [2025-05-05 09:48:39,559][BaseTrainer][INFO] - [Epoch 1/10, Iter 147/3560] 2.706058979034424, cross_entropy: 2.706058979034424
824
+ [2025-05-05 09:48:39,895][BaseTrainer][INFO] - [Epoch 1/10, Iter 148/3560] 2.4072933197021484, cross_entropy: 2.4072933197021484
825
+ [2025-05-05 09:48:43,163][BaseTrainer][INFO] - [Epoch 1/10, Iter 149/3560] 2.228184700012207, cross_entropy: 2.228184700012207
826
+ [2025-05-05 09:48:43,439][BaseTrainer][INFO] - [Epoch 1/10, Iter 150/3560] 2.6638638973236084, cross_entropy: 2.6638638973236084
827
+ [2025-05-05 09:48:45,896][BaseTrainer][INFO] - [Epoch 1/10, Iter 151/3560] 2.2962441444396973, cross_entropy: 2.2962441444396973
828
+ [2025-05-05 09:48:46,172][BaseTrainer][INFO] - [Epoch 1/10, Iter 152/3560] 2.375032901763916, cross_entropy: 2.375032901763916
829
+ [2025-05-05 09:48:49,025][BaseTrainer][INFO] - [Epoch 1/10, Iter 153/3560] 2.8031396865844727, cross_entropy: 2.8031396865844727
830
+ [2025-05-05 09:48:49,292][BaseTrainer][INFO] - [Epoch 1/10, Iter 154/3560] 2.2409143447875977, cross_entropy: 2.2409143447875977
831
+ [2025-05-05 09:48:52,033][BaseTrainer][INFO] - [Epoch 1/10, Iter 155/3560] 2.580317497253418, cross_entropy: 2.580317497253418
832
+ [2025-05-05 09:48:52,323][BaseTrainer][INFO] - [Epoch 1/10, Iter 156/3560] 2.499070882797241, cross_entropy: 2.499070882797241
833
+ [2025-05-05 09:48:55,341][BaseTrainer][INFO] - [Epoch 1/10, Iter 157/3560] 2.282062530517578, cross_entropy: 2.282062530517578
834
+ [2025-05-05 09:48:55,607][BaseTrainer][INFO] - [Epoch 1/10, Iter 158/3560] 2.359468936920166, cross_entropy: 2.359468936920166
835
+ [2025-05-05 09:48:59,018][BaseTrainer][INFO] - [Epoch 1/10, Iter 159/3560] 2.647857427597046, cross_entropy: 2.647857427597046
836
+ [2025-05-05 09:48:59,303][BaseTrainer][INFO] - [Epoch 1/10, Iter 160/3560] 2.8444228172302246, cross_entropy: 2.8444228172302246
837
+ [2025-05-05 09:49:01,927][BaseTrainer][INFO] - [Epoch 1/10, Iter 161/3560] 2.19181489944458, cross_entropy: 2.19181489944458
838
+ [2025-05-05 09:49:02,210][BaseTrainer][INFO] - [Epoch 1/10, Iter 162/3560] 2.5929222106933594, cross_entropy: 2.5929222106933594
839
+ [2025-05-05 09:49:04,993][BaseTrainer][INFO] - [Epoch 1/10, Iter 163/3560] 2.351461410522461, cross_entropy: 2.351461410522461
840
+ [2025-05-05 09:49:05,265][BaseTrainer][INFO] - [Epoch 1/10, Iter 164/3560] 2.3750619888305664, cross_entropy: 2.3750619888305664
841
+ [2025-05-05 09:49:08,198][BaseTrainer][INFO] - [Epoch 1/10, Iter 165/3560] 2.4669322967529297, cross_entropy: 2.4669322967529297
842
+ [2025-05-05 09:49:08,484][BaseTrainer][INFO] - [Epoch 1/10, Iter 166/3560] 2.9245047569274902, cross_entropy: 2.9245047569274902
843
+ [2025-05-05 09:49:11,194][BaseTrainer][INFO] - [Epoch 1/10, Iter 167/3560] 2.381864309310913, cross_entropy: 2.381864309310913
844
+ [2025-05-05 09:49:11,483][BaseTrainer][INFO] - [Epoch 1/10, Iter 168/3560] 2.5745413303375244, cross_entropy: 2.5745413303375244
845
+ [2025-05-05 09:49:14,174][BaseTrainer][INFO] - [Epoch 1/10, Iter 169/3560] 2.686976671218872, cross_entropy: 2.686976671218872
846
+ [2025-05-05 09:49:14,458][BaseTrainer][INFO] - [Epoch 1/10, Iter 170/3560] 2.435781955718994, cross_entropy: 2.435781955718994
847
+ [2025-05-05 09:49:17,471][BaseTrainer][INFO] - [Epoch 1/10, Iter 171/3560] 2.5392961502075195, cross_entropy: 2.5392961502075195
848
+ [2025-05-05 09:49:17,756][BaseTrainer][INFO] - [Epoch 1/10, Iter 172/3560] 2.492906332015991, cross_entropy: 2.492906332015991
849
+ [2025-05-05 09:49:21,164][BaseTrainer][INFO] - [Epoch 1/10, Iter 173/3560] 2.358499765396118, cross_entropy: 2.358499765396118
850
+ [2025-05-05 09:49:21,460][BaseTrainer][INFO] - [Epoch 1/10, Iter 174/3560] 2.6670870780944824, cross_entropy: 2.6670870780944824
851
+ [2025-05-05 09:49:23,993][BaseTrainer][INFO] - [Epoch 1/10, Iter 175/3560] 2.619516372680664, cross_entropy: 2.619516372680664
852
+ [2025-05-05 09:49:24,287][BaseTrainer][INFO] - [Epoch 1/10, Iter 176/3560] 2.337444305419922, cross_entropy: 2.337444305419922
853
+ [2025-05-05 09:49:27,098][BaseTrainer][INFO] - [Epoch 1/10, Iter 177/3560] 2.225595712661743, cross_entropy: 2.225595712661743
854
+ [2025-05-05 09:49:27,364][BaseTrainer][INFO] - [Epoch 1/10, Iter 178/3560] 2.194519519805908, cross_entropy: 2.194519519805908
855
+ [2025-05-05 09:49:30,533][BaseTrainer][INFO] - [Epoch 1/10, Iter 179/3560] 2.2509477138519287, cross_entropy: 2.2509477138519287
856
+ [2025-05-05 09:49:30,821][BaseTrainer][INFO] - [Epoch 1/10, Iter 180/3560] 2.0769686698913574, cross_entropy: 2.0769686698913574
857
+ [2025-05-05 09:49:33,517][BaseTrainer][INFO] - [Epoch 1/10, Iter 181/3560] 2.1994895935058594, cross_entropy: 2.1994895935058594
858
+ [2025-05-05 09:49:33,809][BaseTrainer][INFO] - [Epoch 1/10, Iter 182/3560] 1.9446724653244019, cross_entropy: 1.9446724653244019
859
+ [2025-05-05 09:49:37,015][BaseTrainer][INFO] - [Epoch 1/10, Iter 183/3560] 2.339994192123413, cross_entropy: 2.339994192123413
860
+ [2025-05-05 09:49:37,282][BaseTrainer][INFO] - [Epoch 1/10, Iter 184/3560] 2.393031597137451, cross_entropy: 2.393031597137451
861
+ [2025-05-05 09:49:40,335][BaseTrainer][INFO] - [Epoch 1/10, Iter 185/3560] 2.134624719619751, cross_entropy: 2.134624719619751
862
+ [2025-05-05 09:49:40,623][BaseTrainer][INFO] - [Epoch 1/10, Iter 186/3560] 2.350126266479492, cross_entropy: 2.350126266479492
863
+ [2025-05-05 09:49:43,184][BaseTrainer][INFO] - [Epoch 1/10, Iter 187/3560] 2.6669764518737793, cross_entropy: 2.6669764518737793
864
+ [2025-05-05 09:49:43,459][BaseTrainer][INFO] - [Epoch 1/10, Iter 188/3560] 2.644606113433838, cross_entropy: 2.644606113433838
865
+ [2025-05-05 09:49:46,203][BaseTrainer][INFO] - [Epoch 1/10, Iter 189/3560] 1.9613032341003418, cross_entropy: 1.9613032341003418
866
+ [2025-05-05 09:49:46,469][BaseTrainer][INFO] - [Epoch 1/10, Iter 190/3560] 2.8871889114379883, cross_entropy: 2.8871889114379883
867
+ [2025-05-05 09:49:49,177][BaseTrainer][INFO] - [Epoch 1/10, Iter 191/3560] 1.9578357934951782, cross_entropy: 1.9578357934951782
868
+ [2025-05-05 09:49:49,459][BaseTrainer][INFO] - [Epoch 1/10, Iter 192/3560] 2.2871298789978027, cross_entropy: 2.2871298789978027
869
+ [2025-05-05 09:49:52,294][BaseTrainer][INFO] - [Epoch 1/10, Iter 193/3560] 2.071976900100708, cross_entropy: 2.071976900100708
870
+ [2025-05-05 09:49:52,582][BaseTrainer][INFO] - [Epoch 1/10, Iter 194/3560] 2.0094850063323975, cross_entropy: 2.0094850063323975
871
+ [2025-05-05 09:49:56,010][BaseTrainer][INFO] - [Epoch 1/10, Iter 195/3560] 2.0015220642089844, cross_entropy: 2.0015220642089844
872
+ [2025-05-05 09:49:56,305][BaseTrainer][INFO] - [Epoch 1/10, Iter 196/3560] 2.7452821731567383, cross_entropy: 2.7452821731567383
873
+ [2025-05-05 09:49:59,103][BaseTrainer][INFO] - [Epoch 1/10, Iter 197/3560] 2.210491180419922, cross_entropy: 2.210491180419922
874
+ [2025-05-05 09:49:59,379][BaseTrainer][INFO] - [Epoch 1/10, Iter 198/3560] 2.0572052001953125, cross_entropy: 2.0572052001953125
875
+ [2025-05-05 09:50:01,917][BaseTrainer][INFO] - [Epoch 1/10, Iter 199/3560] 1.8931865692138672, cross_entropy: 1.8931865692138672
876
+ [2025-05-05 09:50:02,205][BaseTrainer][INFO] - [Epoch 1/10, Iter 200/3560] 2.6257567405700684, cross_entropy: 2.6257567405700684
877
+ [2025-05-05 09:50:04,562][BaseTrainer][INFO] - [Epoch 1/10, Iter 201/3560] 2.301325798034668, cross_entropy: 2.301325798034668
878
+ [2025-05-05 09:50:04,831][BaseTrainer][INFO] - [Epoch 1/10, Iter 202/3560] 2.1421902179718018, cross_entropy: 2.1421902179718018
879
+ [2025-05-05 09:50:07,622][BaseTrainer][INFO] - [Epoch 1/10, Iter 203/3560] 2.3199124336242676, cross_entropy: 2.3199124336242676
880
+ [2025-05-05 09:50:07,891][BaseTrainer][INFO] - [Epoch 1/10, Iter 204/3560] 1.9478565454483032, cross_entropy: 1.9478565454483032
881
+ [2025-05-05 09:50:10,575][BaseTrainer][INFO] - [Epoch 1/10, Iter 205/3560] 2.031534194946289, cross_entropy: 2.031534194946289
882
+ [2025-05-05 09:50:10,839][BaseTrainer][INFO] - [Epoch 1/10, Iter 206/3560] 1.5908913612365723, cross_entropy: 1.5908913612365723
883
+ [2025-05-05 09:50:13,516][BaseTrainer][INFO] - [Epoch 1/10, Iter 207/3560] 1.9361119270324707, cross_entropy: 1.9361119270324707
884
+ [2025-05-05 09:50:13,801][BaseTrainer][INFO] - [Epoch 1/10, Iter 208/3560] 1.8822338581085205, cross_entropy: 1.8822338581085205
885
+ [2025-05-05 09:50:17,060][BaseTrainer][INFO] - [Epoch 1/10, Iter 209/3560] 1.9554469585418701, cross_entropy: 1.9554469585418701
886
+ [2025-05-05 09:50:17,342][BaseTrainer][INFO] - [Epoch 1/10, Iter 210/3560] 2.131263017654419, cross_entropy: 2.131263017654419
887
+ [2025-05-05 09:50:19,577][BaseTrainer][INFO] - [Epoch 1/10, Iter 211/3560] 2.7405385971069336, cross_entropy: 2.7405385971069336
888
+ [2025-05-05 09:50:19,866][BaseTrainer][INFO] - [Epoch 1/10, Iter 212/3560] 2.1455931663513184, cross_entropy: 2.1455931663513184
889
+ [2025-05-05 09:50:22,243][BaseTrainer][INFO] - [Epoch 1/10, Iter 213/3560] 2.1557955741882324, cross_entropy: 2.1557955741882324
890
+ [2025-05-05 09:50:22,509][BaseTrainer][INFO] - [Epoch 1/10, Iter 214/3560] 1.8553715944290161, cross_entropy: 1.8553715944290161
891
+ [2025-05-05 09:50:25,417][BaseTrainer][INFO] - [Epoch 1/10, Iter 215/3560] 2.2070069313049316, cross_entropy: 2.2070069313049316
892
+ [2025-05-05 09:50:25,709][BaseTrainer][INFO] - [Epoch 1/10, Iter 216/3560] 1.602014183998108, cross_entropy: 1.602014183998108
893
+ [2025-05-05 09:50:28,697][BaseTrainer][INFO] - [Epoch 1/10, Iter 217/3560] 1.8780882358551025, cross_entropy: 1.8780882358551025
894
+ [2025-05-05 09:50:28,983][BaseTrainer][INFO] - [Epoch 1/10, Iter 218/3560] 1.7485675811767578, cross_entropy: 1.7485675811767578
895
+ [2025-05-05 09:50:31,529][BaseTrainer][INFO] - [Epoch 1/10, Iter 219/3560] 2.1740505695343018, cross_entropy: 2.1740505695343018
896
+ [2025-05-05 09:50:31,797][BaseTrainer][INFO] - [Epoch 1/10, Iter 220/3560] 2.364553689956665, cross_entropy: 2.364553689956665
897
+ [2025-05-05 09:50:34,637][BaseTrainer][INFO] - [Epoch 1/10, Iter 221/3560] 2.018742084503174, cross_entropy: 2.018742084503174
898
+ [2025-05-05 09:50:34,930][BaseTrainer][INFO] - [Epoch 1/10, Iter 222/3560] 2.467301845550537, cross_entropy: 2.467301845550537
899
+ [2025-05-05 09:50:37,663][BaseTrainer][INFO] - [Epoch 1/10, Iter 223/3560] 2.0563735961914062, cross_entropy: 2.0563735961914062
900
+ [2025-05-05 09:50:37,948][BaseTrainer][INFO] - [Epoch 1/10, Iter 224/3560] 2.325286865234375, cross_entropy: 2.325286865234375
901
+ [2025-05-05 09:50:40,408][BaseTrainer][INFO] - [Epoch 1/10, Iter 225/3560] 1.734659194946289, cross_entropy: 1.734659194946289
902
+ [2025-05-05 09:50:40,682][BaseTrainer][INFO] - [Epoch 1/10, Iter 226/3560] 2.2344789505004883, cross_entropy: 2.2344789505004883
903
+ [2025-05-05 09:50:43,195][BaseTrainer][INFO] - [Epoch 1/10, Iter 227/3560] 2.372894287109375, cross_entropy: 2.372894287109375
904
+ [2025-05-05 09:50:43,481][BaseTrainer][INFO] - [Epoch 1/10, Iter 228/3560] 1.9628961086273193, cross_entropy: 1.9628961086273193
905
+ [2025-05-05 09:50:46,459][BaseTrainer][INFO] - [Epoch 1/10, Iter 229/3560] 1.878267526626587, cross_entropy: 1.878267526626587
906
+ [2025-05-05 09:50:46,726][BaseTrainer][INFO] - [Epoch 1/10, Iter 230/3560] 1.7528660297393799, cross_entropy: 1.7528660297393799
907
+ [2025-05-05 09:50:49,875][BaseTrainer][INFO] - [Epoch 1/10, Iter 231/3560] 1.9477362632751465, cross_entropy: 1.9477362632751465
908
+ [2025-05-05 09:50:50,144][BaseTrainer][INFO] - [Epoch 1/10, Iter 232/3560] 2.0673203468322754, cross_entropy: 2.0673203468322754
909
+ [2025-05-05 09:50:53,003][BaseTrainer][INFO] - [Epoch 1/10, Iter 233/3560] 2.2647716999053955, cross_entropy: 2.2647716999053955
910
+ [2025-05-05 09:50:53,270][BaseTrainer][INFO] - [Epoch 1/10, Iter 234/3560] 1.9522674083709717, cross_entropy: 1.9522674083709717
911
+ [2025-05-05 09:50:55,882][BaseTrainer][INFO] - [Epoch 1/10, Iter 235/3560] 2.08988356590271, cross_entropy: 2.08988356590271
912
+ [2025-05-05 09:50:56,148][BaseTrainer][INFO] - [Epoch 1/10, Iter 236/3560] 2.33000111579895, cross_entropy: 2.33000111579895
913
+ [2025-05-05 09:50:59,089][BaseTrainer][INFO] - [Epoch 1/10, Iter 237/3560] 1.4400973320007324, cross_entropy: 1.4400973320007324
914
+ [2025-05-05 09:50:59,378][BaseTrainer][INFO] - [Epoch 1/10, Iter 238/3560] 2.1513843536376953, cross_entropy: 2.1513843536376953
915
+ [2025-05-05 09:51:02,963][BaseTrainer][INFO] - [Epoch 1/10, Iter 239/3560] 1.8079352378845215, cross_entropy: 1.8079352378845215
916
+ [2025-05-05 09:51:03,263][BaseTrainer][INFO] - [Epoch 1/10, Iter 240/3560] 2.222867965698242, cross_entropy: 2.222867965698242
917
+ [2025-05-05 09:51:06,263][BaseTrainer][INFO] - [Epoch 1/10, Iter 241/3560] 2.185883045196533, cross_entropy: 2.185883045196533
918
+ [2025-05-05 09:51:06,534][BaseTrainer][INFO] - [Epoch 1/10, Iter 242/3560] 2.454998731613159, cross_entropy: 2.454998731613159
919
+ [2025-05-05 09:51:09,033][BaseTrainer][INFO] - [Epoch 1/10, Iter 243/3560] 1.819913387298584, cross_entropy: 1.819913387298584
920
+ [2025-05-05 09:51:09,318][BaseTrainer][INFO] - [Epoch 1/10, Iter 244/3560] 1.996874213218689, cross_entropy: 1.996874213218689
921
+ [2025-05-05 09:51:12,034][BaseTrainer][INFO] - [Epoch 1/10, Iter 245/3560] 2.1636722087860107, cross_entropy: 2.1636722087860107
922
+ [2025-05-05 09:51:12,323][BaseTrainer][INFO] - [Epoch 1/10, Iter 246/3560] 2.109215497970581, cross_entropy: 2.109215497970581
923
+ [2025-05-05 09:51:15,644][BaseTrainer][INFO] - [Epoch 1/10, Iter 247/3560] 2.327542781829834, cross_entropy: 2.327542781829834
924
+ [2025-05-05 09:51:15,910][BaseTrainer][INFO] - [Epoch 1/10, Iter 248/3560] 2.3415567874908447, cross_entropy: 2.3415567874908447
925
+ [2025-05-05 09:51:17,860][BaseTrainer][INFO] - [Epoch 1/10, Iter 249/3560] 1.991217017173767, cross_entropy: 1.991217017173767
926
+ [2025-05-05 09:51:18,131][BaseTrainer][INFO] - [Epoch 1/10, Iter 250/3560] 1.9958906173706055, cross_entropy: 1.9958906173706055
927
+ [2025-05-05 09:51:20,621][BaseTrainer][INFO] - [Epoch 1/10, Iter 251/3560] 1.9864914417266846, cross_entropy: 1.9864914417266846
928
+ [2025-05-05 09:51:20,914][BaseTrainer][INFO] - [Epoch 1/10, Iter 252/3560] 2.311190128326416, cross_entropy: 2.311190128326416
929
+ [2025-05-05 09:51:23,474][BaseTrainer][INFO] - [Epoch 1/10, Iter 253/3560] 2.269057273864746, cross_entropy: 2.269057273864746
930
+ [2025-05-05 09:51:23,762][BaseTrainer][INFO] - [Epoch 1/10, Iter 254/3560] 1.3703190088272095, cross_entropy: 1.3703190088272095
931
+ [2025-05-05 09:51:27,148][BaseTrainer][INFO] - [Epoch 1/10, Iter 255/3560] 2.0604770183563232, cross_entropy: 2.0604770183563232
932
+ [2025-05-05 09:51:27,414][BaseTrainer][INFO] - [Epoch 1/10, Iter 256/3560] 2.0235233306884766, cross_entropy: 2.0235233306884766
933
+ [2025-05-05 09:51:30,871][BaseTrainer][INFO] - [Epoch 1/10, Iter 257/3560] 2.533194065093994, cross_entropy: 2.533194065093994
934
+ [2025-05-05 09:51:31,159][BaseTrainer][INFO] - [Epoch 1/10, Iter 258/3560] 1.829576015472412, cross_entropy: 1.829576015472412
935
+ [2025-05-05 09:51:34,578][BaseTrainer][INFO] - [Epoch 1/10, Iter 259/3560] 2.383409261703491, cross_entropy: 2.383409261703491
936
+ [2025-05-05 09:51:34,871][BaseTrainer][INFO] - [Epoch 1/10, Iter 260/3560] 1.7044014930725098, cross_entropy: 1.7044014930725098
937
+ [2025-05-05 09:51:37,419][BaseTrainer][INFO] - [Epoch 1/10, Iter 261/3560] 1.7177097797393799, cross_entropy: 1.7177097797393799
938
+ [2025-05-05 09:51:37,709][BaseTrainer][INFO] - [Epoch 1/10, Iter 262/3560] 2.376150131225586, cross_entropy: 2.376150131225586
939
+ [2025-05-05 09:51:40,855][BaseTrainer][INFO] - [Epoch 1/10, Iter 263/3560] 2.3659605979919434, cross_entropy: 2.3659605979919434
940
+ [2025-05-05 09:51:41,120][BaseTrainer][INFO] - [Epoch 1/10, Iter 264/3560] 2.284888505935669, cross_entropy: 2.284888505935669
941
+ [2025-05-05 09:51:44,668][BaseTrainer][INFO] - [Epoch 1/10, Iter 265/3560] 1.8810943365097046, cross_entropy: 1.8810943365097046
942
+ [2025-05-05 09:51:44,935][BaseTrainer][INFO] - [Epoch 1/10, Iter 266/3560] 1.9273468255996704, cross_entropy: 1.9273468255996704
943
+ [2025-05-05 09:51:47,541][BaseTrainer][INFO] - [Epoch 1/10, Iter 267/3560] 2.427088975906372, cross_entropy: 2.427088975906372
944
+ [2025-05-05 09:51:47,807][BaseTrainer][INFO] - [Epoch 1/10, Iter 268/3560] 2.3629322052001953, cross_entropy: 2.3629322052001953
945
+ [2025-05-05 09:51:50,532][BaseTrainer][INFO] - [Epoch 1/10, Iter 269/3560] 2.0712502002716064, cross_entropy: 2.0712502002716064
946
+ [2025-05-05 09:51:50,820][BaseTrainer][INFO] - [Epoch 1/10, Iter 270/3560] 1.8640518188476562, cross_entropy: 1.8640518188476562
947
+ [2025-05-05 09:51:53,070][BaseTrainer][INFO] - [Epoch 1/10, Iter 271/3560] 2.3092873096466064, cross_entropy: 2.3092873096466064
948
+ [2025-05-05 09:51:53,352][BaseTrainer][INFO] - [Epoch 1/10, Iter 272/3560] 2.160331964492798, cross_entropy: 2.160331964492798
949
+ [2025-05-05 09:51:56,767][BaseTrainer][INFO] - [Epoch 1/10, Iter 273/3560] 2.1224117279052734, cross_entropy: 2.1224117279052734
950
+ [2025-05-05 09:51:57,033][BaseTrainer][INFO] - [Epoch 1/10, Iter 274/3560] 2.066070079803467, cross_entropy: 2.066070079803467
951
+ [2025-05-05 09:51:59,740][BaseTrainer][INFO] - [Epoch 1/10, Iter 275/3560] 1.9042013883590698, cross_entropy: 1.9042013883590698
952
+ [2025-05-05 09:52:00,024][BaseTrainer][INFO] - [Epoch 1/10, Iter 276/3560] 1.7828054428100586, cross_entropy: 1.7828054428100586
953
+ [2025-05-05 09:52:03,327][BaseTrainer][INFO] - [Epoch 1/10, Iter 277/3560] 2.3629775047302246, cross_entropy: 2.3629775047302246
954
+ [2025-05-05 09:52:03,622][BaseTrainer][INFO] - [Epoch 1/10, Iter 278/3560] 1.797624111175537, cross_entropy: 1.797624111175537
955
+ [2025-05-05 09:52:06,251][BaseTrainer][INFO] - [Epoch 1/10, Iter 279/3560] 2.3308746814727783, cross_entropy: 2.3308746814727783
956
+ [2025-05-05 09:52:06,557][BaseTrainer][INFO] - [Epoch 1/10, Iter 280/3560] 2.8353261947631836, cross_entropy: 2.8353261947631836
957
+ [2025-05-05 09:52:09,145][BaseTrainer][INFO] - [Epoch 1/10, Iter 281/3560] 2.230394124984741, cross_entropy: 2.230394124984741
958
+ [2025-05-05 09:52:09,428][BaseTrainer][INFO] - [Epoch 1/10, Iter 282/3560] 1.965634822845459, cross_entropy: 1.965634822845459
959
+ [2025-05-05 09:52:12,157][BaseTrainer][INFO] - [Epoch 1/10, Iter 283/3560] 2.075137138366699, cross_entropy: 2.075137138366699
960
+ [2025-05-05 09:52:12,437][BaseTrainer][INFO] - [Epoch 1/10, Iter 284/3560] 1.6591405868530273, cross_entropy: 1.6591405868530273
961
+ [2025-05-05 09:52:14,673][BaseTrainer][INFO] - [Epoch 1/10, Iter 285/3560] 1.6709420680999756, cross_entropy: 1.6709420680999756
962
+ [2025-05-05 09:52:14,940][BaseTrainer][INFO] - [Epoch 1/10, Iter 286/3560] 2.1470398902893066, cross_entropy: 2.1470398902893066
963
+ [2025-05-05 09:52:17,686][BaseTrainer][INFO] - [Epoch 1/10, Iter 287/3560] 1.849224328994751, cross_entropy: 1.849224328994751
964
+ [2025-05-05 09:52:17,965][BaseTrainer][INFO] - [Epoch 1/10, Iter 288/3560] 2.1824347972869873, cross_entropy: 2.1824347972869873
965
+ [2025-05-05 09:52:20,617][BaseTrainer][INFO] - [Epoch 1/10, Iter 289/3560] 1.8390185832977295, cross_entropy: 1.8390185832977295
966
+ [2025-05-05 09:52:20,888][BaseTrainer][INFO] - [Epoch 1/10, Iter 290/3560] 1.895951747894287, cross_entropy: 1.895951747894287
967
+ [2025-05-05 09:52:23,391][BaseTrainer][INFO] - [Epoch 1/10, Iter 291/3560] 2.008394479751587, cross_entropy: 2.008394479751587
968
+ [2025-05-05 09:52:23,677][BaseTrainer][INFO] - [Epoch 1/10, Iter 292/3560] 1.8846654891967773, cross_entropy: 1.8846654891967773
969
+ [2025-05-05 09:52:26,314][BaseTrainer][INFO] - [Epoch 1/10, Iter 293/3560] 1.9871348142623901, cross_entropy: 1.9871348142623901
970
+ [2025-05-05 09:52:26,596][BaseTrainer][INFO] - [Epoch 1/10, Iter 294/3560] 1.9298832416534424, cross_entropy: 1.9298832416534424
971
+ [2025-05-05 09:52:29,507][BaseTrainer][INFO] - [Epoch 1/10, Iter 295/3560] 2.6350250244140625, cross_entropy: 2.6350250244140625
972
+ [2025-05-05 09:52:29,793][BaseTrainer][INFO] - [Epoch 1/10, Iter 296/3560] 2.036161422729492, cross_entropy: 2.036161422729492
973
+ [2025-05-05 09:52:32,496][BaseTrainer][INFO] - [Epoch 1/10, Iter 297/3560] 1.7561547756195068, cross_entropy: 1.7561547756195068
974
+ [2025-05-05 09:52:32,782][BaseTrainer][INFO] - [Epoch 1/10, Iter 298/3560] 1.9822800159454346, cross_entropy: 1.9822800159454346
975
+ [2025-05-05 09:52:35,971][BaseTrainer][INFO] - [Epoch 1/10, Iter 299/3560] 2.130452871322632, cross_entropy: 2.130452871322632
976
+ [2025-05-05 09:52:36,263][BaseTrainer][INFO] - [Epoch 1/10, Iter 300/3560] 1.7963216304779053, cross_entropy: 1.7963216304779053
977
+ [2025-05-05 09:52:39,616][BaseTrainer][INFO] - [Epoch 1/10, Iter 301/3560] 2.1473166942596436, cross_entropy: 2.1473166942596436
978
+ [2025-05-05 09:52:39,903][BaseTrainer][INFO] - [Epoch 1/10, Iter 302/3560] 2.4878604412078857, cross_entropy: 2.4878604412078857
979
+ [2025-05-05 09:52:42,247][BaseTrainer][INFO] - [Epoch 1/10, Iter 303/3560] 2.1226227283477783, cross_entropy: 2.1226227283477783
980
+ [2025-05-05 09:52:42,514][BaseTrainer][INFO] - [Epoch 1/10, Iter 304/3560] 1.913116216659546, cross_entropy: 1.913116216659546
981
+ [2025-05-05 09:52:45,665][BaseTrainer][INFO] - [Epoch 1/10, Iter 305/3560] 1.6049457788467407, cross_entropy: 1.6049457788467407
982
+ [2025-05-05 09:52:45,932][BaseTrainer][INFO] - [Epoch 1/10, Iter 306/3560] 2.292205810546875, cross_entropy: 2.292205810546875
983
+ [2025-05-05 09:52:48,253][BaseTrainer][INFO] - [Epoch 1/10, Iter 307/3560] 1.8730549812316895, cross_entropy: 1.8730549812316895
984
+ [2025-05-05 09:52:48,547][BaseTrainer][INFO] - [Epoch 1/10, Iter 308/3560] 1.8482187986373901, cross_entropy: 1.8482187986373901
985
+ [2025-05-05 09:52:51,038][BaseTrainer][INFO] - [Epoch 1/10, Iter 309/3560] 2.1439313888549805, cross_entropy: 2.1439313888549805
986
+ [2025-05-05 09:52:51,314][BaseTrainer][INFO] - [Epoch 1/10, Iter 310/3560] 1.626539945602417, cross_entropy: 1.626539945602417
987
+ [2025-05-05 09:52:53,846][BaseTrainer][INFO] - [Epoch 1/10, Iter 311/3560] 1.5768496990203857, cross_entropy: 1.5768496990203857
988
+ [2025-05-05 09:52:54,122][BaseTrainer][INFO] - [Epoch 1/10, Iter 312/3560] 1.9663681983947754, cross_entropy: 1.9663681983947754
989
+ [2025-05-05 09:52:56,966][BaseTrainer][INFO] - [Epoch 1/10, Iter 313/3560] 2.328012466430664, cross_entropy: 2.328012466430664
990
+ [2025-05-05 09:52:57,253][BaseTrainer][INFO] - [Epoch 1/10, Iter 314/3560] 1.5617284774780273, cross_entropy: 1.5617284774780273
991
+ [2025-05-05 09:53:00,630][BaseTrainer][INFO] - [Epoch 1/10, Iter 315/3560] 2.2280170917510986, cross_entropy: 2.2280170917510986
992
+ [2025-05-05 09:53:00,909][BaseTrainer][INFO] - [Epoch 1/10, Iter 316/3560] 1.9406999349594116, cross_entropy: 1.9406999349594116
993
+ [2025-05-05 09:53:03,922][BaseTrainer][INFO] - [Epoch 1/10, Iter 317/3560] 2.3057799339294434, cross_entropy: 2.3057799339294434
994
+ [2025-05-05 09:53:04,216][BaseTrainer][INFO] - [Epoch 1/10, Iter 318/3560] 1.965660572052002, cross_entropy: 1.965660572052002
995
+ [2025-05-05 09:53:07,605][BaseTrainer][INFO] - [Epoch 1/10, Iter 319/3560] 1.5760626792907715, cross_entropy: 1.5760626792907715
996
+ [2025-05-05 09:53:07,890][BaseTrainer][INFO] - [Epoch 1/10, Iter 320/3560] 1.740902066230774, cross_entropy: 1.740902066230774
997
+ [2025-05-05 09:53:10,790][BaseTrainer][INFO] - [Epoch 1/10, Iter 321/3560] 1.7732293605804443, cross_entropy: 1.7732293605804443
998
+ [2025-05-05 09:53:11,065][BaseTrainer][INFO] - [Epoch 1/10, Iter 322/3560] 1.9820091724395752, cross_entropy: 1.9820091724395752
999
+ [2025-05-05 09:53:13,669][BaseTrainer][INFO] - [Epoch 1/10, Iter 323/3560] 1.4971777200698853, cross_entropy: 1.4971777200698853
1000
+ [2025-05-05 09:53:13,935][BaseTrainer][INFO] - [Epoch 1/10, Iter 324/3560] 1.635733962059021, cross_entropy: 1.635733962059021
1001
+ [2025-05-05 09:53:16,925][BaseTrainer][INFO] - [Epoch 1/10, Iter 325/3560] 1.9015440940856934, cross_entropy: 1.9015440940856934
1002
+ [2025-05-05 09:53:17,192][BaseTrainer][INFO] - [Epoch 1/10, Iter 326/3560] 1.7800923585891724, cross_entropy: 1.7800923585891724
1003
+ [2025-05-05 09:53:19,942][BaseTrainer][INFO] - [Epoch 1/10, Iter 327/3560] 2.3088884353637695, cross_entropy: 2.3088884353637695
1004
+ [2025-05-05 09:53:20,213][BaseTrainer][INFO] - [Epoch 1/10, Iter 328/3560] 1.5234527587890625, cross_entropy: 1.5234527587890625
1005
+ [2025-05-05 09:53:22,489][BaseTrainer][INFO] - [Epoch 1/10, Iter 329/3560] 1.6320879459381104, cross_entropy: 1.6320879459381104
1006
+ [2025-05-05 09:53:22,776][BaseTrainer][INFO] - [Epoch 1/10, Iter 330/3560] 1.793968915939331, cross_entropy: 1.793968915939331
1007
+ [2025-05-05 09:53:25,352][BaseTrainer][INFO] - [Epoch 1/10, Iter 331/3560] 1.8218343257904053, cross_entropy: 1.8218343257904053
1008
+ [2025-05-05 09:53:25,619][BaseTrainer][INFO] - [Epoch 1/10, Iter 332/3560] 1.9146698713302612, cross_entropy: 1.9146698713302612
1009
+ [2025-05-05 09:53:28,632][BaseTrainer][INFO] - [Epoch 1/10, Iter 333/3560] 1.609933853149414, cross_entropy: 1.609933853149414
1010
+ [2025-05-05 09:53:28,898][BaseTrainer][INFO] - [Epoch 1/10, Iter 334/3560] 2.052873373031616, cross_entropy: 2.052873373031616
1011
+ [2025-05-05 09:53:32,037][BaseTrainer][INFO] - [Epoch 1/10, Iter 335/3560] 1.7238523960113525, cross_entropy: 1.7238523960113525
1012
+ [2025-05-05 09:53:32,330][BaseTrainer][INFO] - [Epoch 1/10, Iter 336/3560] 1.653236985206604, cross_entropy: 1.653236985206604
1013
+ [2025-05-05 09:53:34,866][BaseTrainer][INFO] - [Epoch 1/10, Iter 337/3560] 1.7774677276611328, cross_entropy: 1.7774677276611328
1014
+ [2025-05-05 09:53:35,143][BaseTrainer][INFO] - [Epoch 1/10, Iter 338/3560] 1.9091377258300781, cross_entropy: 1.9091377258300781
1015
+ [2025-05-05 09:53:37,667][BaseTrainer][INFO] - [Epoch 1/10, Iter 339/3560] 1.8947666883468628, cross_entropy: 1.8947666883468628
+ [2025-05-05 09:53:37,971][BaseTrainer][INFO] - [Epoch 1/10, Iter 340/3560] 1.9489954710006714, cross_entropy: 1.9489954710006714
+ [2025-05-05 09:53:40,947][BaseTrainer][INFO] - [Epoch 1/10, Iter 341/3560] 2.2012619972229004, cross_entropy: 2.2012619972229004
+ [2025-05-05 09:53:41,232][BaseTrainer][INFO] - [Epoch 1/10, Iter 342/3560] 1.7727586030960083, cross_entropy: 1.7727586030960083
+ [2025-05-05 09:53:44,226][BaseTrainer][INFO] - [Epoch 1/10, Iter 343/3560] 1.9406934976577759, cross_entropy: 1.9406934976577759
+ [2025-05-05 09:53:44,513][BaseTrainer][INFO] - [Epoch 1/10, Iter 344/3560] 2.1256887912750244, cross_entropy: 2.1256887912750244
+ [2025-05-05 09:53:46,615][BaseTrainer][INFO] - [Epoch 1/10, Iter 345/3560] 2.093701124191284, cross_entropy: 2.093701124191284
+ [2025-05-05 09:53:46,882][BaseTrainer][INFO] - [Epoch 1/10, Iter 346/3560] 1.8950786590576172, cross_entropy: 1.8950786590576172
+ [2025-05-05 09:53:49,438][BaseTrainer][INFO] - [Epoch 1/10, Iter 347/3560] 2.1647377014160156, cross_entropy: 2.1647377014160156
+ [2025-05-05 09:53:49,724][BaseTrainer][INFO] - [Epoch 1/10, Iter 348/3560] 2.382690906524658, cross_entropy: 2.382690906524658
+ [2025-05-05 09:53:52,776][BaseTrainer][INFO] - [Epoch 1/10, Iter 349/3560] 1.4423621892929077, cross_entropy: 1.4423621892929077
+ [2025-05-05 09:53:53,045][BaseTrainer][INFO] - [Epoch 1/10, Iter 350/3560] 1.4608666896820068, cross_entropy: 1.4608666896820068
+ [2025-05-05 09:53:55,572][BaseTrainer][INFO] - [Epoch 1/10, Iter 351/3560] 2.0346779823303223, cross_entropy: 2.0346779823303223
+ [2025-05-05 09:53:55,861][BaseTrainer][INFO] - [Epoch 1/10, Iter 352/3560] 1.9468430280685425, cross_entropy: 1.9468430280685425
+ [2025-05-05 09:53:58,407][BaseTrainer][INFO] - [Epoch 1/10, Iter 353/3560] 2.13270902633667, cross_entropy: 2.13270902633667
+ [2025-05-05 09:53:58,677][BaseTrainer][INFO] - [Epoch 1/10, Iter 354/3560] 1.85300874710083, cross_entropy: 1.85300874710083
+ [2025-05-05 09:54:00,859][BaseTrainer][INFO] - [Epoch 1/10, Iter 355/3560] 1.6413021087646484, cross_entropy: 1.6413021087646484
+ [2025-05-05 09:54:04,112][BaseTrainer][INFO] - [Epoch 1/10, Iter 356/3560] 1.5595623254776, cross_entropy: 1.5595623254776
+ [2025-05-05 09:56:34,736][BaseTrainer][INFO] - [Epoch 1/10] (train) 2.649425745010376, cross_entropy: 2.649425745010376
+ [2025-05-05 09:56:34,736][BaseTrainer][INFO] - [Epoch 1/10] (validation) 1.8284838199615479, cross_entropy: 1.8284838199615479
+ [2025-05-05 09:56:34,737][BaseTrainer][INFO] - [Epoch 1/10] (metrics) roc_auc: 0.9597710371017456
+ [2025-05-05 09:56:34,925][BaseTrainer][INFO] - Save model: ./exp/20250505-094415/model/best_epoch.pth.
+ [2025-05-05 09:56:35,081][BaseTrainer][INFO] - Save model: ./exp/20250505-094415/model/last.pth.
+ [2025-05-05 09:56:38,555][BaseTrainer][INFO] - [Epoch 2/10, Iter 357/3560] 1.1553785800933838, cross_entropy: 1.1553785800933838
+ [2025-05-05 09:56:38,877][BaseTrainer][INFO] - [Epoch 2/10, Iter 358/3560] 1.7672476768493652, cross_entropy: 1.7672476768493652
+ [2025-05-05 09:56:41,165][BaseTrainer][INFO] - [Epoch 2/10, Iter 359/3560] 1.25618314743042, cross_entropy: 1.25618314743042
+ [2025-05-05 09:56:41,452][BaseTrainer][INFO] - [Epoch 2/10, Iter 360/3560] 1.3965697288513184, cross_entropy: 1.3965697288513184
+ [2025-05-05 09:56:43,503][BaseTrainer][INFO] - [Epoch 2/10, Iter 361/3560] 1.5174055099487305, cross_entropy: 1.5174055099487305
+ [2025-05-05 09:56:43,791][BaseTrainer][INFO] - [Epoch 2/10, Iter 362/3560] 1.6941776275634766, cross_entropy: 1.6941776275634766
+ [2025-05-05 09:56:46,159][BaseTrainer][INFO] - [Epoch 2/10, Iter 363/3560] 1.7774323225021362, cross_entropy: 1.7774323225021362
+ [2025-05-05 09:56:46,434][BaseTrainer][INFO] - [Epoch 2/10, Iter 364/3560] 1.4445414543151855, cross_entropy: 1.4445414543151855