tky823 committed on
Commit
d160536
·
verified ·
1 Parent(s): dacfc43

Upload folder using huggingface_hub

recipes/BirdCLEF2025/ConvNeXtTiny/exp/20250504-153714/log/20250504-153717/.hydra/config.yaml ADDED
@@ -0,0 +1,233 @@
1
+ system:
2
+ seed: 0
3
+ distributed:
4
+ enable: null
5
+ nodes: null
6
+ nproc_per_node: null
7
+ backend: null
8
+ init_method: null
9
+ rdzv_id: null
10
+ rdzv_backend: null
11
+ rdzv_endpoint: null
12
+ max_restarts: null
13
+ cudnn:
14
+ benchmark: true
15
+ deterministic: false
16
+ amp:
17
+ enable: false
18
+ dtype: null
19
+ accelerator: cuda
20
+ compile:
21
+ enable: null
22
+ kwargs: null
23
+ preprocess:
24
+ dump_format: birdclef2025
25
+ list_path: null
26
+ wav_dir: null
27
+ feature_dir: null
28
+ max_workers: null
29
+ max_shard_size: 1000000000
30
+ vad:
31
+ raw_root: null
32
+ trimmed_root: null
33
+ threshold: null
34
+ min_duration: 15
35
+ csv_path: ???
36
+ submission_path: ???
37
+ audio_root: ???
38
+ subset: ???
39
+ train_ratio: 0.8
40
+ data:
41
+ audio:
42
+ sample_rate: 32000
43
+ duration: 15
44
+ melspectrogram:
45
+ _target_: birdclef2025.transforms.birdclef.BirdCLEF2024BaselineMelSpectrogram
46
+ sample_rate: ${..audio.sample_rate}
47
+ hop_length: 512
48
+ f_min: 20
49
+ f_max: 16000
50
+ pad: 0
51
+ n_mels: 128
52
+ window_fn:
53
+ _target_: torch.hann_window
54
+ _partial_: true
55
+ power: 1.0
56
+ normalized: false
57
+ wkwargs: null
58
+ center: true
59
+ pad_mode: constant
60
+ onesided: null
61
+ norm: slaney
62
+ mel_scale: slaney
63
+ take_log: true
64
+ freq_mask_param:
65
+ - 0.06
66
+ - 0.1
67
+ time_mask_param:
68
+ - 0.06
69
+ - 0.12
70
+ eps: null
71
+ train:
72
+ dataset:
73
+ train:
74
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
75
+ list_path: dump/birdclef2025_reshape_15s/list/train.txt
76
+ feature_dir: /kaggle/input/birdclef-2025
77
+ audio_key: audio
78
+ sample_rate_key: sample_rate
79
+ label_name_key: primary_label
80
+ filename_key: filename
81
+ validation:
82
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
83
+ list_path: dump/birdclef2025_reshape_15s/list/validation.txt
84
+ feature_dir: /kaggle/input/birdclef-2025
85
+ audio_key: ${..train.audio_key}
86
+ sample_rate_key: ${..train.sample_rate_key}
87
+ label_name_key: ${..train.label_name_key}
88
+ filename_key: ${..train.filename_key}
89
+ dataloader:
90
+ train:
91
+ _target_: torch.utils.data.DataLoader
92
+ batch_size: 64
93
+ shuffle: true
94
+ collate_fn:
95
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineCollator
96
+ composer:
97
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025ReshapePrimaryLabelComposer
98
+ melspectrogram_transform: ${data.melspectrogram}
99
+ audio_key: audio
100
+ sample_rate_key: sample_rate
101
+ label_name_key: primary_label
102
+ filename_key: filename
103
+ waveform_key: waveform
104
+ melspectrogram_key: log_melspectrogram
105
+ label_index_key: label_index
106
+ sample_rate: ${data.audio.sample_rate}
107
+ duration: ${data.audio.duration}
108
+ decode_audio_as_waveform: true
109
+ decode_audio_as_monoral: true
110
+ training: true
111
+ target_shape: 256
112
+ melspectrogram_key: ${.composer.melspectrogram_key}
113
+ label_index_key: ${.composer.label_index_key}
114
+ alpha: 0.4
115
+ num_workers: ${const:birdclef2025.utils.data.default_num_workers}
116
+ validation:
117
+ _target_: torch.utils.data.DataLoader
118
+ batch_size: 64
119
+ shuffle: false
120
+ collate_fn:
121
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineValidationCollator
122
+ composer:
123
+ _target_: ${....train.collate_fn.composer._target_}
124
+ melspectrogram_transform: ${....train.collate_fn.composer.melspectrogram_transform}
125
+ audio_key: ${....train.collate_fn.composer.audio_key}
126
+ sample_rate_key: ${....train.collate_fn.composer.sample_rate_key}
127
+ label_name_key: ${....train.collate_fn.composer.label_name_key}
128
+ filename_key: ${....train.collate_fn.composer.filename_key}
129
+ waveform_key: ${....train.collate_fn.composer.waveform_key}
130
+ melspectrogram_key: ${....train.collate_fn.composer.melspectrogram_key}
131
+ label_index_key: ${....train.collate_fn.composer.label_index_key}
132
+ sample_rate: ${....train.collate_fn.composer.sample_rate}
133
+ duration: ${....train.collate_fn.composer.duration}
134
+ decode_audio_as_waveform: ${....train.collate_fn.composer.decode_audio_as_waveform}
135
+ decode_audio_as_monoral: ${....train.collate_fn.composer.decode_audio_as_monoral}
136
+ training: false
137
+ target_shape: ${....train.collate_fn.composer.target_shape}
138
+ melspectrogram_key: ${...train.collate_fn.composer.melspectrogram_key}
139
+ label_index_key: ${...train.collate_fn.composer.label_index_key}
140
+ num_workers: ${const:birdclef2025.utils.data.default_num_workers}
141
+ clip_gradient: {}
142
+ record: {}
143
+ trainer:
144
+ _target_: birdclef2025.utils.driver.BaseTrainer
145
+ key_mapping:
146
+ train:
147
+ input:
148
+ input: ${....dataloader.train.collate_fn.composer.melspectrogram_key}
149
+ output: logit
150
+ validation: ${.train}
151
+ inference: ${.validation}
152
+ ddp_kwargs: null
153
+ resume:
154
+ continue_from: ''
155
+ output:
156
+ exp_dir: ./exp/20250504-153714
157
+ tensorboard_dir: ./tensorboard/20250504-153714
158
+ save_checkpoint:
159
+ iteration:
160
+ every: 10000
161
+ path: ${...exp_dir}/model/iteration{iteration}.pth
162
+ epoch:
163
+ every: 1
164
+ path: ${...exp_dir}/model/epoch{epoch}.pth
165
+ last:
166
+ path: ${...exp_dir}/model/last.pth
167
+ best_epoch:
168
+ path: ${...exp_dir}/model/best_epoch.pth
169
+ steps:
170
+ epochs: 10
171
+ iterations: null
172
+ lr_scheduler: epoch
173
+ test:
174
+ dataset:
175
+ test:
176
+ _target_: torch.utils.data.Dataset
177
+ dataloader:
178
+ test:
179
+ _target_: torch.utils.data.DataLoader
180
+ batch_size: 1
181
+ shuffle: false
182
+ key_mapping:
183
+ inference:
184
+ input: null
185
+ output: null
186
+ identifier: null
187
+ checkpoint: null
188
+ remove_weight_norm: null
189
+ output:
190
+ exp_dir: ./exp
191
+ inference_dir: ${.exp_dir}/inference
192
+ audio:
193
+ sample_rate: ${data.audio.sample_rate}
194
+ key_mapping:
195
+ inference:
196
+ output: null
197
+ reference: null
198
+ transforms:
199
+ inference:
200
+ output: null
201
+ reference: null
202
+ model:
203
+ _target_: birdclef2025.models.ConvNeXtTiny
204
+ weights: ${const:torchvision.models.ConvNeXt_Tiny_Weights.IMAGENET1K_V1}
205
+ num_classes: ${const:birdclef2025.utils.data.birdclef.num_birdclef2025_primary_labels}
206
+ optimizer:
207
+ _target_: torch.optim.AdamW
208
+ lr: 0.0001
209
+ weight_decay: 0.05
210
+ lr_scheduler: {}
211
+ criterion:
212
+ _target_: audyn.criterion.MultiCriteria
213
+ cross_entropy:
214
+ _target_: audyn.criterion.BaseCriterionWrapper
215
+ criterion:
216
+ _target_: torch.nn.CrossEntropyLoss
217
+ reduction: mean
218
+ weight: 1
219
+ key_mapping:
220
+ estimated:
221
+ input: logit
222
+ target:
223
+ target: ${train.dataloader.train.collate_fn.composer.label_index_key}
224
+ metrics:
225
+ roc_auc:
226
+ metric:
227
+ _target_: birdclef2025.metrics.ROCAUC
228
+ take_softmax: true
229
+ key_mapping:
230
+ estimated:
231
+ input: logit
232
+ target:
233
+ target: ${train.dataloader.train.collate_fn.composer.label_index_key}
recipes/BirdCLEF2025/ConvNeXtTiny/exp/20250504-153714/log/20250504-153717/.hydra/hydra.yaml ADDED
@@ -0,0 +1,191 @@
1
+ hydra:
2
+ run:
3
+ dir: ./exp/20250504-153714/log/20250504-153717
4
+ sweep:
5
+ dir: multirun/${now:%Y-%m-%d}/${now:%H-%M-%S}
6
+ subdir: ${hydra.job.num}
7
+ launcher:
8
+ _target_: hydra._internal.core_plugins.basic_launcher.BasicLauncher
9
+ sweeper:
10
+ _target_: hydra._internal.core_plugins.basic_sweeper.BasicSweeper
11
+ max_batch_size: null
12
+ params: null
13
+ help:
14
+ app_name: ${hydra.job.name}
15
+ header: '${hydra.help.app_name} is powered by Hydra.
16
+
17
+ '
18
+ footer: 'Powered by Hydra (https://hydra.cc)
19
+
20
+ Use --hydra-help to view Hydra specific help
21
+
22
+ '
23
+ template: '${hydra.help.header}
24
+
25
+ == Configuration groups ==
26
+
27
+ Compose your configuration from those groups (group=option)
28
+
29
+
30
+ $APP_CONFIG_GROUPS
31
+
32
+
33
+ == Config ==
34
+
35
+ Override anything in the config (foo.bar=value)
36
+
37
+
38
+ $CONFIG
39
+
40
+
41
+ ${hydra.help.footer}
42
+
43
+ '
44
+ hydra_help:
45
+ template: 'Hydra (${hydra.runtime.version})
46
+
47
+ See https://hydra.cc for more info.
48
+
49
+
50
+ == Flags ==
51
+
52
+ $FLAGS_HELP
53
+
54
+
55
+ == Configuration groups ==
56
+
57
+ Compose your configuration from those groups (For example, append hydra/job_logging=disabled
58
+ to command line)
59
+
60
+
61
+ $HYDRA_CONFIG_GROUPS
62
+
63
+
64
+ Use ''--cfg hydra'' to Show the Hydra config.
65
+
66
+ '
67
+ hydra_help: ???
68
+ hydra_logging:
69
+ version: 1
70
+ formatters:
71
+ simple:
72
+ format: '[%(asctime)s][HYDRA] %(message)s'
73
+ handlers:
74
+ console:
75
+ class: logging.StreamHandler
76
+ formatter: simple
77
+ stream: ext://sys.stdout
78
+ root:
79
+ level: INFO
80
+ handlers:
81
+ - console
82
+ loggers:
83
+ logging_example:
84
+ level: DEBUG
85
+ disable_existing_loggers: false
86
+ job_logging:
87
+ version: 1
88
+ formatters:
89
+ simple:
90
+ format: '[%(asctime)s][%(name)s][%(levelname)s] - %(message)s'
91
+ handlers:
92
+ console:
93
+ class: logging.StreamHandler
94
+ formatter: simple
95
+ stream: ext://sys.stdout
96
+ file:
97
+ class: logging.FileHandler
98
+ formatter: simple
99
+ filename: ${hydra.runtime.output_dir}/${hydra.job.name}.log
100
+ root:
101
+ level: INFO
102
+ handlers:
103
+ - console
104
+ - file
105
+ disable_existing_loggers: false
106
+ env: {}
107
+ mode: RUN
108
+ searchpath: []
109
+ callbacks: {}
110
+ output_subdir: .hydra
111
+ overrides:
112
+ hydra:
113
+ - hydra.run.dir=./exp/20250504-153714/log/20250504-153717
114
+ - hydra.mode=RUN
115
+ task:
116
+ - system=cuda
117
+ - preprocess=birdclef2025
118
+ - data=birdclef2025_reshape_15s
119
+ - train=birdclef2025_reshape_convnext_tiny
120
+ - model=birdclef2025_convnext_tiny
121
+ - optimizer=adamw_1e-4_decay_5e-2
122
+ - lr_scheduler=none
123
+ - criterion=birdclef2025_categorical_cross_entropy
124
+ - +metrics=birdclef2025_categorical_cross_entropy
125
+ - preprocess.dump_format=birdclef2025
126
+ - train.dataset.train.list_path=dump/birdclef2025_reshape_15s/list/train.txt
127
+ - train.dataset.train.feature_dir=/kaggle/input/birdclef-2025
128
+ - train.dataset.validation.list_path=dump/birdclef2025_reshape_15s/list/validation.txt
129
+ - train.dataset.validation.feature_dir=/kaggle/input/birdclef-2025
130
+ - train.resume.continue_from=
131
+ - train.output.exp_dir=./exp/20250504-153714
132
+ - train.output.tensorboard_dir=./tensorboard/20250504-153714
133
+ job:
134
+ name: train
135
+ chdir: false
136
+ override_dirname: +metrics=birdclef2025_categorical_cross_entropy,criterion=birdclef2025_categorical_cross_entropy,data=birdclef2025_reshape_15s,lr_scheduler=none,model=birdclef2025_convnext_tiny,optimizer=adamw_1e-4_decay_5e-2,preprocess.dump_format=birdclef2025,preprocess=birdclef2025,system=cuda,train.dataset.train.feature_dir=/kaggle/input/birdclef-2025,train.dataset.train.list_path=dump/birdclef2025_reshape_15s/list/train.txt,train.dataset.validation.feature_dir=/kaggle/input/birdclef-2025,train.dataset.validation.list_path=dump/birdclef2025_reshape_15s/list/validation.txt,train.output.exp_dir=./exp/20250504-153714,train.output.tensorboard_dir=./tensorboard/20250504-153714,train.resume.continue_from=,train=birdclef2025_reshape_convnext_tiny
137
+ id: ???
138
+ num: ???
139
+ config_name: config
140
+ env_set: {}
141
+ env_copy: []
142
+ config:
143
+ override_dirname:
144
+ kv_sep: '='
145
+ item_sep: ','
146
+ exclude_keys: []
147
+ runtime:
148
+ version: 1.3.2
149
+ version_base: '1.2'
150
+ cwd: /kaggle/working/BirdCLEF2025/recipes/BirdCLEF2025/ConvNeXtTiny
151
+ config_sources:
152
+ - path: hydra.conf
153
+ schema: pkg
154
+ provider: hydra
155
+ - path: /usr/local/lib/python3.10/dist-packages/audyn/configs
156
+ schema: file
157
+ provider: main
158
+ - path: /kaggle/working/BirdCLEF2025/recipes/BirdCLEF2025/ConvNeXtTiny/conf
159
+ schema: file
160
+ provider: command-line
161
+ - path: ''
162
+ schema: structured
163
+ provider: schema
164
+ output_dir: /kaggle/working/BirdCLEF2025/recipes/BirdCLEF2025/ConvNeXtTiny/exp/20250504-153714/log/20250504-153717
165
+ choices:
166
+ metrics: birdclef2025_categorical_cross_entropy
167
+ criterion: birdclef2025_categorical_cross_entropy
168
+ lr_scheduler: none
169
+ optimizer: adamw_1e-4_decay_5e-2
170
+ model: birdclef2025_convnext_tiny
171
+ test: default
172
+ test/dataloader: default
173
+ test/dataset: default
174
+ train: birdclef2025_reshape_convnext_tiny
175
+ train/record: default
176
+ train/clip_gradient: default
177
+ train/dataloader: default
178
+ train/dataset: birdclef2025_primary-label
179
+ data: birdclef2025_reshape_15s
180
+ preprocess: birdclef2025
181
+ system: cuda
182
+ hydra/env: default
183
+ hydra/callbacks: null
184
+ hydra/job_logging: default
185
+ hydra/hydra_logging: default
186
+ hydra/hydra_help: default
187
+ hydra/help: default
188
+ hydra/sweeper: basic
189
+ hydra/launcher: basic
190
+ hydra/output: default
191
+ verbose: false
recipes/BirdCLEF2025/ConvNeXtTiny/exp/20250504-153714/log/20250504-153717/.hydra/overrides.yaml ADDED
@@ -0,0 +1,17 @@
+ - system=cuda
+ - preprocess=birdclef2025
+ - data=birdclef2025_reshape_15s
+ - train=birdclef2025_reshape_convnext_tiny
+ - model=birdclef2025_convnext_tiny
+ - optimizer=adamw_1e-4_decay_5e-2
+ - lr_scheduler=none
+ - criterion=birdclef2025_categorical_cross_entropy
+ - +metrics=birdclef2025_categorical_cross_entropy
+ - preprocess.dump_format=birdclef2025
+ - train.dataset.train.list_path=dump/birdclef2025_reshape_15s/list/train.txt
+ - train.dataset.train.feature_dir=/kaggle/input/birdclef-2025
+ - train.dataset.validation.list_path=dump/birdclef2025_reshape_15s/list/validation.txt
+ - train.dataset.validation.feature_dir=/kaggle/input/birdclef-2025
+ - train.resume.continue_from=
+ - train.output.exp_dir=./exp/20250504-153714
+ - train.output.tensorboard_dir=./tensorboard/20250504-153714
recipes/BirdCLEF2025/ConvNeXtTiny/exp/20250504-153714/log/20250504-153717/.hydra/resolved_config.yaml ADDED
@@ -0,0 +1,292 @@
1
+ system:
2
+ seed: 0
3
+ distributed:
4
+ enable: null
5
+ nodes: null
6
+ nproc_per_node: null
7
+ backend: null
8
+ init_method: null
9
+ rdzv_id: null
10
+ rdzv_backend: null
11
+ rdzv_endpoint: null
12
+ max_restarts: null
13
+ cudnn:
14
+ benchmark: true
15
+ deterministic: false
16
+ amp:
17
+ enable: false
18
+ dtype: null
19
+ accelerator: cuda
20
+ compile:
21
+ enable: false
22
+ kwargs: null
23
+ preprocess:
24
+ dump_format: birdclef2025
25
+ list_path: null
26
+ wav_dir: null
27
+ feature_dir: null
28
+ max_workers: 2
29
+ max_shard_size: 1000000000
30
+ vad:
31
+ raw_root: null
32
+ trimmed_root: null
33
+ threshold: null
34
+ min_duration: 15
35
+ csv_path: ???
36
+ submission_path: ???
37
+ audio_root: ???
38
+ subset: ???
39
+ train_ratio: 0.8
40
+ data:
41
+ audio:
42
+ sample_rate: 32000
43
+ duration: 15
44
+ melspectrogram:
45
+ _target_: birdclef2025.transforms.birdclef.BirdCLEF2024BaselineMelSpectrogram
46
+ sample_rate: 32000
47
+ hop_length: 512
48
+ f_min: 20
49
+ f_max: 16000
50
+ pad: 0
51
+ n_mels: 128
52
+ window_fn:
53
+ _target_: torch.hann_window
54
+ _partial_: true
55
+ power: 1.0
56
+ normalized: false
57
+ wkwargs: null
58
+ center: true
59
+ pad_mode: constant
60
+ onesided: null
61
+ norm: slaney
62
+ mel_scale: slaney
63
+ take_log: true
64
+ freq_mask_param:
65
+ - 0.06
66
+ - 0.1
67
+ time_mask_param:
68
+ - 0.06
69
+ - 0.12
70
+ eps: null
71
+ train:
72
+ dataset:
73
+ train:
74
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
75
+ list_path: dump/birdclef2025_reshape_15s/list/train.txt
76
+ feature_dir: /kaggle/input/birdclef-2025
77
+ audio_key: audio
78
+ sample_rate_key: sample_rate
79
+ label_name_key: primary_label
80
+ filename_key: filename
81
+ validation:
82
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
83
+ list_path: dump/birdclef2025_reshape_15s/list/validation.txt
84
+ feature_dir: /kaggle/input/birdclef-2025
85
+ audio_key: audio
86
+ sample_rate_key: sample_rate
87
+ label_name_key: primary_label
88
+ filename_key: filename
89
+ dataloader:
90
+ train:
91
+ _target_: torch.utils.data.DataLoader
92
+ batch_size: 64
93
+ shuffle: true
94
+ collate_fn:
95
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineCollator
96
+ composer:
97
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025ReshapePrimaryLabelComposer
98
+ melspectrogram_transform:
99
+ _target_: birdclef2025.transforms.birdclef.BirdCLEF2024BaselineMelSpectrogram
100
+ sample_rate: 32000
101
+ hop_length: 512
102
+ f_min: 20
103
+ f_max: 16000
104
+ pad: 0
105
+ n_mels: 128
106
+ window_fn:
107
+ _target_: torch.hann_window
108
+ _partial_: true
109
+ power: 1.0
110
+ normalized: false
111
+ wkwargs: null
112
+ center: true
113
+ pad_mode: constant
114
+ onesided: null
115
+ norm: slaney
116
+ mel_scale: slaney
117
+ take_log: true
118
+ freq_mask_param:
119
+ - 0.06
120
+ - 0.1
121
+ time_mask_param:
122
+ - 0.06
123
+ - 0.12
124
+ eps: null
125
+ audio_key: audio
126
+ sample_rate_key: sample_rate
127
+ label_name_key: primary_label
128
+ filename_key: filename
129
+ waveform_key: waveform
130
+ melspectrogram_key: log_melspectrogram
131
+ label_index_key: label_index
132
+ sample_rate: 32000
133
+ duration: 15
134
+ decode_audio_as_waveform: true
135
+ decode_audio_as_monoral: true
136
+ training: true
137
+ target_shape: 256
138
+ melspectrogram_key: log_melspectrogram
139
+ label_index_key: label_index
140
+ alpha: 0.4
141
+ num_workers: 2
142
+ validation:
143
+ _target_: torch.utils.data.DataLoader
144
+ batch_size: 64
145
+ shuffle: false
146
+ collate_fn:
147
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineValidationCollator
148
+ composer:
149
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025ReshapePrimaryLabelComposer
150
+ melspectrogram_transform:
151
+ _target_: birdclef2025.transforms.birdclef.BirdCLEF2024BaselineMelSpectrogram
152
+ sample_rate: 32000
153
+ hop_length: 512
154
+ f_min: 20
155
+ f_max: 16000
156
+ pad: 0
157
+ n_mels: 128
158
+ window_fn:
159
+ _target_: torch.hann_window
160
+ _partial_: true
161
+ power: 1.0
162
+ normalized: false
163
+ wkwargs: null
164
+ center: true
165
+ pad_mode: constant
166
+ onesided: null
167
+ norm: slaney
168
+ mel_scale: slaney
169
+ take_log: true
170
+ freq_mask_param:
171
+ - 0.06
172
+ - 0.1
173
+ time_mask_param:
174
+ - 0.06
175
+ - 0.12
176
+ eps: null
177
+ audio_key: audio
178
+ sample_rate_key: sample_rate
179
+ label_name_key: primary_label
180
+ filename_key: filename
181
+ waveform_key: waveform
182
+ melspectrogram_key: log_melspectrogram
183
+ label_index_key: label_index
184
+ sample_rate: 32000
185
+ duration: 15
186
+ decode_audio_as_waveform: true
187
+ decode_audio_as_monoral: true
188
+ training: false
189
+ target_shape: 256
190
+ melspectrogram_key: log_melspectrogram
191
+ label_index_key: label_index
192
+ num_workers: 2
193
+ clip_gradient: {}
194
+ record: {}
195
+ trainer:
196
+ _target_: birdclef2025.utils.driver.BaseTrainer
197
+ key_mapping:
198
+ train:
199
+ input:
200
+ input: log_melspectrogram
201
+ output: logit
202
+ validation:
203
+ input:
204
+ input: log_melspectrogram
205
+ output: logit
206
+ inference:
207
+ input:
208
+ input: log_melspectrogram
209
+ output: logit
210
+ ddp_kwargs: null
211
+ resume:
212
+ continue_from: ''
213
+ output:
214
+ exp_dir: ./exp/20250504-153714
215
+ tensorboard_dir: ./tensorboard/20250504-153714
216
+ save_checkpoint:
217
+ iteration:
218
+ every: 10000
219
+ path: ./exp/20250504-153714/model/iteration{iteration}.pth
220
+ epoch:
221
+ every: 1
222
+ path: ./exp/20250504-153714/model/epoch{epoch}.pth
223
+ last:
224
+ path: ./exp/20250504-153714/model/last.pth
225
+ best_epoch:
226
+ path: ./exp/20250504-153714/model/best_epoch.pth
227
+ steps:
228
+ epochs: 10
229
+ iterations: null
230
+ lr_scheduler: epoch
231
+ test:
232
+ dataset:
233
+ test:
234
+ _target_: torch.utils.data.Dataset
235
+ dataloader:
236
+ test:
237
+ _target_: torch.utils.data.DataLoader
238
+ batch_size: 1
239
+ shuffle: false
240
+ key_mapping:
241
+ inference:
242
+ input: null
243
+ output: null
244
+ identifier: null
245
+ checkpoint: null
246
+ remove_weight_norm: null
247
+ output:
248
+ exp_dir: ./exp
249
+ inference_dir: ./exp/inference
250
+ audio:
251
+ sample_rate: 32000
252
+ key_mapping:
253
+ inference:
254
+ output: null
255
+ reference: null
256
+ transforms:
257
+ inference:
258
+ output: null
259
+ reference: null
260
+ ddp_kwargs: null
261
+ model:
262
+ _target_: birdclef2025.models.ConvNeXtTiny
263
+ weights: IMAGENET1K_V1
264
+ num_classes: 206
265
+ optimizer:
266
+ _target_: torch.optim.AdamW
267
+ lr: 0.0001
268
+ weight_decay: 0.05
269
+ lr_scheduler: {}
270
+ criterion:
271
+ _target_: audyn.criterion.MultiCriteria
272
+ cross_entropy:
273
+ _target_: audyn.criterion.BaseCriterionWrapper
274
+ criterion:
275
+ _target_: torch.nn.CrossEntropyLoss
276
+ reduction: mean
277
+ weight: 1
278
+ key_mapping:
279
+ estimated:
280
+ input: logit
281
+ target:
282
+ target: label_index
283
+ metrics:
284
+ roc_auc:
285
+ metric:
286
+ _target_: birdclef2025.metrics.ROCAUC
287
+ take_softmax: true
288
+ key_mapping:
289
+ estimated:
290
+ input: logit
291
+ target:
292
+ target: label_index
recipes/BirdCLEF2025/ConvNeXtTiny/exp/20250504-153714/log/20250504-153717/train.log ADDED
@@ -0,0 +1,853 @@
1
+ [2025-05-04 15:37:45,581][BaseTrainer][INFO] - system:
2
+ seed: 0
3
+ distributed:
4
+ enable: null
5
+ nodes: null
6
+ nproc_per_node: null
7
+ backend: null
8
+ init_method: null
9
+ rdzv_id: null
10
+ rdzv_backend: null
11
+ rdzv_endpoint: null
12
+ max_restarts: null
13
+ cudnn:
14
+ benchmark: true
15
+ deterministic: false
16
+ amp:
17
+ enable: false
18
+ dtype: null
19
+ accelerator: cuda
20
+ compile:
21
+ enable: false
22
+ kwargs: null
23
+ preprocess:
24
+ dump_format: birdclef2025
25
+ list_path: null
26
+ wav_dir: null
27
+ feature_dir: null
28
+ max_workers: 2
29
+ max_shard_size: 1000000000
30
+ vad:
31
+ raw_root: null
32
+ trimmed_root: null
33
+ threshold: null
34
+ min_duration: 15
35
+ csv_path: ???
36
+ submission_path: ???
37
+ audio_root: ???
38
+ subset: ???
39
+ train_ratio: 0.8
40
+ data:
41
+ audio:
42
+ sample_rate: 32000
43
+ duration: 15
44
+ melspectrogram:
45
+ _target_: birdclef2025.transforms.birdclef.BirdCLEF2024BaselineMelSpectrogram
46
+ sample_rate: 32000
47
+ hop_length: 512
48
+ f_min: 20
49
+ f_max: 16000
50
+ pad: 0
51
+ n_mels: 128
52
+ window_fn:
53
+ _target_: torch.hann_window
54
+ _partial_: true
55
+ power: 1.0
56
+ normalized: false
57
+ wkwargs: null
58
+ center: true
59
+ pad_mode: constant
60
+ onesided: null
61
+ norm: slaney
62
+ mel_scale: slaney
63
+ take_log: true
64
+ freq_mask_param:
65
+ - 0.06
66
+ - 0.1
67
+ time_mask_param:
68
+ - 0.06
69
+ - 0.12
70
+ eps: null
71
+ train:
72
+ dataset:
73
+ train:
74
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
75
+ list_path: dump/birdclef2025_reshape_15s/list/train.txt
76
+ feature_dir: /kaggle/input/birdclef-2025
77
+ audio_key: audio
78
+ sample_rate_key: sample_rate
79
+ label_name_key: primary_label
80
+ filename_key: filename
81
+ validation:
82
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025PrimaryLabelDataset
83
+ list_path: dump/birdclef2025_reshape_15s/list/validation.txt
84
+ feature_dir: /kaggle/input/birdclef-2025
85
+ audio_key: ${..train.audio_key}
86
+ sample_rate_key: ${..train.sample_rate_key}
87
+ label_name_key: ${..train.label_name_key}
88
+ filename_key: ${..train.filename_key}
89
+ dataloader:
90
+ train:
91
+ _target_: torch.utils.data.DataLoader
92
+ batch_size: 64
93
+ shuffle: true
94
+ collate_fn:
95
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineCollator
96
+ composer:
97
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025ReshapePrimaryLabelComposer
98
+ melspectrogram_transform: ${data.melspectrogram}
99
+ audio_key: audio
100
+ sample_rate_key: sample_rate
101
+ label_name_key: primary_label
102
+ filename_key: filename
103
+ waveform_key: waveform
104
+ melspectrogram_key: log_melspectrogram
105
+ label_index_key: label_index
106
+ sample_rate: ${data.audio.sample_rate}
107
+ duration: ${data.audio.duration}
108
+ decode_audio_as_waveform: true
109
+ decode_audio_as_monoral: true
110
+ training: true
111
+ target_shape: 256
112
+ melspectrogram_key: ${.composer.melspectrogram_key}
113
+ label_index_key: ${.composer.label_index_key}
114
+ alpha: 0.4
115
+ num_workers: ${const:birdclef2025.utils.data.default_num_workers}
116
+ validation:
117
+ _target_: torch.utils.data.DataLoader
118
+ batch_size: 64
119
+ shuffle: false
120
+ collate_fn:
121
+ _target_: birdclef2025.utils.data.birdclef.BirdCLEF2025BaselineValidationCollator
122
+ composer:
123
+ _target_: ${....train.collate_fn.composer._target_}
124
+ melspectrogram_transform: ${....train.collate_fn.composer.melspectrogram_transform}
125
+ audio_key: ${....train.collate_fn.composer.audio_key}
126
+ sample_rate_key: ${....train.collate_fn.composer.sample_rate_key}
127
+ label_name_key: ${....train.collate_fn.composer.label_name_key}
128
+ filename_key: ${....train.collate_fn.composer.filename_key}
129
+ waveform_key: ${....train.collate_fn.composer.waveform_key}
130
+ melspectrogram_key: ${....train.collate_fn.composer.melspectrogram_key}
131
+ label_index_key: ${....train.collate_fn.composer.label_index_key}
132
+ sample_rate: ${....train.collate_fn.composer.sample_rate}
133
+ duration: ${....train.collate_fn.composer.duration}
134
+ decode_audio_as_waveform: ${....train.collate_fn.composer.decode_audio_as_waveform}
135
+ decode_audio_as_monoral: ${....train.collate_fn.composer.decode_audio_as_monoral}
136
+ training: false
137
+ target_shape: ${....train.collate_fn.composer.target_shape}
138
+ melspectrogram_key: ${...train.collate_fn.composer.melspectrogram_key}
139
+ label_index_key: ${...train.collate_fn.composer.label_index_key}
140
+ num_workers: ${const:birdclef2025.utils.data.default_num_workers}
141
+ clip_gradient: {}
142
+ record: {}
143
+ trainer:
144
+ _target_: birdclef2025.utils.driver.BaseTrainer
145
+ _partial_: true
146
+ key_mapping:
147
+ train:
148
+ input:
149
+ input: ${....dataloader.train.collate_fn.composer.melspectrogram_key}
150
+ output: logit
151
+ validation: ${.train}
152
+ inference: ${.validation}
153
+ ddp_kwargs: null
154
+ resume:
155
+ continue_from: ''
156
+ output:
157
+ exp_dir: ./exp/20250504-153714
158
+ tensorboard_dir: ./tensorboard/20250504-153714
159
+ save_checkpoint:
160
+ iteration:
161
+ every: 10000
162
+ path: ${...exp_dir}/model/iteration{iteration}.pth
163
+ epoch:
164
+ every: 1
165
+ path: ${...exp_dir}/model/epoch{epoch}.pth
166
+ last:
167
+ path: ${...exp_dir}/model/last.pth
168
+ best_epoch:
169
+ path: ${...exp_dir}/model/best_epoch.pth
170
+ steps:
171
+ epochs: 10
172
+ iterations: null
173
+ lr_scheduler: epoch
174
+ test:
175
+ dataset:
176
+ test:
177
+ _target_: torch.utils.data.Dataset
178
+ dataloader:
179
+ test:
180
+ _target_: torch.utils.data.DataLoader
181
+ batch_size: 1
182
+ shuffle: false
183
+ key_mapping:
184
+ inference:
185
+ input: null
186
+ output: null
187
+ identifier: null
188
+ checkpoint: null
189
+ remove_weight_norm: null
190
+ output:
191
+ exp_dir: ./exp
192
+ inference_dir: ${.exp_dir}/inference
193
+ audio:
194
+ sample_rate: ${data.audio.sample_rate}
195
+ key_mapping:
196
+ inference:
197
+ output: null
198
+ reference: null
199
+ transforms:
200
+ inference:
201
+ output: null
202
+ reference: null
203
+ ddp_kwargs: null
204
+ model:
205
+ _target_: birdclef2025.models.ConvNeXtTiny
206
+ weights: ${const:torchvision.models.ConvNeXt_Tiny_Weights.IMAGENET1K_V1}
207
+ num_classes: ${const:birdclef2025.utils.data.birdclef.num_birdclef2025_primary_labels}
208
+ optimizer:
209
+ _target_: torch.optim.AdamW
210
+ lr: 0.0001
211
+ weight_decay: 0.05
212
+ lr_scheduler: {}
213
+ criterion:
214
+ _target_: audyn.criterion.MultiCriteria
215
+ cross_entropy:
216
+ _target_: audyn.criterion.BaseCriterionWrapper
217
+ criterion:
218
+ _target_: torch.nn.CrossEntropyLoss
219
+ reduction: mean
220
+ weight: 1
221
+ key_mapping:
222
+ estimated:
223
+ input: logit
224
+ target:
225
+ target: ${train.dataloader.train.collate_fn.composer.label_index_key}
226
+ metrics:
227
+ roc_auc:
228
+ metric:
229
+ _target_: birdclef2025.metrics.ROCAUC
230
+ take_softmax: true
231
+ key_mapping:
232
+ estimated:
233
+ input: logit
234
+ target:
235
+ target: ${train.dataloader.train.collate_fn.composer.label_index_key}
236
+
237
+ [2025-05-04 15:37:45,581][BaseTrainer][INFO] - ConvNeXtTiny(
238
+ (backbone): Sequential(
239
+ (0): Conv2dNormActivation(
240
+ (0): Conv2d(3, 96, kernel_size=(4, 4), stride=(4, 4))
241
+ (1): LayerNorm2d((96,), eps=1e-06, elementwise_affine=True)
242
+ )
243
+ (1): Sequential(
244
+ (0): CNBlock(
245
+ (block): Sequential(
246
+ (0): Conv2d(96, 96, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=96)
247
+ (1): Permute()
248
+ (2): LayerNorm((96,), eps=1e-06, elementwise_affine=True)
249
+ (3): Linear(in_features=96, out_features=384, bias=True)
250
+ (4): GELU(approximate='none')
251
+ (5): Linear(in_features=384, out_features=96, bias=True)
252
+ (6): Permute()
253
+ )
254
+ (stochastic_depth): StochasticDepth(p=0.0, mode=row)
255
+ )
256
+ (1): CNBlock(
257
+ (block): Sequential(
258
+ (0): Conv2d(96, 96, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=96)
259
+ (1): Permute()
260
+ (2): LayerNorm((96,), eps=1e-06, elementwise_affine=True)
261
+ (3): Linear(in_features=96, out_features=384, bias=True)
262
+ (4): GELU(approximate='none')
263
+ (5): Linear(in_features=384, out_features=96, bias=True)
264
+ (6): Permute()
265
+ )
266
+ (stochastic_depth): StochasticDepth(p=0.0058823529411764705, mode=row)
267
+ )
268
+ (2): CNBlock(
269
+ (block): Sequential(
270
+ (0): Conv2d(96, 96, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=96)
271
+ (1): Permute()
272
+ (2): LayerNorm((96,), eps=1e-06, elementwise_affine=True)
273
+ (3): Linear(in_features=96, out_features=384, bias=True)
274
+ (4): GELU(approximate='none')
275
+ (5): Linear(in_features=384, out_features=96, bias=True)
276
+ (6): Permute()
277
+ )
278
+ (stochastic_depth): StochasticDepth(p=0.011764705882352941, mode=row)
279
+ )
280
+ )
281
+ (2): Sequential(
282
+ (0): LayerNorm2d((96,), eps=1e-06, elementwise_affine=True)
283
+ (1): Conv2d(96, 192, kernel_size=(2, 2), stride=(2, 2))
284
+ )
285
+ (3): Sequential(
286
+ (0): CNBlock(
287
+ (block): Sequential(
288
+ (0): Conv2d(192, 192, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=192)
289
+ (1): Permute()
290
+ (2): LayerNorm((192,), eps=1e-06, elementwise_affine=True)
291
+ (3): Linear(in_features=192, out_features=768, bias=True)
292
+ (4): GELU(approximate='none')
293
+ (5): Linear(in_features=768, out_features=192, bias=True)
294
+ (6): Permute()
295
+ )
296
+ (stochastic_depth): StochasticDepth(p=0.017647058823529415, mode=row)
297
+ )
298
+ (1): CNBlock(
299
+ (block): Sequential(
300
+ (0): Conv2d(192, 192, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=192)
301
+ (1): Permute()
302
+ (2): LayerNorm((192,), eps=1e-06, elementwise_affine=True)
303
+ (3): Linear(in_features=192, out_features=768, bias=True)
304
+ (4): GELU(approximate='none')
305
+ (5): Linear(in_features=768, out_features=192, bias=True)
306
+ (6): Permute()
307
+ )
308
+ (stochastic_depth): StochasticDepth(p=0.023529411764705882, mode=row)
309
+ )
310
+ (2): CNBlock(
311
+ (block): Sequential(
312
+ (0): Conv2d(192, 192, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=192)
313
+ (1): Permute()
314
+ (2): LayerNorm((192,), eps=1e-06, elementwise_affine=True)
315
+ (3): Linear(in_features=192, out_features=768, bias=True)
316
+ (4): GELU(approximate='none')
317
+ (5): Linear(in_features=768, out_features=192, bias=True)
318
+ (6): Permute()
319
+ )
320
+ (stochastic_depth): StochasticDepth(p=0.029411764705882353, mode=row)
321
+ )
322
+ )
323
+ (4): Sequential(
324
+ (0): LayerNorm2d((192,), eps=1e-06, elementwise_affine=True)
325
+ (1): Conv2d(192, 384, kernel_size=(2, 2), stride=(2, 2))
326
+ )
327
+ (5): Sequential(
328
+ (0): CNBlock(
329
+ (block): Sequential(
330
+ (0): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
331
+ (1): Permute()
332
+ (2): LayerNorm((384,), eps=1e-06, elementwise_affine=True)
333
+ (3): Linear(in_features=384, out_features=1536, bias=True)
334
+ (4): GELU(approximate='none')
335
+ (5): Linear(in_features=1536, out_features=384, bias=True)
336
+ (6): Permute()
337
+ )
338
+ (stochastic_depth): StochasticDepth(p=0.03529411764705883, mode=row)
339
+ )
340
+ (1): CNBlock(
341
+ (block): Sequential(
342
+ (0): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
343
+ (1): Permute()
344
+ (2): LayerNorm((384,), eps=1e-06, elementwise_affine=True)
345
+ (3): Linear(in_features=384, out_features=1536, bias=True)
346
+ (4): GELU(approximate='none')
347
+ (5): Linear(in_features=1536, out_features=384, bias=True)
348
+ (6): Permute()
349
+ )
350
+ (stochastic_depth): StochasticDepth(p=0.0411764705882353, mode=row)
351
+ )
352
+ (2): CNBlock(
353
+ (block): Sequential(
354
+ (0): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
355
+ (1): Permute()
356
+ (2): LayerNorm((384,), eps=1e-06, elementwise_affine=True)
357
+ (3): Linear(in_features=384, out_features=1536, bias=True)
358
+ (4): GELU(approximate='none')
359
+ (5): Linear(in_features=1536, out_features=384, bias=True)
360
+ (6): Permute()
361
+ )
362
+ (stochastic_depth): StochasticDepth(p=0.047058823529411764, mode=row)
363
+ )
364
+ (3): CNBlock(
365
+ (block): Sequential(
366
+ (0): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
367
+ (1): Permute()
368
+ (2): LayerNorm((384,), eps=1e-06, elementwise_affine=True)
369
+ (3): Linear(in_features=384, out_features=1536, bias=True)
370
+ (4): GELU(approximate='none')
371
+ (5): Linear(in_features=1536, out_features=384, bias=True)
372
+ (6): Permute()
373
+ )
374
+ (stochastic_depth): StochasticDepth(p=0.052941176470588235, mode=row)
375
+ )
376
+ (4): CNBlock(
377
+ (block): Sequential(
378
+ (0): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
379
+ (1): Permute()
380
+ (2): LayerNorm((384,), eps=1e-06, elementwise_affine=True)
381
+ (3): Linear(in_features=384, out_features=1536, bias=True)
382
+ (4): GELU(approximate='none')
383
+ (5): Linear(in_features=1536, out_features=384, bias=True)
384
+ (6): Permute()
385
+ )
386
+ (stochastic_depth): StochasticDepth(p=0.058823529411764705, mode=row)
387
+ )
388
+ (5): CNBlock(
389
+ (block): Sequential(
390
+ (0): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
391
+ (1): Permute()
392
+ (2): LayerNorm((384,), eps=1e-06, elementwise_affine=True)
393
+ (3): Linear(in_features=384, out_features=1536, bias=True)
394
+ (4): GELU(approximate='none')
395
+ (5): Linear(in_features=1536, out_features=384, bias=True)
396
+ (6): Permute()
397
+ )
398
+ (stochastic_depth): StochasticDepth(p=0.06470588235294118, mode=row)
399
+ )
400
+ (6): CNBlock(
401
+ (block): Sequential(
402
+ (0): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
403
+ (1): Permute()
404
+ (2): LayerNorm((384,), eps=1e-06, elementwise_affine=True)
405
+ (3): Linear(in_features=384, out_features=1536, bias=True)
406
+ (4): GELU(approximate='none')
407
+ (5): Linear(in_features=1536, out_features=384, bias=True)
408
+ (6): Permute()
409
+ )
410
+ (stochastic_depth): StochasticDepth(p=0.07058823529411766, mode=row)
411
+ )
412
+ (7): CNBlock(
413
+ (block): Sequential(
414
+ (0): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
415
+ (1): Permute()
416
+ (2): LayerNorm((384,), eps=1e-06, elementwise_affine=True)
417
+ (3): Linear(in_features=384, out_features=1536, bias=True)
418
+ (4): GELU(approximate='none')
419
+ (5): Linear(in_features=1536, out_features=384, bias=True)
420
+ (6): Permute()
421
+ )
422
+ (stochastic_depth): StochasticDepth(p=0.07647058823529412, mode=row)
423
+ )
424
+ (8): CNBlock(
425
+ (block): Sequential(
426
+ (0): Conv2d(384, 384, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=384)
427
+ (1): Permute()
428
+ (2): LayerNorm((384,), eps=1e-06, elementwise_affine=True)
429
+ (3): Linear(in_features=384, out_features=1536, bias=True)
430
+ (4): GELU(approximate='none')
431
+ (5): Linear(in_features=1536, out_features=384, bias=True)
432
+ (6): Permute()
433
+ )
434
+ (stochastic_depth): StochasticDepth(p=0.0823529411764706, mode=row)
435
+ )
436
+ )
437
+ (6): Sequential(
438
+ (0): LayerNorm2d((384,), eps=1e-06, elementwise_affine=True)
439
+ (1): Conv2d(384, 768, kernel_size=(2, 2), stride=(2, 2))
440
+ )
441
+ (7): Sequential(
442
+ (0): CNBlock(
443
+ (block): Sequential(
444
+ (0): Conv2d(768, 768, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=768)
445
+ (1): Permute()
446
+ (2): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
447
+ (3): Linear(in_features=768, out_features=3072, bias=True)
448
+ (4): GELU(approximate='none')
449
+ (5): Linear(in_features=3072, out_features=768, bias=True)
450
+ (6): Permute()
451
+ )
452
+ (stochastic_depth): StochasticDepth(p=0.08823529411764706, mode=row)
453
+ )
454
+ (1): CNBlock(
455
+ (block): Sequential(
456
+ (0): Conv2d(768, 768, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=768)
457
+ (1): Permute()
458
+ (2): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
459
+ (3): Linear(in_features=768, out_features=3072, bias=True)
460
+ (4): GELU(approximate='none')
461
+ (5): Linear(in_features=3072, out_features=768, bias=True)
462
+ (6): Permute()
463
+ )
464
+ (stochastic_depth): StochasticDepth(p=0.09411764705882353, mode=row)
465
+ )
466
+ (2): CNBlock(
467
+ (block): Sequential(
468
+ (0): Conv2d(768, 768, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=768)
469
+ (1): Permute()
470
+ (2): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
471
+ (3): Linear(in_features=768, out_features=3072, bias=True)
472
+ (4): GELU(approximate='none')
473
+ (5): Linear(in_features=3072, out_features=768, bias=True)
474
+ (6): Permute()
475
+ )
476
+ (stochastic_depth): StochasticDepth(p=0.1, mode=row)
477
+ )
478
+ )
479
+ )
480
+ (avgpool): AdaptiveAvgPool2d(output_size=1)
481
+ (classifier): Sequential(
482
+ (0): LayerNorm2d((768,), eps=1e-06, elementwise_affine=True)
483
+ (1): Flatten(start_dim=1, end_dim=-1)
484
+ (2): Linear(in_features=768, out_features=206, bias=True)
485
+ )
486
+ )
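Note on the printout above: the `StochasticDepth` probabilities appear to rise linearly from 0.0 in the first `CNBlock` to 0.1 in the last, across the 18 blocks (3 + 3 + 9 + 3). The following is a minimal sketch that reproduces those values, assuming a plain linear ramp over the block index. The names `STAGE_DEPTHS`, `MAX_DROP_PROB` and `stochastic_depth_schedule` are illustrative and do not come from the log or from `birdclef2025`.

```python
# Stage depths read off the module printout: 3, 3, 9 and 3 CNBlocks.
STAGE_DEPTHS = [3, 3, 9, 3]
# Probability printed for the last block; assumed to be the top of the ramp.
MAX_DROP_PROB = 0.1


def stochastic_depth_schedule(stage_depths, max_prob):
    """Return one drop probability per block, rising linearly from 0 to max_prob."""
    total_blocks = sum(stage_depths)
    return [max_prob * i / (total_blocks - 1.0) for i in range(total_blocks)]


schedule = stochastic_depth_schedule(STAGE_DEPTHS, MAX_DROP_PROB)
for block_index, prob in enumerate(schedule):
    print(f"block {block_index:2d}: p = {prob:.6f}")
```

Block index 1 in this sketch corresponds to the second `CNBlock` of the first stage, for which the log prints `p=0.0058823529411764705`, that is 0.1 / 17.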
487
+ [2025-05-04 15:37:45,584][BaseTrainer][INFO] - # of parameters: 27978542.
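The logged parameter count can be cross-checked by hand from the layer shapes in the printout. The sketch below is plain Python arithmetic, not the trainer's own counting code. It makes one assumption that the printout does not show: each `CNBlock` is taken to carry one extra learnable per-channel scale vector of size `dim` (a layer-scale parameter, which does not appear in the module repr). The helper names are illustrative.

```python
def cn_block_params(dim):
    """Parameters of one CNBlock with channel width `dim`."""
    depthwise = dim * 7 * 7 + dim       # 7x7 conv with groups=dim, plus bias
    layer_norm = 2 * dim                # weight and bias
    expand = dim * (4 * dim) + 4 * dim  # Linear(dim, 4 * dim)
    project = (4 * dim) * dim + dim     # Linear(4 * dim, dim)
    layer_scale = dim                   # ASSUMPTION: per-channel scale, not in the printout
    return depthwise + layer_norm + expand + project + layer_scale


def downsample_params(c_in, c_out):
    """LayerNorm2d(c_in) followed by a 2x2, stride-2 Conv2d(c_in, c_out)."""
    return 2 * c_in + (c_in * c_out * 2 * 2 + c_out)


# Stem: Conv2d(3, 96, kernel_size=4, stride=4) followed by LayerNorm2d(96).
stem = (3 * 96 * 4 * 4 + 96) + 2 * 96

# (channel width, number of CNBlocks) per stage, as printed in the log.
stages = [(96, 3), (192, 3), (384, 9), (768, 3)]
blocks = sum(depth * cn_block_params(dim) for dim, depth in stages)

downsamples = sum(
    downsample_params(c_in, c_out)
    for c_in, c_out in [(96, 192), (192, 384), (384, 768)]
)

# Classifier: LayerNorm2d(768), Flatten, Linear(768, 206).
classifier = 2 * 768 + (768 * 206 + 206)

total = stem + blocks + downsamples + classifier
print(total)
```

Under that assumption the sum comes to 27978542, the figure in the log line above. Without the per-block scale vectors the total would be 6624 lower.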
488
+ [2025-05-04 15:38:05,641][BaseTrainer][INFO] - [Epoch 1/10, Iter 1/3560] 5.376641273498535, cross_entropy: 5.376641273498535
489
+ [2025-05-04 15:38:06,677][BaseTrainer][INFO] - [Epoch 1/10, Iter 2/3560] 5.312026023864746, cross_entropy: 5.312026023864746
490
+ [2025-05-04 15:38:07,712][BaseTrainer][INFO] - [Epoch 1/10, Iter 3/3560] 5.27780818939209, cross_entropy: 5.27780818939209
491
+ [2025-05-04 15:38:08,783][BaseTrainer][INFO] - [Epoch 1/10, Iter 4/3560] 5.321934700012207, cross_entropy: 5.321934700012207
492
+ [2025-05-04 15:38:09,838][BaseTrainer][INFO] - [Epoch 1/10, Iter 5/3560] 5.203423500061035, cross_entropy: 5.203423500061035
493
+ [2025-05-04 15:38:12,173][BaseTrainer][INFO] - [Epoch 1/10, Iter 6/3560] 5.188291549682617, cross_entropy: 5.188291549682617
494
+ [2025-05-04 15:38:13,279][BaseTrainer][INFO] - [Epoch 1/10, Iter 7/3560] 5.233580589294434, cross_entropy: 5.233580589294434
495
+ [2025-05-04 15:38:16,962][BaseTrainer][INFO] - [Epoch 1/10, Iter 8/3560] 5.191223621368408, cross_entropy: 5.191223621368408
496
+ [2025-05-04 15:38:18,014][BaseTrainer][INFO] - [Epoch 1/10, Iter 9/3560] 5.152655601501465, cross_entropy: 5.152655601501465
497
+ [2025-05-04 15:38:21,246][BaseTrainer][INFO] - [Epoch 1/10, Iter 10/3560] 5.084836959838867, cross_entropy: 5.084836959838867
498
+ [2025-05-04 15:38:22,332][BaseTrainer][INFO] - [Epoch 1/10, Iter 11/3560] 5.160213470458984, cross_entropy: 5.160213470458984
499
+ [2025-05-04 15:38:25,576][BaseTrainer][INFO] - [Epoch 1/10, Iter 12/3560] 5.043108940124512, cross_entropy: 5.043108940124512
500
+ [2025-05-04 15:38:26,627][BaseTrainer][INFO] - [Epoch 1/10, Iter 13/3560] 4.957451820373535, cross_entropy: 4.957451820373535
501
+ [2025-05-04 15:38:30,608][BaseTrainer][INFO] - [Epoch 1/10, Iter 14/3560] 4.992649078369141, cross_entropy: 4.992649078369141
502
+ [2025-05-04 15:38:31,670][BaseTrainer][INFO] - [Epoch 1/10, Iter 15/3560] 4.82185173034668, cross_entropy: 4.82185173034668
503
+ [2025-05-04 15:38:34,801][BaseTrainer][INFO] - [Epoch 1/10, Iter 16/3560] 5.031904220581055, cross_entropy: 5.031904220581055
504
+ [2025-05-04 15:38:35,859][BaseTrainer][INFO] - [Epoch 1/10, Iter 17/3560] 5.100973129272461, cross_entropy: 5.100973129272461
505
+ [2025-05-04 15:38:38,908][BaseTrainer][INFO] - [Epoch 1/10, Iter 18/3560] 5.039768218994141, cross_entropy: 5.039768218994141
506
+ [2025-05-04 15:38:39,968][BaseTrainer][INFO] - [Epoch 1/10, Iter 19/3560] 4.945712089538574, cross_entropy: 4.945712089538574
507
+ [2025-05-04 15:38:42,835][BaseTrainer][INFO] - [Epoch 1/10, Iter 20/3560] 4.910451412200928, cross_entropy: 4.910451412200928
508
+ [2025-05-04 15:38:43,887][BaseTrainer][INFO] - [Epoch 1/10, Iter 21/3560] 4.891071319580078, cross_entropy: 4.891071319580078
509
+ [2025-05-04 15:38:47,421][BaseTrainer][INFO] - [Epoch 1/10, Iter 22/3560] 4.818423271179199, cross_entropy: 4.818423271179199
510
+ [2025-05-04 15:38:48,469][BaseTrainer][INFO] - [Epoch 1/10, Iter 23/3560] 4.651951789855957, cross_entropy: 4.651951789855957
511
+ [2025-05-04 15:38:51,379][BaseTrainer][INFO] - [Epoch 1/10, Iter 24/3560] 5.034425735473633, cross_entropy: 5.034425735473633
512
+ [2025-05-04 15:38:52,424][BaseTrainer][INFO] - [Epoch 1/10, Iter 25/3560] 4.565324783325195, cross_entropy: 4.565324783325195
513
+ [2025-05-04 15:38:55,513][BaseTrainer][INFO] - [Epoch 1/10, Iter 26/3560] 4.9761152267456055, cross_entropy: 4.9761152267456055
514
+ [2025-05-04 15:38:56,547][BaseTrainer][INFO] - [Epoch 1/10, Iter 27/3560] 4.841283798217773, cross_entropy: 4.841283798217773
515
+ [2025-05-04 15:38:59,982][BaseTrainer][INFO] - [Epoch 1/10, Iter 28/3560] 4.716811180114746, cross_entropy: 4.716811180114746
516
+ [2025-05-04 15:39:01,035][BaseTrainer][INFO] - [Epoch 1/10, Iter 29/3560] 5.003815650939941, cross_entropy: 5.003815650939941
517
+ [2025-05-04 15:39:04,335][BaseTrainer][INFO] - [Epoch 1/10, Iter 30/3560] 4.7728986740112305, cross_entropy: 4.7728986740112305
518
+ [2025-05-04 15:39:05,380][BaseTrainer][INFO] - [Epoch 1/10, Iter 31/3560] 4.807315826416016, cross_entropy: 4.807315826416016
519
+ [2025-05-04 15:39:07,805][BaseTrainer][INFO] - [Epoch 1/10, Iter 32/3560] 4.718504905700684, cross_entropy: 4.718504905700684
520
+ [2025-05-04 15:39:08,836][BaseTrainer][INFO] - [Epoch 1/10, Iter 33/3560] 4.847777843475342, cross_entropy: 4.847777843475342
521
+ [2025-05-04 15:39:12,467][BaseTrainer][INFO] - [Epoch 1/10, Iter 34/3560] 4.851305961608887, cross_entropy: 4.851305961608887
522
+ [2025-05-04 15:39:13,516][BaseTrainer][INFO] - [Epoch 1/10, Iter 35/3560] 4.6127824783325195, cross_entropy: 4.6127824783325195
523
+ [2025-05-04 15:39:16,284][BaseTrainer][INFO] - [Epoch 1/10, Iter 36/3560] 4.892227649688721, cross_entropy: 4.892227649688721
524
+ [2025-05-04 15:39:17,320][BaseTrainer][INFO] - [Epoch 1/10, Iter 37/3560] 4.760605812072754, cross_entropy: 4.760605812072754
525
+ [2025-05-04 15:39:20,215][BaseTrainer][INFO] - [Epoch 1/10, Iter 38/3560] 4.747416973114014, cross_entropy: 4.747416973114014
526
+ [2025-05-04 15:39:21,269][BaseTrainer][INFO] - [Epoch 1/10, Iter 39/3560] 4.610195159912109, cross_entropy: 4.610195159912109
527
+ [2025-05-04 15:39:24,466][BaseTrainer][INFO] - [Epoch 1/10, Iter 40/3560] 4.788339614868164, cross_entropy: 4.788339614868164
528
+ [2025-05-04 15:39:25,501][BaseTrainer][INFO] - [Epoch 1/10, Iter 41/3560] 4.6864423751831055, cross_entropy: 4.6864423751831055
529
+ [2025-05-04 15:39:28,466][BaseTrainer][INFO] - [Epoch 1/10, Iter 42/3560] 4.7742438316345215, cross_entropy: 4.7742438316345215
530
+ [2025-05-04 15:39:29,499][BaseTrainer][INFO] - [Epoch 1/10, Iter 43/3560] 4.873464584350586, cross_entropy: 4.873464584350586
531
+ [2025-05-04 15:39:32,250][BaseTrainer][INFO] - [Epoch 1/10, Iter 44/3560] 4.77091646194458, cross_entropy: 4.77091646194458
532
+ [2025-05-04 15:39:33,285][BaseTrainer][INFO] - [Epoch 1/10, Iter 45/3560] 4.725327014923096, cross_entropy: 4.725327014923096
533
+ [2025-05-04 15:39:36,199][BaseTrainer][INFO] - [Epoch 1/10, Iter 46/3560] 4.5556559562683105, cross_entropy: 4.5556559562683105
534
+ [2025-05-04 15:39:37,253][BaseTrainer][INFO] - [Epoch 1/10, Iter 47/3560] 4.744180202484131, cross_entropy: 4.744180202484131
535
+ [2025-05-04 15:39:40,399][BaseTrainer][INFO] - [Epoch 1/10, Iter 48/3560] 4.5664777755737305, cross_entropy: 4.5664777755737305
536
+ [2025-05-04 15:39:41,456][BaseTrainer][INFO] - [Epoch 1/10, Iter 49/3560] 4.5559892654418945, cross_entropy: 4.5559892654418945
537
+ [2025-05-04 15:39:46,237][BaseTrainer][INFO] - [Epoch 1/10, Iter 50/3560] 4.736114501953125, cross_entropy: 4.736114501953125
538
+ [2025-05-04 15:39:47,272][BaseTrainer][INFO] - [Epoch 1/10, Iter 51/3560] 4.541403770446777, cross_entropy: 4.541403770446777
539
+ [2025-05-04 15:39:49,705][BaseTrainer][INFO] - [Epoch 1/10, Iter 52/3560] 4.843523979187012, cross_entropy: 4.843523979187012
540
+ [2025-05-04 15:39:50,759][BaseTrainer][INFO] - [Epoch 1/10, Iter 53/3560] 4.459508895874023, cross_entropy: 4.459508895874023
541
+ [2025-05-04 15:39:53,462][BaseTrainer][INFO] - [Epoch 1/10, Iter 54/3560] 4.814382553100586, cross_entropy: 4.814382553100586
542
+ [2025-05-04 15:39:54,506][BaseTrainer][INFO] - [Epoch 1/10, Iter 55/3560] 4.580887794494629, cross_entropy: 4.580887794494629
543
+ [2025-05-04 15:39:57,395][BaseTrainer][INFO] - [Epoch 1/10, Iter 56/3560] 4.71505069732666, cross_entropy: 4.71505069732666
544
+ [2025-05-04 15:39:58,429][BaseTrainer][INFO] - [Epoch 1/10, Iter 57/3560] 4.5077738761901855, cross_entropy: 4.5077738761901855
545
+ [2025-05-04 15:40:01,088][BaseTrainer][INFO] - [Epoch 1/10, Iter 58/3560] 4.75019645690918, cross_entropy: 4.75019645690918
546
+ [2025-05-04 15:40:02,121][BaseTrainer][INFO] - [Epoch 1/10, Iter 59/3560] 4.728370666503906, cross_entropy: 4.728370666503906
547
+ [2025-05-04 15:40:04,650][BaseTrainer][INFO] - [Epoch 1/10, Iter 60/3560] 4.706501483917236, cross_entropy: 4.706501483917236
548
+ [2025-05-04 15:40:05,703][BaseTrainer][INFO] - [Epoch 1/10, Iter 61/3560] 4.695921897888184, cross_entropy: 4.695921897888184
549
+ [2025-05-04 15:40:09,063][BaseTrainer][INFO] - [Epoch 1/10, Iter 62/3560] 4.792250156402588, cross_entropy: 4.792250156402588
550
+ [2025-05-04 15:40:10,106][BaseTrainer][INFO] - [Epoch 1/10, Iter 63/3560] 4.864690780639648, cross_entropy: 4.864690780639648
551
+ [2025-05-04 15:40:14,049][BaseTrainer][INFO] - [Epoch 1/10, Iter 64/3560] 4.703803062438965, cross_entropy: 4.703803062438965
552
+ [2025-05-04 15:40:15,112][BaseTrainer][INFO] - [Epoch 1/10, Iter 65/3560] 4.695968151092529, cross_entropy: 4.695968151092529
553
+ [2025-05-04 15:40:18,446][BaseTrainer][INFO] - [Epoch 1/10, Iter 66/3560] 4.7378034591674805, cross_entropy: 4.7378034591674805
554
+ [2025-05-04 15:40:19,485][BaseTrainer][INFO] - [Epoch 1/10, Iter 67/3560] 4.630002975463867, cross_entropy: 4.630002975463867
555
+ [2025-05-04 15:40:22,457][BaseTrainer][INFO] - [Epoch 1/10, Iter 68/3560] 4.627466201782227, cross_entropy: 4.627466201782227
556
+ [2025-05-04 15:40:23,508][BaseTrainer][INFO] - [Epoch 1/10, Iter 69/3560] 4.664760112762451, cross_entropy: 4.664760112762451
557
+ [2025-05-04 15:40:26,421][BaseTrainer][INFO] - [Epoch 1/10, Iter 70/3560] 4.678044319152832, cross_entropy: 4.678044319152832
558
+ [2025-05-04 15:40:27,454][BaseTrainer][INFO] - [Epoch 1/10, Iter 71/3560] 4.816387176513672, cross_entropy: 4.816387176513672
559
+ [2025-05-04 15:40:30,671][BaseTrainer][INFO] - [Epoch 1/10, Iter 72/3560] 4.488626480102539, cross_entropy: 4.488626480102539
560
+ [2025-05-04 15:40:31,717][BaseTrainer][INFO] - [Epoch 1/10, Iter 73/3560] 4.663271903991699, cross_entropy: 4.663271903991699
561
+ [2025-05-04 15:40:34,328][BaseTrainer][INFO] - [Epoch 1/10, Iter 74/3560] 4.632810592651367, cross_entropy: 4.632810592651367
562
+ [2025-05-04 15:40:35,392][BaseTrainer][INFO] - [Epoch 1/10, Iter 75/3560] 4.598424911499023, cross_entropy: 4.598424911499023
563
+ [2025-05-04 15:40:38,288][BaseTrainer][INFO] - [Epoch 1/10, Iter 76/3560] 4.5883259773254395, cross_entropy: 4.5883259773254395
564
+ [2025-05-04 15:40:39,340][BaseTrainer][INFO] - [Epoch 1/10, Iter 77/3560] 4.6476922035217285, cross_entropy: 4.6476922035217285
565
+ [2025-05-04 15:40:42,592][BaseTrainer][INFO] - [Epoch 1/10, Iter 78/3560] 4.831206798553467, cross_entropy: 4.831206798553467
566
+ [2025-05-04 15:40:43,642][BaseTrainer][INFO] - [Epoch 1/10, Iter 79/3560] 4.717135906219482, cross_entropy: 4.717135906219482
567
+ [2025-05-04 15:40:47,329][BaseTrainer][INFO] - [Epoch 1/10, Iter 80/3560] 4.5759453773498535, cross_entropy: 4.5759453773498535
568
+ [2025-05-04 15:40:48,376][BaseTrainer][INFO] - [Epoch 1/10, Iter 81/3560] 4.878664970397949, cross_entropy: 4.878664970397949
569
+ [2025-05-04 15:40:51,664][BaseTrainer][INFO] - [Epoch 1/10, Iter 82/3560] 4.611279487609863, cross_entropy: 4.611279487609863
570
+ [2025-05-04 15:40:52,698][BaseTrainer][INFO] - [Epoch 1/10, Iter 83/3560] 4.751533508300781, cross_entropy: 4.751533508300781
571
+ [2025-05-04 15:40:55,983][BaseTrainer][INFO] - [Epoch 1/10, Iter 84/3560] 4.457396984100342, cross_entropy: 4.457396984100342
572
+ [2025-05-04 15:40:57,029][BaseTrainer][INFO] - [Epoch 1/10, Iter 85/3560] 4.3965020179748535, cross_entropy: 4.3965020179748535
573
+ [2025-05-04 15:41:00,237][BaseTrainer][INFO] - [Epoch 1/10, Iter 86/3560] 4.610512733459473, cross_entropy: 4.610512733459473
574
+ [2025-05-04 15:41:01,276][BaseTrainer][INFO] - [Epoch 1/10, Iter 87/3560] 4.730538845062256, cross_entropy: 4.730538845062256
575
+ [2025-05-04 15:41:04,260][BaseTrainer][INFO] - [Epoch 1/10, Iter 88/3560] 4.436934471130371, cross_entropy: 4.436934471130371
576
+ [2025-05-04 15:41:05,302][BaseTrainer][INFO] - [Epoch 1/10, Iter 89/3560] 4.580585956573486, cross_entropy: 4.580585956573486
577
+ [2025-05-04 15:41:07,921][BaseTrainer][INFO] - [Epoch 1/10, Iter 90/3560] 4.503360748291016, cross_entropy: 4.503360748291016
578
+ [2025-05-04 15:41:08,952][BaseTrainer][INFO] - [Epoch 1/10, Iter 91/3560] 4.2591400146484375, cross_entropy: 4.2591400146484375
579
+ [2025-05-04 15:41:12,215][BaseTrainer][INFO] - [Epoch 1/10, Iter 92/3560] 4.643581390380859, cross_entropy: 4.643581390380859
580
+ [2025-05-04 15:41:13,274][BaseTrainer][INFO] - [Epoch 1/10, Iter 93/3560] 4.443016052246094, cross_entropy: 4.443016052246094
581
+ [2025-05-04 15:41:16,352][BaseTrainer][INFO] - [Epoch 1/10, Iter 94/3560] 4.586359977722168, cross_entropy: 4.586359977722168
582
+ [2025-05-04 15:41:17,403][BaseTrainer][INFO] - [Epoch 1/10, Iter 95/3560] 4.471511363983154, cross_entropy: 4.471511363983154
583
+ [2025-05-04 15:41:20,271][BaseTrainer][INFO] - [Epoch 1/10, Iter 96/3560] 4.5110931396484375, cross_entropy: 4.5110931396484375
584
+ [2025-05-04 15:41:21,302][BaseTrainer][INFO] - [Epoch 1/10, Iter 97/3560] 4.563319206237793, cross_entropy: 4.563319206237793
585
+ [2025-05-04 15:41:24,537][BaseTrainer][INFO] - [Epoch 1/10, Iter 98/3560] 4.672548294067383, cross_entropy: 4.672548294067383
586
+ [2025-05-04 15:41:25,569][BaseTrainer][INFO] - [Epoch 1/10, Iter 99/3560] 4.615906715393066, cross_entropy: 4.615906715393066
587
+ [2025-05-04 15:41:29,032][BaseTrainer][INFO] - [Epoch 1/10, Iter 100/3560] 4.54000997543335, cross_entropy: 4.54000997543335
588
+ [2025-05-04 15:41:30,065][BaseTrainer][INFO] - [Epoch 1/10, Iter 101/3560] 4.559946060180664, cross_entropy: 4.559946060180664
589
+ [2025-05-04 15:41:32,638][BaseTrainer][INFO] - [Epoch 1/10, Iter 102/3560] 4.563340187072754, cross_entropy: 4.563340187072754
590
+ [2025-05-04 15:41:33,691][BaseTrainer][INFO] - [Epoch 1/10, Iter 103/3560] 4.632449150085449, cross_entropy: 4.632449150085449
591
+ [2025-05-04 15:41:36,213][BaseTrainer][INFO] - [Epoch 1/10, Iter 104/3560] 4.516786575317383, cross_entropy: 4.516786575317383
592
+ [2025-05-04 15:41:37,271][BaseTrainer][INFO] - [Epoch 1/10, Iter 105/3560] 4.154585838317871, cross_entropy: 4.154585838317871
593
+ [2025-05-04 15:41:40,329][BaseTrainer][INFO] - [Epoch 1/10, Iter 106/3560] 4.5116777420043945, cross_entropy: 4.5116777420043945
594
+ [2025-05-04 15:41:41,384][BaseTrainer][INFO] - [Epoch 1/10, Iter 107/3560] 4.780315399169922, cross_entropy: 4.780315399169922
595
+ [2025-05-04 15:41:44,375][BaseTrainer][INFO] - [Epoch 1/10, Iter 108/3560] 4.336567401885986, cross_entropy: 4.336567401885986
596
+ [2025-05-04 15:41:45,410][BaseTrainer][INFO] - [Epoch 1/10, Iter 109/3560] 4.303857326507568, cross_entropy: 4.303857326507568
597
+ [2025-05-04 15:41:48,564][BaseTrainer][INFO] - [Epoch 1/10, Iter 110/3560] 4.627621173858643, cross_entropy: 4.627621173858643
598
+ [2025-05-04 15:41:49,628][BaseTrainer][INFO] - [Epoch 1/10, Iter 111/3560] 4.187675476074219, cross_entropy: 4.187675476074219
599
+ [2025-05-04 15:41:53,056][BaseTrainer][INFO] - [Epoch 1/10, Iter 112/3560] 4.382655143737793, cross_entropy: 4.382655143737793
600
+ [2025-05-04 15:41:54,103][BaseTrainer][INFO] - [Epoch 1/10, Iter 113/3560] 4.612682819366455, cross_entropy: 4.612682819366455
601
+ [2025-05-04 15:41:57,509][BaseTrainer][INFO] - [Epoch 1/10, Iter 114/3560] 4.566553115844727, cross_entropy: 4.566553115844727
602
+ [2025-05-04 15:41:58,566][BaseTrainer][INFO] - [Epoch 1/10, Iter 115/3560] 4.172827243804932, cross_entropy: 4.172827243804932
603
+ [2025-05-04 15:42:02,412][BaseTrainer][INFO] - [Epoch 1/10, Iter 116/3560] 4.707499027252197, cross_entropy: 4.707499027252197
604
+ [2025-05-04 15:42:03,462][BaseTrainer][INFO] - [Epoch 1/10, Iter 117/3560] 4.260464668273926, cross_entropy: 4.260464668273926
605
+ [2025-05-04 15:42:06,088][BaseTrainer][INFO] - [Epoch 1/10, Iter 118/3560] 4.38938045501709, cross_entropy: 4.38938045501709
606
+ [2025-05-04 15:42:07,123][BaseTrainer][INFO] - [Epoch 1/10, Iter 119/3560] 4.397576332092285, cross_entropy: 4.397576332092285
607
+ [2025-05-04 15:42:09,779][BaseTrainer][INFO] - [Epoch 1/10, Iter 120/3560] 4.336797714233398, cross_entropy: 4.336797714233398
608
+ [2025-05-04 15:42:10,810][BaseTrainer][INFO] - [Epoch 1/10, Iter 121/3560] 4.763151168823242, cross_entropy: 4.763151168823242
609
+ [2025-05-04 15:42:13,750][BaseTrainer][INFO] - [Epoch 1/10, Iter 122/3560] 4.443341255187988, cross_entropy: 4.443341255187988
610
+ [2025-05-04 15:42:14,798][BaseTrainer][INFO] - [Epoch 1/10, Iter 123/3560] 4.7432756423950195, cross_entropy: 4.7432756423950195
611
+ [2025-05-04 15:42:17,478][BaseTrainer][INFO] - [Epoch 1/10, Iter 124/3560] 4.287132263183594, cross_entropy: 4.287132263183594
612
+ [2025-05-04 15:42:18,533][BaseTrainer][INFO] - [Epoch 1/10, Iter 125/3560] 4.172094345092773, cross_entropy: 4.172094345092773
613
+ [2025-05-04 15:42:21,268][BaseTrainer][INFO] - [Epoch 1/10, Iter 126/3560] 4.541016101837158, cross_entropy: 4.541016101837158
614
+ [2025-05-04 15:42:22,325][BaseTrainer][INFO] - [Epoch 1/10, Iter 127/3560] 4.759217262268066, cross_entropy: 4.759217262268066
615
+ [2025-05-04 15:42:25,181][BaseTrainer][INFO] - [Epoch 1/10, Iter 128/3560] 4.484977722167969, cross_entropy: 4.484977722167969
616
+ [2025-05-04 15:42:26,231][BaseTrainer][INFO] - [Epoch 1/10, Iter 129/3560] 4.228219985961914, cross_entropy: 4.228219985961914
617
+ [2025-05-04 15:42:28,768][BaseTrainer][INFO] - [Epoch 1/10, Iter 130/3560] 4.172601699829102, cross_entropy: 4.172601699829102
618
+ [2025-05-04 15:42:29,830][BaseTrainer][INFO] - [Epoch 1/10, Iter 131/3560] 4.271279811859131, cross_entropy: 4.271279811859131
619
+ [2025-05-04 15:42:32,993][BaseTrainer][INFO] - [Epoch 1/10, Iter 132/3560] 4.379694938659668, cross_entropy: 4.379694938659668
620
+ [2025-05-04 15:42:34,026][BaseTrainer][INFO] - [Epoch 1/10, Iter 133/3560] 3.976646900177002, cross_entropy: 3.976646900177002
621
+ [2025-05-04 15:42:36,696][BaseTrainer][INFO] - [Epoch 1/10, Iter 134/3560] 4.2220282554626465, cross_entropy: 4.2220282554626465
622
+ [2025-05-04 15:42:37,735][BaseTrainer][INFO] - [Epoch 1/10, Iter 135/3560] 4.535080909729004, cross_entropy: 4.535080909729004
623
+ [2025-05-04 15:42:40,535][BaseTrainer][INFO] - [Epoch 1/10, Iter 136/3560] 4.378235816955566, cross_entropy: 4.378235816955566
624
+ [2025-05-04 15:42:41,574][BaseTrainer][INFO] - [Epoch 1/10, Iter 137/3560] 4.188755035400391, cross_entropy: 4.188755035400391
625
+ [2025-05-04 15:42:44,207][BaseTrainer][INFO] - [Epoch 1/10, Iter 138/3560] 4.1770477294921875, cross_entropy: 4.1770477294921875
626
+ [2025-05-04 15:42:45,241][BaseTrainer][INFO] - [Epoch 1/10, Iter 139/3560] 4.205631256103516, cross_entropy: 4.205631256103516
627
+ [2025-05-04 15:42:48,137][BaseTrainer][INFO] - [Epoch 1/10, Iter 140/3560] 4.314923286437988, cross_entropy: 4.314923286437988
628
+ [2025-05-04 15:42:49,193][BaseTrainer][INFO] - [Epoch 1/10, Iter 141/3560] 4.068292617797852, cross_entropy: 4.068292617797852
629
+ [2025-05-04 15:42:53,460][BaseTrainer][INFO] - [Epoch 1/10, Iter 142/3560] 4.10396671295166, cross_entropy: 4.10396671295166
630
+ [2025-05-04 15:42:54,513][BaseTrainer][INFO] - [Epoch 1/10, Iter 143/3560] 4.234318733215332, cross_entropy: 4.234318733215332
631
+ [2025-05-04 15:42:57,367][BaseTrainer][INFO] - [Epoch 1/10, Iter 144/3560] 4.579281806945801, cross_entropy: 4.579281806945801
632
+ [2025-05-04 15:42:58,400][BaseTrainer][INFO] - [Epoch 1/10, Iter 145/3560] 4.095945358276367, cross_entropy: 4.095945358276367
633
+ [2025-05-04 15:43:01,556][BaseTrainer][INFO] - [Epoch 1/10, Iter 146/3560] 4.273804664611816, cross_entropy: 4.273804664611816
634
+ [2025-05-04 15:43:02,612][BaseTrainer][INFO] - [Epoch 1/10, Iter 147/3560] 4.2708587646484375, cross_entropy: 4.2708587646484375
635
+ [2025-05-04 15:43:05,603][BaseTrainer][INFO] - [Epoch 1/10, Iter 148/3560] 3.9428675174713135, cross_entropy: 3.9428675174713135
636
+ [2025-05-04 15:43:06,658][BaseTrainer][INFO] - [Epoch 1/10, Iter 149/3560] 4.465112686157227, cross_entropy: 4.465112686157227
637
+ [2025-05-04 15:43:09,488][BaseTrainer][INFO] - [Epoch 1/10, Iter 150/3560] 4.029943943023682, cross_entropy: 4.029943943023682
638
+ [2025-05-04 15:43:10,519][BaseTrainer][INFO] - [Epoch 1/10, Iter 151/3560] 3.805082321166992, cross_entropy: 3.805082321166992
639
+ [2025-05-04 15:43:13,695][BaseTrainer][INFO] - [Epoch 1/10, Iter 152/3560] 3.835386276245117, cross_entropy: 3.835386276245117
640
+ [2025-05-04 15:43:14,727][BaseTrainer][INFO] - [Epoch 1/10, Iter 153/3560] 4.022421836853027, cross_entropy: 4.022421836853027
641
+ [2025-05-04 15:43:17,758][BaseTrainer][INFO] - [Epoch 1/10, Iter 154/3560] 4.256207466125488, cross_entropy: 4.256207466125488
642
+ [2025-05-04 15:43:18,811][BaseTrainer][INFO] - [Epoch 1/10, Iter 155/3560] 3.787353038787842, cross_entropy: 3.787353038787842
643
+ [2025-05-04 15:43:22,536][BaseTrainer][INFO] - [Epoch 1/10, Iter 156/3560] 3.814817428588867, cross_entropy: 3.814817428588867
644
+ [2025-05-04 15:43:23,588][BaseTrainer][INFO] - [Epoch 1/10, Iter 157/3560] 4.2981061935424805, cross_entropy: 4.2981061935424805
645
+ [2025-05-04 15:43:26,502][BaseTrainer][INFO] - [Epoch 1/10, Iter 158/3560] 4.3363037109375, cross_entropy: 4.3363037109375
646
+ [2025-05-04 15:43:27,561][BaseTrainer][INFO] - [Epoch 1/10, Iter 159/3560] 3.993943214416504, cross_entropy: 3.993943214416504
647
+ [2025-05-04 15:43:31,023][BaseTrainer][INFO] - [Epoch 1/10, Iter 160/3560] 3.8628408908843994, cross_entropy: 3.8628408908843994
648
+ [2025-05-04 15:43:32,057][BaseTrainer][INFO] - [Epoch 1/10, Iter 161/3560] 3.870957851409912, cross_entropy: 3.870957851409912
649
+ [2025-05-04 15:43:35,409][BaseTrainer][INFO] - [Epoch 1/10, Iter 162/3560] 3.827423572540283, cross_entropy: 3.827423572540283
650
+ [2025-05-04 15:43:36,447][BaseTrainer][INFO] - [Epoch 1/10, Iter 163/3560] 3.8112123012542725, cross_entropy: 3.8112123012542725
651
+ [2025-05-04 15:43:39,359][BaseTrainer][INFO] - [Epoch 1/10, Iter 164/3560] 3.975653886795044, cross_entropy: 3.975653886795044
+ [2025-05-04 15:43:40,393][BaseTrainer][INFO] - [Epoch 1/10, Iter 165/3560] 4.278404712677002, cross_entropy: 4.278404712677002
+ [2025-05-04 15:43:43,182][BaseTrainer][INFO] - [Epoch 1/10, Iter 166/3560] 4.291914939880371, cross_entropy: 4.291914939880371
+ [2025-05-04 15:43:44,215][BaseTrainer][INFO] - [Epoch 1/10, Iter 167/3560] 3.8860697746276855, cross_entropy: 3.8860697746276855
+ [2025-05-04 15:43:47,243][BaseTrainer][INFO] - [Epoch 1/10, Iter 168/3560] 3.97953200340271, cross_entropy: 3.97953200340271
+ [2025-05-04 15:43:48,273][BaseTrainer][INFO] - [Epoch 1/10, Iter 169/3560] 4.418849945068359, cross_entropy: 4.418849945068359
+ [2025-05-04 15:43:52,155][BaseTrainer][INFO] - [Epoch 1/10, Iter 170/3560] 3.99082612991333, cross_entropy: 3.99082612991333
+ [2025-05-04 15:43:53,212][BaseTrainer][INFO] - [Epoch 1/10, Iter 171/3560] 3.6433072090148926, cross_entropy: 3.6433072090148926
+ [2025-05-04 15:43:56,167][BaseTrainer][INFO] - [Epoch 1/10, Iter 172/3560] 3.890350818634033, cross_entropy: 3.890350818634033
+ [2025-05-04 15:43:57,217][BaseTrainer][INFO] - [Epoch 1/10, Iter 173/3560] 3.7742199897766113, cross_entropy: 3.7742199897766113
+ [2025-05-04 15:44:00,341][BaseTrainer][INFO] - [Epoch 1/10, Iter 174/3560] 4.139997959136963, cross_entropy: 4.139997959136963
+ [2025-05-04 15:44:01,397][BaseTrainer][INFO] - [Epoch 1/10, Iter 175/3560] 4.463659286499023, cross_entropy: 4.463659286499023
+ [2025-05-04 15:44:04,941][BaseTrainer][INFO] - [Epoch 1/10, Iter 176/3560] 3.935756206512451, cross_entropy: 3.935756206512451
+ [2025-05-04 15:44:05,996][BaseTrainer][INFO] - [Epoch 1/10, Iter 177/3560] 3.7262871265411377, cross_entropy: 3.7262871265411377
+ [2025-05-04 15:44:09,246][BaseTrainer][INFO] - [Epoch 1/10, Iter 178/3560] 3.7987825870513916, cross_entropy: 3.7987825870513916
+ [2025-05-04 15:44:10,308][BaseTrainer][INFO] - [Epoch 1/10, Iter 179/3560] 3.6240315437316895, cross_entropy: 3.6240315437316895
+ [2025-05-04 15:44:13,550][BaseTrainer][INFO] - [Epoch 1/10, Iter 180/3560] 3.700782060623169, cross_entropy: 3.700782060623169
+ [2025-05-04 15:44:14,581][BaseTrainer][INFO] - [Epoch 1/10, Iter 181/3560] 3.962472438812256, cross_entropy: 3.962472438812256
+ [2025-05-04 15:44:18,026][BaseTrainer][INFO] - [Epoch 1/10, Iter 182/3560] 3.7632155418395996, cross_entropy: 3.7632155418395996
+ [2025-05-04 15:44:19,058][BaseTrainer][INFO] - [Epoch 1/10, Iter 183/3560] 3.956826686859131, cross_entropy: 3.956826686859131
+ [2025-05-04 15:44:23,384][BaseTrainer][INFO] - [Epoch 1/10, Iter 184/3560] 4.246981620788574, cross_entropy: 4.246981620788574
+ [2025-05-04 15:44:24,441][BaseTrainer][INFO] - [Epoch 1/10, Iter 185/3560] 3.8731627464294434, cross_entropy: 3.8731627464294434
+ [2025-05-04 15:44:27,344][BaseTrainer][INFO] - [Epoch 1/10, Iter 186/3560] 4.063902378082275, cross_entropy: 4.063902378082275
+ [2025-05-04 15:44:28,377][BaseTrainer][INFO] - [Epoch 1/10, Iter 187/3560] 3.6894967555999756, cross_entropy: 3.6894967555999756
+ [2025-05-04 15:44:32,188][BaseTrainer][INFO] - [Epoch 1/10, Iter 188/3560] 4.099838733673096, cross_entropy: 4.099838733673096
+ [2025-05-04 15:44:33,243][BaseTrainer][INFO] - [Epoch 1/10, Iter 189/3560] 4.2275166511535645, cross_entropy: 4.2275166511535645
+ [2025-05-04 15:44:35,945][BaseTrainer][INFO] - [Epoch 1/10, Iter 190/3560] 3.468061923980713, cross_entropy: 3.468061923980713
+ [2025-05-04 15:44:36,977][BaseTrainer][INFO] - [Epoch 1/10, Iter 191/3560] 4.105533599853516, cross_entropy: 4.105533599853516
+ [2025-05-04 15:44:39,745][BaseTrainer][INFO] - [Epoch 1/10, Iter 192/3560] 3.783463716506958, cross_entropy: 3.783463716506958
+ [2025-05-04 15:44:40,797][BaseTrainer][INFO] - [Epoch 1/10, Iter 193/3560] 4.128844738006592, cross_entropy: 4.128844738006592
+ [2025-05-04 15:44:43,553][BaseTrainer][INFO] - [Epoch 1/10, Iter 194/3560] 3.705315351486206, cross_entropy: 3.705315351486206
+ [2025-05-04 15:44:44,584][BaseTrainer][INFO] - [Epoch 1/10, Iter 195/3560] 4.0855913162231445, cross_entropy: 4.0855913162231445
+ [2025-05-04 15:44:47,041][BaseTrainer][INFO] - [Epoch 1/10, Iter 196/3560] 3.921950578689575, cross_entropy: 3.921950578689575
+ [2025-05-04 15:44:48,223][BaseTrainer][INFO] - [Epoch 1/10, Iter 197/3560] 4.081811428070068, cross_entropy: 4.081811428070068
+ [2025-05-04 15:44:51,484][BaseTrainer][INFO] - [Epoch 1/10, Iter 198/3560] 3.353686571121216, cross_entropy: 3.353686571121216
+ [2025-05-04 15:44:52,563][BaseTrainer][INFO] - [Epoch 1/10, Iter 199/3560] 3.644573450088501, cross_entropy: 3.644573450088501
+ [2025-05-04 15:44:56,034][BaseTrainer][INFO] - [Epoch 1/10, Iter 200/3560] 3.7141542434692383, cross_entropy: 3.7141542434692383
+ [2025-05-04 15:44:57,084][BaseTrainer][INFO] - [Epoch 1/10, Iter 201/3560] 3.6063575744628906, cross_entropy: 3.6063575744628906
+ [2025-05-04 15:45:00,073][BaseTrainer][INFO] - [Epoch 1/10, Iter 202/3560] 3.51749849319458, cross_entropy: 3.51749849319458
+ [2025-05-04 15:45:01,284][BaseTrainer][INFO] - [Epoch 1/10, Iter 203/3560] 4.176861763000488, cross_entropy: 4.176861763000488
+ [2025-05-04 15:45:04,357][BaseTrainer][INFO] - [Epoch 1/10, Iter 204/3560] 3.717007875442505, cross_entropy: 3.717007875442505
+ [2025-05-04 15:45:05,403][BaseTrainer][INFO] - [Epoch 1/10, Iter 205/3560] 4.427414894104004, cross_entropy: 4.427414894104004
+ [2025-05-04 15:45:08,656][BaseTrainer][INFO] - [Epoch 1/10, Iter 206/3560] 3.8307533264160156, cross_entropy: 3.8307533264160156
+ [2025-05-04 15:45:09,853][BaseTrainer][INFO] - [Epoch 1/10, Iter 207/3560] 3.3228912353515625, cross_entropy: 3.3228912353515625
+ [2025-05-04 15:45:12,957][BaseTrainer][INFO] - [Epoch 1/10, Iter 208/3560] 3.6359074115753174, cross_entropy: 3.6359074115753174
+ [2025-05-04 15:45:14,007][BaseTrainer][INFO] - [Epoch 1/10, Iter 209/3560] 3.6332781314849854, cross_entropy: 3.6332781314849854
+ [2025-05-04 15:45:16,652][BaseTrainer][INFO] - [Epoch 1/10, Iter 210/3560] 3.547818183898926, cross_entropy: 3.547818183898926
+ [2025-05-04 15:45:17,687][BaseTrainer][INFO] - [Epoch 1/10, Iter 211/3560] 3.799422264099121, cross_entropy: 3.799422264099121
+ [2025-05-04 15:45:20,857][BaseTrainer][INFO] - [Epoch 1/10, Iter 212/3560] 3.9624199867248535, cross_entropy: 3.9624199867248535
+ [2025-05-04 15:45:21,986][BaseTrainer][INFO] - [Epoch 1/10, Iter 213/3560] 4.425595283508301, cross_entropy: 4.425595283508301
+ [2025-05-04 15:45:25,283][BaseTrainer][INFO] - [Epoch 1/10, Iter 214/3560] 3.5650367736816406, cross_entropy: 3.5650367736816406
+ [2025-05-04 15:45:26,335][BaseTrainer][INFO] - [Epoch 1/10, Iter 215/3560] 3.262298107147217, cross_entropy: 3.262298107147217
+ [2025-05-04 15:45:29,442][BaseTrainer][INFO] - [Epoch 1/10, Iter 216/3560] 4.325966835021973, cross_entropy: 4.325966835021973
+ [2025-05-04 15:45:30,496][BaseTrainer][INFO] - [Epoch 1/10, Iter 217/3560] 3.4640183448791504, cross_entropy: 3.4640183448791504
+ [2025-05-04 15:45:33,518][BaseTrainer][INFO] - [Epoch 1/10, Iter 218/3560] 3.00398588180542, cross_entropy: 3.00398588180542
+ [2025-05-04 15:45:34,571][BaseTrainer][INFO] - [Epoch 1/10, Iter 219/3560] 4.193729400634766, cross_entropy: 4.193729400634766
+ [2025-05-04 15:45:38,244][BaseTrainer][INFO] - [Epoch 1/10, Iter 220/3560] 4.179624557495117, cross_entropy: 4.179624557495117
+ [2025-05-04 15:45:39,346][BaseTrainer][INFO] - [Epoch 1/10, Iter 221/3560] 3.7965245246887207, cross_entropy: 3.7965245246887207
+ [2025-05-04 15:45:42,460][BaseTrainer][INFO] - [Epoch 1/10, Iter 222/3560] 3.20782470703125, cross_entropy: 3.20782470703125
+ [2025-05-04 15:45:43,509][BaseTrainer][INFO] - [Epoch 1/10, Iter 223/3560] 4.157542705535889, cross_entropy: 4.157542705535889
+ [2025-05-04 15:45:45,875][BaseTrainer][INFO] - [Epoch 1/10, Iter 224/3560] 3.989438533782959, cross_entropy: 3.989438533782959
+ [2025-05-04 15:45:47,171][BaseTrainer][INFO] - [Epoch 1/10, Iter 225/3560] 3.6186747550964355, cross_entropy: 3.6186747550964355
+ [2025-05-04 15:45:50,109][BaseTrainer][INFO] - [Epoch 1/10, Iter 226/3560] 3.1917433738708496, cross_entropy: 3.1917433738708496
+ [2025-05-04 15:45:51,730][BaseTrainer][INFO] - [Epoch 1/10, Iter 227/3560] 3.5871357917785645, cross_entropy: 3.5871357917785645
+ [2025-05-04 15:45:54,999][BaseTrainer][INFO] - [Epoch 1/10, Iter 228/3560] 3.428403377532959, cross_entropy: 3.428403377532959
+ [2025-05-04 15:45:56,089][BaseTrainer][INFO] - [Epoch 1/10, Iter 229/3560] 3.632758617401123, cross_entropy: 3.632758617401123
+ [2025-05-04 15:45:58,615][BaseTrainer][INFO] - [Epoch 1/10, Iter 230/3560] 3.972907066345215, cross_entropy: 3.972907066345215
+ [2025-05-04 15:45:59,732][BaseTrainer][INFO] - [Epoch 1/10, Iter 231/3560] 3.840569257736206, cross_entropy: 3.840569257736206
+ [2025-05-04 15:46:03,151][BaseTrainer][INFO] - [Epoch 1/10, Iter 232/3560] 3.692924976348877, cross_entropy: 3.692924976348877
+ [2025-05-04 15:46:04,208][BaseTrainer][INFO] - [Epoch 1/10, Iter 233/3560] 3.61606502532959, cross_entropy: 3.61606502532959
+ [2025-05-04 15:46:08,442][BaseTrainer][INFO] - [Epoch 1/10, Iter 234/3560] 3.608592987060547, cross_entropy: 3.608592987060547
+ [2025-05-04 15:46:09,543][BaseTrainer][INFO] - [Epoch 1/10, Iter 235/3560] 3.3924107551574707, cross_entropy: 3.3924107551574707
+ [2025-05-04 15:46:12,410][BaseTrainer][INFO] - [Epoch 1/10, Iter 236/3560] 3.389481544494629, cross_entropy: 3.389481544494629
+ [2025-05-04 15:46:13,458][BaseTrainer][INFO] - [Epoch 1/10, Iter 237/3560] 3.601651906967163, cross_entropy: 3.601651906967163
+ [2025-05-04 15:46:16,231][BaseTrainer][INFO] - [Epoch 1/10, Iter 238/3560] 3.541386604309082, cross_entropy: 3.541386604309082
+ [2025-05-04 15:46:17,281][BaseTrainer][INFO] - [Epoch 1/10, Iter 239/3560] 3.364543914794922, cross_entropy: 3.364543914794922
+ [2025-05-04 15:46:20,548][BaseTrainer][INFO] - [Epoch 1/10, Iter 240/3560] 3.4454920291900635, cross_entropy: 3.4454920291900635
+ [2025-05-04 15:46:21,604][BaseTrainer][INFO] - [Epoch 1/10, Iter 241/3560] 3.2512612342834473, cross_entropy: 3.2512612342834473
+ [2025-05-04 15:46:25,509][BaseTrainer][INFO] - [Epoch 1/10, Iter 242/3560] 3.982595443725586, cross_entropy: 3.982595443725586
+ [2025-05-04 15:46:26,564][BaseTrainer][INFO] - [Epoch 1/10, Iter 243/3560] 3.024195671081543, cross_entropy: 3.024195671081543
+ [2025-05-04 15:46:30,189][BaseTrainer][INFO] - [Epoch 1/10, Iter 244/3560] 3.38856840133667, cross_entropy: 3.38856840133667
+ [2025-05-04 15:46:31,222][BaseTrainer][INFO] - [Epoch 1/10, Iter 245/3560] 3.58903169631958, cross_entropy: 3.58903169631958
+ [2025-05-04 15:46:33,952][BaseTrainer][INFO] - [Epoch 1/10, Iter 246/3560] 3.0575690269470215, cross_entropy: 3.0575690269470215
+ [2025-05-04 15:46:35,005][BaseTrainer][INFO] - [Epoch 1/10, Iter 247/3560] 3.803276300430298, cross_entropy: 3.803276300430298
+ [2025-05-04 15:46:38,371][BaseTrainer][INFO] - [Epoch 1/10, Iter 248/3560] 4.254518508911133, cross_entropy: 4.254518508911133
+ [2025-05-04 15:46:39,430][BaseTrainer][INFO] - [Epoch 1/10, Iter 249/3560] 3.6299855709075928, cross_entropy: 3.6299855709075928
+ [2025-05-04 15:46:42,489][BaseTrainer][INFO] - [Epoch 1/10, Iter 250/3560] 4.0379180908203125, cross_entropy: 4.0379180908203125
+ [2025-05-04 15:46:43,543][BaseTrainer][INFO] - [Epoch 1/10, Iter 251/3560] 3.455204486846924, cross_entropy: 3.455204486846924
+ [2025-05-04 15:46:47,003][BaseTrainer][INFO] - [Epoch 1/10, Iter 252/3560] 3.641664743423462, cross_entropy: 3.641664743423462
+ [2025-05-04 15:46:48,054][BaseTrainer][INFO] - [Epoch 1/10, Iter 253/3560] 3.341184616088867, cross_entropy: 3.341184616088867
+ [2025-05-04 15:46:50,934][BaseTrainer][INFO] - [Epoch 1/10, Iter 254/3560] 3.268005609512329, cross_entropy: 3.268005609512329
+ [2025-05-04 15:46:51,968][BaseTrainer][INFO] - [Epoch 1/10, Iter 255/3560] 2.850250720977783, cross_entropy: 2.850250720977783
+ [2025-05-04 15:46:55,426][BaseTrainer][INFO] - [Epoch 1/10, Iter 256/3560] 3.390626907348633, cross_entropy: 3.390626907348633
+ [2025-05-04 15:46:56,480][BaseTrainer][INFO] - [Epoch 1/10, Iter 257/3560] 4.221792221069336, cross_entropy: 4.221792221069336
+ [2025-05-04 15:46:59,679][BaseTrainer][INFO] - [Epoch 1/10, Iter 258/3560] 3.77260160446167, cross_entropy: 3.77260160446167
+ [2025-05-04 15:47:00,734][BaseTrainer][INFO] - [Epoch 1/10, Iter 259/3560] 3.8018369674682617, cross_entropy: 3.8018369674682617
+ [2025-05-04 15:47:04,046][BaseTrainer][INFO] - [Epoch 1/10, Iter 260/3560] 3.230081558227539, cross_entropy: 3.230081558227539
+ [2025-05-04 15:47:05,102][BaseTrainer][INFO] - [Epoch 1/10, Iter 261/3560] 2.9646286964416504, cross_entropy: 2.9646286964416504
+ [2025-05-04 15:47:07,788][BaseTrainer][INFO] - [Epoch 1/10, Iter 262/3560] 4.133569717407227, cross_entropy: 4.133569717407227
+ [2025-05-04 15:47:08,830][BaseTrainer][INFO] - [Epoch 1/10, Iter 263/3560] 3.628683090209961, cross_entropy: 3.628683090209961
+ [2025-05-04 15:47:12,809][BaseTrainer][INFO] - [Epoch 1/10, Iter 264/3560] 3.167900562286377, cross_entropy: 3.167900562286377
+ [2025-05-04 15:47:13,868][BaseTrainer][INFO] - [Epoch 1/10, Iter 265/3560] 3.3190231323242188, cross_entropy: 3.3190231323242188
+ [2025-05-04 15:47:16,945][BaseTrainer][INFO] - [Epoch 1/10, Iter 266/3560] 3.4320666790008545, cross_entropy: 3.4320666790008545
+ [2025-05-04 15:47:18,014][BaseTrainer][INFO] - [Epoch 1/10, Iter 267/3560] 3.4276881217956543, cross_entropy: 3.4276881217956543
+ [2025-05-04 15:47:21,062][BaseTrainer][INFO] - [Epoch 1/10, Iter 268/3560] 2.7244088649749756, cross_entropy: 2.7244088649749756
+ [2025-05-04 15:47:22,108][BaseTrainer][INFO] - [Epoch 1/10, Iter 269/3560] 3.503838062286377, cross_entropy: 3.503838062286377
+ [2025-05-04 15:47:24,722][BaseTrainer][INFO] - [Epoch 1/10, Iter 270/3560] 4.146647930145264, cross_entropy: 4.146647930145264
+ [2025-05-04 15:47:25,778][BaseTrainer][INFO] - [Epoch 1/10, Iter 271/3560] 4.013218402862549, cross_entropy: 4.013218402862549
+ [2025-05-04 15:47:28,913][BaseTrainer][INFO] - [Epoch 1/10, Iter 272/3560] 3.860719919204712, cross_entropy: 3.860719919204712
+ [2025-05-04 15:47:29,976][BaseTrainer][INFO] - [Epoch 1/10, Iter 273/3560] 3.8546671867370605, cross_entropy: 3.8546671867370605
+ [2025-05-04 15:47:33,011][BaseTrainer][INFO] - [Epoch 1/10, Iter 274/3560] 3.229749917984009, cross_entropy: 3.229749917984009
+ [2025-05-04 15:47:34,073][BaseTrainer][INFO] - [Epoch 1/10, Iter 275/3560] 4.244211196899414, cross_entropy: 4.244211196899414
+ [2025-05-04 15:47:36,797][BaseTrainer][INFO] - [Epoch 1/10, Iter 276/3560] 3.8821959495544434, cross_entropy: 3.8821959495544434
+ [2025-05-04 15:47:37,828][BaseTrainer][INFO] - [Epoch 1/10, Iter 277/3560] 3.31406307220459, cross_entropy: 3.31406307220459
+ [2025-05-04 15:47:41,369][BaseTrainer][INFO] - [Epoch 1/10, Iter 278/3560] 3.015435218811035, cross_entropy: 3.015435218811035
+ [2025-05-04 15:47:42,419][BaseTrainer][INFO] - [Epoch 1/10, Iter 279/3560] 4.190752029418945, cross_entropy: 4.190752029418945
+ [2025-05-04 15:47:45,365][BaseTrainer][INFO] - [Epoch 1/10, Iter 280/3560] 3.2984061241149902, cross_entropy: 3.2984061241149902
+ [2025-05-04 15:47:46,416][BaseTrainer][INFO] - [Epoch 1/10, Iter 281/3560] 3.5504894256591797, cross_entropy: 3.5504894256591797
+ [2025-05-04 15:47:49,376][BaseTrainer][INFO] - [Epoch 1/10, Iter 282/3560] 3.105098009109497, cross_entropy: 3.105098009109497
+ [2025-05-04 15:47:50,428][BaseTrainer][INFO] - [Epoch 1/10, Iter 283/3560] 3.6767663955688477, cross_entropy: 3.6767663955688477
+ [2025-05-04 15:47:53,486][BaseTrainer][INFO] - [Epoch 1/10, Iter 284/3560] 3.078446865081787, cross_entropy: 3.078446865081787
+ [2025-05-04 15:47:54,536][BaseTrainer][INFO] - [Epoch 1/10, Iter 285/3560] 2.937081813812256, cross_entropy: 2.937081813812256
+ [2025-05-04 15:47:57,459][BaseTrainer][INFO] - [Epoch 1/10, Iter 286/3560] 3.384705066680908, cross_entropy: 3.384705066680908
+ [2025-05-04 15:47:58,496][BaseTrainer][INFO] - [Epoch 1/10, Iter 287/3560] 3.093351125717163, cross_entropy: 3.093351125717163
+ [2025-05-04 15:48:01,395][BaseTrainer][INFO] - [Epoch 1/10, Iter 288/3560] 2.6946282386779785, cross_entropy: 2.6946282386779785
+ [2025-05-04 15:48:02,449][BaseTrainer][INFO] - [Epoch 1/10, Iter 289/3560] 3.7547240257263184, cross_entropy: 3.7547240257263184
+ [2025-05-04 15:48:06,235][BaseTrainer][INFO] - [Epoch 1/10, Iter 290/3560] 2.8298704624176025, cross_entropy: 2.8298704624176025
+ [2025-05-04 15:48:07,269][BaseTrainer][INFO] - [Epoch 1/10, Iter 291/3560] 3.0255863666534424, cross_entropy: 3.0255863666534424
+ [2025-05-04 15:48:10,384][BaseTrainer][INFO] - [Epoch 1/10, Iter 292/3560] 3.471223831176758, cross_entropy: 3.471223831176758
+ [2025-05-04 15:48:11,439][BaseTrainer][INFO] - [Epoch 1/10, Iter 293/3560] 4.027524948120117, cross_entropy: 4.027524948120117
+ [2025-05-04 15:48:13,751][BaseTrainer][INFO] - [Epoch 1/10, Iter 294/3560] 2.9611196517944336, cross_entropy: 2.9611196517944336
+ [2025-05-04 15:48:14,788][BaseTrainer][INFO] - [Epoch 1/10, Iter 295/3560] 2.9723658561706543, cross_entropy: 2.9723658561706543
+ [2025-05-04 15:48:18,471][BaseTrainer][INFO] - [Epoch 1/10, Iter 296/3560] 3.3414180278778076, cross_entropy: 3.3414180278778076
+ [2025-05-04 15:48:19,576][BaseTrainer][INFO] - [Epoch 1/10, Iter 297/3560] 4.208791732788086, cross_entropy: 4.208791732788086
+ [2025-05-04 15:48:22,848][BaseTrainer][INFO] - [Epoch 1/10, Iter 298/3560] 2.780437707901001, cross_entropy: 2.780437707901001
+ [2025-05-04 15:48:23,886][BaseTrainer][INFO] - [Epoch 1/10, Iter 299/3560] 3.171988010406494, cross_entropy: 3.171988010406494
+ [2025-05-04 15:48:26,473][BaseTrainer][INFO] - [Epoch 1/10, Iter 300/3560] 2.800706386566162, cross_entropy: 2.800706386566162
+ [2025-05-04 15:48:27,553][BaseTrainer][INFO] - [Epoch 1/10, Iter 301/3560] 2.5326716899871826, cross_entropy: 2.5326716899871826
+ [2025-05-04 15:48:31,087][BaseTrainer][INFO] - [Epoch 1/10, Iter 302/3560] 3.0759329795837402, cross_entropy: 3.0759329795837402
+ [2025-05-04 15:48:32,170][BaseTrainer][INFO] - [Epoch 1/10, Iter 303/3560] 3.110572576522827, cross_entropy: 3.110572576522827
+ [2025-05-04 15:48:35,517][BaseTrainer][INFO] - [Epoch 1/10, Iter 304/3560] 2.567490816116333, cross_entropy: 2.567490816116333
+ [2025-05-04 15:48:36,725][BaseTrainer][INFO] - [Epoch 1/10, Iter 305/3560] 3.407499074935913, cross_entropy: 3.407499074935913
+ [2025-05-04 15:48:39,501][BaseTrainer][INFO] - [Epoch 1/10, Iter 306/3560] 2.7290828227996826, cross_entropy: 2.7290828227996826
+ [2025-05-04 15:48:41,177][BaseTrainer][INFO] - [Epoch 1/10, Iter 307/3560] 3.153337001800537, cross_entropy: 3.153337001800537
+ [2025-05-04 15:48:43,159][BaseTrainer][INFO] - [Epoch 1/10, Iter 308/3560] 3.1342296600341797, cross_entropy: 3.1342296600341797
+ [2025-05-04 15:48:45,243][BaseTrainer][INFO] - [Epoch 1/10, Iter 309/3560] 2.606759548187256, cross_entropy: 2.606759548187256
+ [2025-05-04 15:48:46,543][BaseTrainer][INFO] - [Epoch 1/10, Iter 310/3560] 2.7439932823181152, cross_entropy: 2.7439932823181152
+ [2025-05-04 15:48:49,531][BaseTrainer][INFO] - [Epoch 1/10, Iter 311/3560] 3.740546226501465, cross_entropy: 3.740546226501465
+ [2025-05-04 15:48:50,634][BaseTrainer][INFO] - [Epoch 1/10, Iter 312/3560] 3.289177417755127, cross_entropy: 3.289177417755127
+ [2025-05-04 15:48:53,534][BaseTrainer][INFO] - [Epoch 1/10, Iter 313/3560] 3.084707736968994, cross_entropy: 3.084707736968994
+ [2025-05-04 15:48:54,815][BaseTrainer][INFO] - [Epoch 1/10, Iter 314/3560] 3.2798752784729004, cross_entropy: 3.2798752784729004
+ [2025-05-04 15:48:57,330][BaseTrainer][INFO] - [Epoch 1/10, Iter 315/3560] 2.9585819244384766, cross_entropy: 2.9585819244384766
+ [2025-05-04 15:48:59,217][BaseTrainer][INFO] - [Epoch 1/10, Iter 316/3560] 3.504563808441162, cross_entropy: 3.504563808441162
+ [2025-05-04 15:49:01,274][BaseTrainer][INFO] - [Epoch 1/10, Iter 317/3560] 2.982103109359741, cross_entropy: 2.982103109359741
+ [2025-05-04 15:49:03,260][BaseTrainer][INFO] - [Epoch 1/10, Iter 318/3560] 3.3061461448669434, cross_entropy: 3.3061461448669434
+ [2025-05-04 15:49:05,403][BaseTrainer][INFO] - [Epoch 1/10, Iter 319/3560] 3.3542447090148926, cross_entropy: 3.3542447090148926
+ [2025-05-04 15:49:07,758][BaseTrainer][INFO] - [Epoch 1/10, Iter 320/3560] 2.830873966217041, cross_entropy: 2.830873966217041
+ [2025-05-04 15:49:09,518][BaseTrainer][INFO] - [Epoch 1/10, Iter 321/3560] 3.878793954849243, cross_entropy: 3.878793954849243
+ [2025-05-04 15:49:11,693][BaseTrainer][INFO] - [Epoch 1/10, Iter 322/3560] 3.0356221199035645, cross_entropy: 3.0356221199035645
+ [2025-05-04 15:49:13,799][BaseTrainer][INFO] - [Epoch 1/10, Iter 323/3560] 2.5573325157165527, cross_entropy: 2.5573325157165527
+ [2025-05-04 15:49:15,908][BaseTrainer][INFO] - [Epoch 1/10, Iter 324/3560] 3.7650721073150635, cross_entropy: 3.7650721073150635
+ [2025-05-04 15:49:17,897][BaseTrainer][INFO] - [Epoch 1/10, Iter 325/3560] 3.0595507621765137, cross_entropy: 3.0595507621765137
+ [2025-05-04 15:49:20,665][BaseTrainer][INFO] - [Epoch 1/10, Iter 326/3560] 2.892014741897583, cross_entropy: 2.892014741897583
+ [2025-05-04 15:49:22,147][BaseTrainer][INFO] - [Epoch 1/10, Iter 327/3560] 2.8922882080078125, cross_entropy: 2.8922882080078125
+ [2025-05-04 15:49:24,585][BaseTrainer][INFO] - [Epoch 1/10, Iter 328/3560] 3.8311381340026855, cross_entropy: 3.8311381340026855
+ [2025-05-04 15:49:26,335][BaseTrainer][INFO] - [Epoch 1/10, Iter 329/3560] 3.166689157485962, cross_entropy: 3.166689157485962
+ [2025-05-04 15:49:28,684][BaseTrainer][INFO] - [Epoch 1/10, Iter 330/3560] 3.9570493698120117, cross_entropy: 3.9570493698120117
+ [2025-05-04 15:49:29,781][BaseTrainer][INFO] - [Epoch 1/10, Iter 331/3560] 3.397995948791504, cross_entropy: 3.397995948791504
+ [2025-05-04 15:49:33,002][BaseTrainer][INFO] - [Epoch 1/10, Iter 332/3560] 2.5950450897216797, cross_entropy: 2.5950450897216797
+ [2025-05-04 15:49:34,073][BaseTrainer][INFO] - [Epoch 1/10, Iter 333/3560] 3.4235217571258545, cross_entropy: 3.4235217571258545
+ [2025-05-04 15:49:36,793][BaseTrainer][INFO] - [Epoch 1/10, Iter 334/3560] 2.8502418994903564, cross_entropy: 2.8502418994903564
+ [2025-05-04 15:49:37,986][BaseTrainer][INFO] - [Epoch 1/10, Iter 335/3560] 3.467989921569824, cross_entropy: 3.467989921569824
+ [2025-05-04 15:49:40,788][BaseTrainer][INFO] - [Epoch 1/10, Iter 336/3560] 2.778249740600586, cross_entropy: 2.778249740600586
+ [2025-05-04 15:49:41,886][BaseTrainer][INFO] - [Epoch 1/10, Iter 337/3560] 2.4447736740112305, cross_entropy: 2.4447736740112305
+ [2025-05-04 15:49:45,446][BaseTrainer][INFO] - [Epoch 1/10, Iter 338/3560] 2.629225254058838, cross_entropy: 2.629225254058838
+ [2025-05-04 15:49:46,480][BaseTrainer][INFO] - [Epoch 1/10, Iter 339/3560] 2.776583194732666, cross_entropy: 2.776583194732666
+ [2025-05-04 15:49:49,778][BaseTrainer][INFO] - [Epoch 1/10, Iter 340/3560] 2.8193931579589844, cross_entropy: 2.8193931579589844
+ [2025-05-04 15:49:50,837][BaseTrainer][INFO] - [Epoch 1/10, Iter 341/3560] 3.5558292865753174, cross_entropy: 3.5558292865753174
+ [2025-05-04 15:49:54,146][BaseTrainer][INFO] - [Epoch 1/10, Iter 342/3560] 3.3698346614837646, cross_entropy: 3.3698346614837646
+ [2025-05-04 15:49:55,200][BaseTrainer][INFO] - [Epoch 1/10, Iter 343/3560] 3.103781223297119, cross_entropy: 3.103781223297119
+ [2025-05-04 15:49:58,185][BaseTrainer][INFO] - [Epoch 1/10, Iter 344/3560] 2.4396743774414062, cross_entropy: 2.4396743774414062
+ [2025-05-04 15:49:59,219][BaseTrainer][INFO] - [Epoch 1/10, Iter 345/3560] 3.6696176528930664, cross_entropy: 3.6696176528930664
+ [2025-05-04 15:50:02,898][BaseTrainer][INFO] - [Epoch 1/10, Iter 346/3560] 3.1565327644348145, cross_entropy: 3.1565327644348145
+ [2025-05-04 15:50:03,953][BaseTrainer][INFO] - [Epoch 1/10, Iter 347/3560] 2.7521986961364746, cross_entropy: 2.7521986961364746
+ [2025-05-04 15:50:07,638][BaseTrainer][INFO] - [Epoch 1/10, Iter 348/3560] 2.695011854171753, cross_entropy: 2.695011854171753
+ [2025-05-04 15:50:08,676][BaseTrainer][INFO] - [Epoch 1/10, Iter 349/3560] 2.9952139854431152, cross_entropy: 2.9952139854431152
+ [2025-05-04 15:50:11,707][BaseTrainer][INFO] - [Epoch 1/10, Iter 350/3560] 2.670276641845703, cross_entropy: 2.670276641845703
+ [2025-05-04 15:50:12,764][BaseTrainer][INFO] - [Epoch 1/10, Iter 351/3560] 2.838888168334961, cross_entropy: 2.838888168334961
+ [2025-05-04 15:50:15,545][BaseTrainer][INFO] - [Epoch 1/10, Iter 352/3560] 2.5220344066619873, cross_entropy: 2.5220344066619873
+ [2025-05-04 15:50:16,593][BaseTrainer][INFO] - [Epoch 1/10, Iter 353/3560] 2.882241725921631, cross_entropy: 2.882241725921631
+ [2025-05-04 15:50:19,281][BaseTrainer][INFO] - [Epoch 1/10, Iter 354/3560] 2.993210792541504, cross_entropy: 2.993210792541504
+ [2025-05-04 15:50:20,313][BaseTrainer][INFO] - [Epoch 1/10, Iter 355/3560] 3.0053420066833496, cross_entropy: 3.0053420066833496
+ [2025-05-04 15:50:33,914][BaseTrainer][INFO] - [Epoch 1/10, Iter 356/3560] 3.3669323921203613, cross_entropy: 3.3669323921203613
+ [2025-05-04 15:53:41,940][BaseTrainer][INFO] - [Epoch 1/10] (train) 3.984675884246826, cross_entropy: 3.984675884246826
+ [2025-05-04 15:53:41,940][BaseTrainer][INFO] - [Epoch 1/10] (validation) 2.4288501739501953, cross_entropy: 2.4288501739501953
+ [2025-05-04 15:53:41,941][BaseTrainer][INFO] - [Epoch 1/10] (metrics) roc_auc: 0.9189022779464722
+ [2025-05-04 15:53:42,436][BaseTrainer][INFO] - Save model: ./exp/20250504-153714/model/best_epoch.pth.
+ [2025-05-04 15:53:42,919][BaseTrainer][INFO] - Save model: ./exp/20250504-153714/model/epoch1.pth.
+ [2025-05-04 15:53:43,395][BaseTrainer][INFO] - Save model: ./exp/20250504-153714/model/last.pth.
+ [2025-05-04 15:53:48,172][BaseTrainer][INFO] - [Epoch 2/10, Iter 357/3560] 2.757335662841797, cross_entropy: 2.757335662841797
+ [2025-05-04 15:53:49,263][BaseTrainer][INFO] - [Epoch 2/10, Iter 358/3560] 3.5638482570648193, cross_entropy: 3.5638482570648193
+ [2025-05-04 15:53:52,078][BaseTrainer][INFO] - [Epoch 2/10, Iter 359/3560] 3.5854949951171875, cross_entropy: 3.5854949951171875
+ [2025-05-04 15:53:53,266][BaseTrainer][INFO] - [Epoch 2/10, Iter 360/3560] 2.808537006378174, cross_entropy: 2.808537006378174