ms180 committed
Commit f56040d · verified · 1 Parent(s): 11a1f0d

Upload folder using huggingface_hub

bpe_30/bpe.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d7735319a0186e2ad272576bdbeb68243e5378d065d724e4570951557a93f88f
+ size 237952
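The three added lines above are a Git LFS pointer, not the SentencePiece model itself: the pointer records the spec version, a SHA-256 object id, and the byte size of the real file. A minimal sketch of reading such a pointer follows; the function name `parse_lfs_pointer` and the returned dictionary layout are illustrative assumptions, not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the "key value" lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content copied from bpe_30/bpe.model in this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:d7735319a0186e2ad272576bdbeb68243e5378d065d724e4570951557a93f88f
size 237952
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

The same helper would apply unchanged to the `.ckpt` pointers later in this commit, since they use the same three keys.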
bpe_30/bpe.vocab ADDED
@@ -0,0 +1,30 @@
+ <unk> 0
+ <s> 0
+ </s> 0
+ EN -0
+ AR -1
+ EV -2
+ NE -3
+ TE -4
+ TY -5
+ ▁E -6
+ ▁S -7
+ E -8
+ ▁ -9
+ T -10
+ N -11
+ I -12
+ H -13
+ R -14
+ A -15
+ F -16
+ G -17
+ O -18
+ S -19
+ V -20
+ Y -21
+ C -22
+ D -23
+ L -24
+ M -25
+ W -26
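In this vocab, the eight two-character pieces (EN through ▁S) carry the highest scores, followed by the single symbols. The sketch below shows how a BPE vocabulary of this shape could segment a word: start from characters and repeatedly merge the adjacent pair with the best rank. It is a simplified re-implementation for illustration only; `MERGES`, `RANK` and `segment` are hypothetical names, the leading "▁" assumes SentencePiece's default dummy prefix, and the real library's output may differ in edge cases.

```python
# Merged pieces in the order they appear in bpe.vocab (earlier = higher score).
MERGES = ["EN", "AR", "EV", "NE", "TE", "TY", "▁E", "▁S"]
RANK = {piece: i for i, piece in enumerate(MERGES)}


def segment(text: str) -> list[str]:
    """Greedy BPE sketch: repeatedly merge the adjacent pair with the best rank."""
    # Word boundaries are marked by "▁" instead of spaces (assumed dummy prefix).
    symbols = list("▁" + text.replace(" ", "▁"))
    while True:
        # Collect every adjacent pair whose concatenation is a known merge.
        candidates = [
            (RANK[a + b], i)
            for i, (a, b) in enumerate(zip(symbols, symbols[1:]))
            if a + b in RANK
        ]
        if not candidates:
            return symbols
        # Lowest rank wins; ties cannot occur because ranks are unique per piece,
        # and the leftmost position wins among equal pieces.
        _, i = min(candidates)
        symbols[i : i + 2] = [symbols[i] + symbols[i + 1]]


print(segment("ELEVEN"))  # ['▁E', 'L', 'EV', 'EN']
print(segment("START"))   # ['▁S', 'T', 'AR', 'T']
```

Every piece produced this way is either one of the eight merges or a single symbol, so it always has an entry in the 30-item vocabulary.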
bpe_30/tokens.txt ADDED
@@ -0,0 +1,30 @@
+ <blank>
+ <unk>
+ EN
+ AR
+ EV
+ NE
+ TE
+ TY
+ ▁E
+ ▁S
+ E
+ ▁
+ T
+ N
+ I
+ H
+ R
+ A
+ F
+ G
+ O
+ S
+ V
+ Y
+ C
+ D
+ L
+ M
+ W
+ <sos/eos>
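Comparing tokens.txt with bpe.vocab suggests a simple mapping: SentencePiece's own `<s>` and `</s>` are dropped, `<blank>` is placed first, `<unk>` second, and `<sos/eos>` last, while the learned pieces keep their order. Both files have 30 entries, which is consistent with that reading. The sketch below reproduces the mapping on a shortened vocab; `vocab_to_token_list` is a hypothetical helper, and the actual espnet3 code that wrote tokens.txt is not shown in this commit.

```python
def vocab_to_token_list(vocab_lines):
    """Build an ESPnet-style token list from SentencePiece vocab lines (sketch)."""
    # Each vocab line is "<piece><whitespace><score>"; keep only the piece.
    pieces = [line.split()[0] for line in vocab_lines if line.strip()]
    # Drop SentencePiece's specials; the token list supplies its own.
    body = [p for p in pieces if p not in ("<unk>", "<s>", "</s>")]
    return ["<blank>", "<unk>", *body, "<sos/eos>"]


# A shortened excerpt of bpe.vocab, for illustration.
sample_vocab = ["<unk>\t0", "<s>\t0", "</s>\t0", "EN\t-0", "▁\t-9", "W\t-26"]
tokens = vocab_to_token_list(sample_vocab)
print(tokens)  # ['<blank>', '<unk>', 'EN', '▁', 'W', '<sos/eos>']
```

Because two specials are removed and two are added, the token list has the same length as the vocab, matching `vocab_size: 30` in the config.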
bpe_30/train.txt ADDED
@@ -0,0 +1,4 @@
+ GO
+ MARCH THIRD NINETEEN TWENTY EIGHT
+ START
+ ELEVEN SEVENTEEN FIFTY ONE
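The four lines shown for train.txt are enough to account for the vocabulary size. The sketch below counts the distinct non-space characters in those lines and checks them against the single-letter pieces listed in bpe.vocab. The budget arithmetic (3 specials + 8 merges + 1 boundary symbol + letters) is my reading of the vocab file, not something the log states.

```python
from collections import Counter

# The four lines added to bpe_30/train.txt in this commit.
train_lines = [
    "GO",
    "MARCH THIRD NINETEEN TWENTY EIGHT",
    "START",
    "ELEVEN SEVENTEEN FIFTY ONE",
]

# Count every character except the space, which SentencePiece maps to "▁".
counts = Counter(ch for line in train_lines for ch in line if ch != " ")

# Single-letter pieces as listed in bpe.vocab.
single_char_pieces = set("ETNIHRAFGOSVYCDLMW")

n_letters = len(counts)
budget = 3 + 8 + 1 + n_letters  # specials + merged pieces + "▁" + letters
print(n_letters, set(counts) == single_char_pieces, budget)
```

With `character_coverage: 1.0`, every letter that occurs in the text gets its own piece, which is why the letter set and the single-letter pieces coincide here.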
bpe_30/train_tokenizer.log ADDED
@@ -0,0 +1,291 @@
1
+ 2026-01-13 22:35:49 | INFO | espnet3 | === ESPnet3 run started: 2026-01-13T22:35:49.669775 ===
2
+ 2026-01-13 22:35:49 | INFO | espnet3 | Command: /data/user_data/msomeki/espnet3/.venv/bin/python3 run.py --stages create_dataset train_tokenizer collect_stats train infer measure --train_config conf/train.yaml --infer_config conf/infer.yaml --measure_config conf/measure.yaml
3
+ 2026-01-13 22:35:49 | INFO | espnet3 | Python: 3.11.13 (main, Aug 18 2025, 19:19:13) [Clang 20.1.4 ]
4
+ 2026-01-13 22:35:49 | INFO | espnet3 | Working directory: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr
5
+ 2026-01-13 22:35:49 | INFO | espnet3 | train config: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr/conf/train_asr_rnn_data_aug_debug.yaml
6
+ 2026-01-13 22:35:49 | INFO | espnet3 | infer config: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr/conf/infer.yaml
7
+ 2026-01-13 22:35:49 | INFO | espnet3 | measure config: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr/conf/measure.yaml
8
+ 2026-01-13 22:35:49 | INFO | espnet3 | Git: commit=8509faad9811b58d5024f29fb9d68ffb026b5e73, short_commit=8509faad9, branch=espnet3/recipe/asr_ls100, worktree=dirty
9
+ 2026-01-13 22:35:49 | INFO | espnet3 | Cluster env:
10
+ OMPI_MCA_plm_slurm_args=--external-launcher
11
+ SLURM_CLUSTER_NAME=babel
12
+ SLURM_CONF=/var/spool/slurmd/conf-cache/slurm.conf
13
+ SLURM_CPUS_ON_NODE=1
14
+ SLURM_CPUS_PER_TASK=1
15
+ SLURM_CPU_BIND=quiet,mask_cpu:0x0000000000010000
16
+ SLURM_CPU_BIND_LIST=0x0000000000010000
17
+ SLURM_CPU_BIND_TYPE=mask_cpu:
18
+ SLURM_CPU_BIND_VERBOSE=quiet
19
+ SLURM_DISTRIBUTION=cyclic,pack
20
+ SLURM_GTIDS=0
21
+ SLURM_JOBID=6122041
22
+ SLURM_JOB_ACCOUNT=swatanab
23
+ SLURM_JOB_CPUS_PER_NODE=1
24
+ SLURM_JOB_END_TIME=1768401875
25
+ SLURM_JOB_GID=2709140
26
+ SLURM_JOB_GROUP=msomeki
27
+ SLURM_JOB_ID=6122041
28
+ SLURM_JOB_NAME=bash
29
+ SLURM_JOB_NODELIST=babel-o9-16
30
+ SLURM_JOB_NUM_NODES=1
31
+ SLURM_JOB_PARTITION=debug
32
+ SLURM_JOB_QOS=debug_qos
33
+ SLURM_JOB_START_TIME=1768358675
34
+ SLURM_JOB_UID=2709140
35
+ SLURM_JOB_USER=msomeki
36
+ SLURM_LAUNCH_NODE_IPADDR=172.16.1.2
37
+ SLURM_LOCALID=0
38
+ SLURM_MEM_PER_NODE=4096
39
+ SLURM_NNODES=1
40
+ SLURM_NODEID=0
41
+ SLURM_NODELIST=babel-o9-16
42
+ SLURM_NPROCS=1
43
+ SLURM_NTASKS=1
44
+ SLURM_NTASKS_PER_NODE=1
45
+ SLURM_PRIO_PROCESS=0
46
+ SLURM_PROCID=0
47
+ SLURM_PTY_PORT=40465
48
+ SLURM_PTY_WIN_COL=112
49
+ SLURM_PTY_WIN_ROW=61
50
+ SLURM_SCRIPT_CONTEXT=prolog_task
51
+ SLURM_SRUN_COMM_HOST=172.16.1.2
52
+ SLURM_SRUN_COMM_PORT=33789
53
+ SLURM_STEPID=0
54
+ SLURM_STEP_ID=0
55
+ SLURM_STEP_LAUNCHER_PORT=33789
56
+ SLURM_STEP_NODELIST=babel-o9-16
57
+ SLURM_STEP_NUM_NODES=1
58
+ SLURM_STEP_NUM_TASKS=1
59
+ SLURM_STEP_TASKS_PER_NODE=1
60
+ SLURM_SUBMIT_DIR=/home/msomeki/00_systems/espnet3
61
+ SLURM_SUBMIT_HOST=login1
62
+ SLURM_TASKS_PER_NODE=1
63
+ SLURM_TASK_PID=3334910
64
+ SLURM_TOPOLOGY_ADDR=babel-o9-16
65
+ SLURM_TOPOLOGY_ADDR_PATTERN=node
66
+ SLURM_TRES_PER_TASK=cpu=1
67
+ SLURM_UMASK=0027
68
+ 2026-01-13 22:35:49 | INFO | espnet3 | Runtime env:
69
+ LD_LIBRARY_PATH=/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:
70
+ PATH=/home/msomeki/00_systems/espnet3/tools/ffmpeg-release:/home/msomeki/00_systems/espnet3/tools/festival/bin:/home/msomeki/00_systems/espnet3/tools/MBROLA/Bin:/home/msomeki/00_systems/espnet3/tools/espeak-ng/bin:/home/msomeki/00_systems/espnet3/tools/BeamformIt:/home/msomeki/00_systems/espnet3/tools/kenlm/build/bin:/home/msomeki/00_systems/espnet3/tools/PESQ/P862_annex_A_2005_CD/source:/home/msomeki/00_systems/espnet3/tools/nkf/nkf-2.1.4:/home/msomeki/00_systems/espnet3/tools/moses/scripts/tokenizer:/home/msomeki/00_systems/espnet3/tools/moses/scripts/generic:/home/msomeki/00_systems/espnet3/tools/tools/moses/scripts/recaser:/home/msomeki/00_systems/espnet3/tools/moses/scripts/training:/home/msomeki/00_systems/espnet3/tools/mwerSegmenter:/home/msomeki/00_systems/espnet3/tools/sctk/bin:/home/msomeki/00_systems/espnet3/tools/sph2pipe:/home/msomeki/00_systems/espnet3/tools/sentencepiece_commands:/data/user_data/msomeki/espnet3/.venv/bin:/home/msomeki/.pixi/bin:/home/msomeki/local/bin:/home/msomeki/utils:/usr/share/Modules/bin:/home/msomeki/00_systems/espnet3/tools/ffmpeg-release:/home/msomeki/00_systems/espnet3/tools/festival/bin:/home/msomeki/00_systems/espnet3/tools/MBROLA/Bin:/home/msomeki/00_systems/espnet3/tools/espeak-ng/bin:/home/msomeki/00_systems/espnet3/tools/BeamformIt:/home/msomeki/00_systems/espnet3/tools/kenlm/build/bin:/home/msomeki/00_systems/espnet3/tools/PESQ/P862_annex_A_2005_CD/source:/home/msomeki/00_systems/espnet3/tools/nkf/nkf-2.1.4:/home/msomeki/00_systems/espnet3/tools/moses/scripts/tokenizer:/home/msomeki/00_systems/espnet3/tools/moses/scripts/generic:/home/msomeki/00_systems/espnet3/tools/tools/moses/scripts/recaser:/home/msomeki/00_systems/espnet3/tools/moses/scripts/training:/home/msomeki/00_systems/espnet3/tools/mwerSegmenter:/home/msomeki/00_systems/espnet3/tools/sctk/bin:/home/msomeki/00_systems/espnet3/tools/sph2pipe:/home/msomeki/00_systems/espnet3/tools/sentencepiece_commands:/home/msomeki/00_systems/espnet3/tools/ffmpeg-release:
/home/msomeki/00_systems/espnet3/tools/festival/bin:/home/msomeki/00_systems/espnet3/tools/MBROLA/Bin:/home/msomeki/00_systems/espnet3/tools/espeak-ng/bin:/home/msomeki/00_systems/espnet3/tools/BeamformIt:/home/msomeki/00_systems/espnet3/tools/kenlm/build/bin:/home/msomeki/00_systems/espnet3/tools/PESQ/P862_annex_A_2005_CD/source:/home/msomeki/00_systems/espnet3/tools/nkf/nkf-2.1.4:/home/msomeki/00_systems/espnet3/tools/moses/scripts/tokenizer:/home/msomeki/00_systems/espnet3/tools/moses/scripts/generic:/home/msomeki/00_systems/espnet3/tools/tools/moses/scripts/recaser:/home/msomeki/00_systems/espnet3/tools/moses/scripts/training:/home/msomeki/00_systems/espnet3/tools/mwerSegmenter:/home/msomeki/00_systems/espnet3/tools/sctk/bin:/home/msomeki/00_systems/espnet3/tools/sph2pipe:/home/msomeki/00_systems/espnet3/tools/sentencepiece_commands:/home/msomeki/.pixi/bin:/home/msomeki/local/bin:/home/msomeki/utils:/home/msomeki/.local/bin:/home/msomeki/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
71
+ PYTHONPATH=/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:../../../:../../TEMPLATE/asr:/home/msomeki/00_systems/espnet3/egs3/mini_an4/asr:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:../../../:../../TEMPLATE/asr:/home/msomeki/00_systems/espnet3/egs3/mini_an4/asr:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:../../../:../../TEMPLATE/asr:/home/msomeki/00_systems/espnet3/egs3/mini_an4/asr:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:
72
+ 2026-01-13 22:35:49 | INFO | espnet3 | Train config content:
73
+ num_device: 1
74
+ num_nodes: 1
75
+ task: espnet3.systems.asr.task.ASRTask
76
+ recipe_dir: .
77
+ data_dir: ./data
78
+ exp_tag: train_asr_rnn_data_aug_debug
79
+ exp_dir: ./exp/train_asr_rnn_data_aug_debug
80
+ stats_dir: ./exp/stats
81
+ decode_dir: ./exp/train_asr_rnn_data_aug_debug/decode
82
+ dataset_dir: ./data/mini_an4
83
+ create_dataset:
84
+ func: src.create_dataset.create_dataset
85
+ dataset_dir: ./data/mini_an4
86
+ archive_path: ./../../egs2/mini_an4/asr1/downloads.tar.gz
87
+ dataset:
88
+ _target_: espnet3.components.data.data_organizer.DataOrganizer
89
+ train:
90
+ - name: train_nodev
91
+ dataset:
92
+ _target_: src.dataset.MiniAN4Dataset
93
+ manifest_path: ./data/mini_an4/manifest/train_nodev.tsv
94
+ valid:
95
+ - name: train_dev
96
+ dataset:
97
+ _target_: src.dataset.MiniAN4Dataset
98
+ manifest_path: ./data/mini_an4/manifest/train_dev.tsv
99
+ preprocessor:
100
+ _target_: espnet2.train.preprocessor.CommonPreprocessor
101
+ _convert_: all
102
+ fs: 16000
103
+ train: true
104
+ data_aug_effects:
105
+ - - 0.1
106
+ - contrast
107
+ - enhancement_amount: 75.0
108
+ - - 0.1
109
+ - highpass
110
+ - cutoff_freq: 5000
111
+ Q: 0.707
112
+ - - 0.1
113
+ - equalization
114
+ - center_freq: 1000
115
+ gain: 0
116
+ Q: 0.707
117
+ - - 0.1
118
+ - - - 0.3
119
+ - speed_perturb
120
+ - factor: 0.9
121
+ - - 0.3
122
+ - speed_perturb
123
+ - factor: 1.1
124
+ - - 0.3
125
+ - speed_perturb
126
+ - factor: 1.3
127
+ data_aug_num:
128
+ - 1
129
+ - 4
130
+ data_aug_prob: 1.0
131
+ token_type: bpe
132
+ token_list: ./data/bpe_30/tokens.txt
133
+ bpemodel: ./data/bpe_30/bpe.model
134
+ parallel:
135
+ env: local
136
+ n_workers: 1
137
+ dataloader:
138
+ collate_fn:
139
+ _target_: espnet2.train.collate_fn.CommonCollateFn
140
+ int_pad_value: -1
141
+ train:
142
+ multiple_iterator: false
143
+ num_shards: 1
144
+ iter_factory:
145
+ _target_: espnet2.iterators.sequence_iter_factory.SequenceIterFactory
146
+ shuffle: true
147
+ collate_fn:
148
+ _target_: espnet2.train.collate_fn.CommonCollateFn
149
+ int_pad_value: -1
150
+ num_workers: 0
151
+ batches:
152
+ type: sorted
153
+ shape_files:
154
+ - ./exp/stats/train/feats_shape
155
+ batch_size: 2
156
+ batch_bins: 200000
157
+ valid:
158
+ multiple_iterator: false
159
+ num_shards: 1
160
+ iter_factory:
161
+ _target_: espnet2.iterators.sequence_iter_factory.SequenceIterFactory
162
+ shuffle: false
163
+ collate_fn:
164
+ _target_: espnet2.train.collate_fn.CommonCollateFn
165
+ int_pad_value: -1
166
+ batches:
167
+ type: sorted
168
+ shape_files:
169
+ - ./exp/stats/valid/feats_shape
170
+ batch_size: 2
171
+ batch_bins: 200000
172
+ optim:
173
+ _target_: torch.optim.Adam
174
+ lr: 0.001
175
+ weight_decay: 0.0
176
+ scheduler:
177
+ _target_: torch.optim.lr_scheduler.ReduceLROnPlateau
178
+ mode: min
179
+ factor: 0.5
180
+ patience: 1
181
+ val_scheduler_criterion: valid/loss
182
+ best_model_criterion:
183
+ - - valid/acc
184
+ - 1
185
+ - max
186
+ trainer:
187
+ accelerator: auto
188
+ devices: 1
189
+ num_nodes: 1
190
+ accumulate_grad_batches: 1
191
+ check_val_every_n_epoch: 1
192
+ gradient_clip_val: 1.0
193
+ log_every_n_steps: 1
194
+ max_epochs: 1
195
+ limit_train_batches: 1
196
+ limit_val_batches: 1
197
+ precision: 32
198
+ logger:
199
+ - _target_: lightning.pytorch.loggers.TensorBoardLogger
200
+ save_dir: ./exp/train_asr_rnn_data_aug_debug/tensorboard
201
+ name: tb_logger
202
+ strategy: auto
203
+ tokenizer:
204
+ vocab_size: 30
205
+ character_coverage: 1.0
206
+ model_type: bpe
207
+ save_path: ./data/bpe_30
208
+ text_builder:
209
+ func: src.tokenizer.gather_training_text
210
+ manifest_path: ./data/mini_an4/manifest/train_nodev.tsv
211
+ model:
212
+ vocab_size: 30
213
+ token_list: ./data/bpe_30/tokens.txt
214
+ encoder: vgg_rnn
215
+ encoder_conf:
216
+ num_layers: 1
217
+ hidden_size: 2
218
+ output_size: 2
219
+ decoder: rnn
220
+ decoder_conf:
221
+ hidden_size: 2
222
+ normalize: utterance_mvn
223
+ normalize_conf: {}
224
+ model_conf:
225
+ ctc_weight: 0.3
226
+ lsm_weight: 0.1
227
+ length_normalized_loss: false
228
+ frontend: default
229
+ frontend_conf:
230
+ n_fft: 512
231
+ win_length: 400
232
+ hop_length: 160
233
+
234
+ 2026-01-13 22:35:50 | INFO | espnet3 | Infer config content:
235
+ num_device: 1
236
+ num_nodes: 1
237
+ recipe_dir: .
238
+ data_dir: ./data
239
+ exp_tag: train_asr_rnn_data_aug_debug
240
+ exp_dir: ./exp/train_asr_rnn_data_aug_debug
241
+ stats_dir: ./exp/stats
242
+ decode_dir: ./exp/train_asr_rnn_data_aug_debug/decode
243
+ dataset_dir: ./data/mini_an4
244
+ dataset:
245
+ _target_: espnet3.components.data.data_organizer.DataOrganizer
246
+ test:
247
+ - name: test
248
+ dataset:
249
+ _target_: src.dataset.MiniAN4Dataset
250
+ manifest_path: ./data/mini_an4/manifest/test.tsv
251
+ parallel:
252
+ env: local
253
+ n_workers: 1
254
+ model:
255
+ _target_: espnet2.bin.asr_inference.Speech2Text
256
+ asr_train_config: ./exp/train_asr_rnn_data_aug_debug/config.yaml
257
+ asr_model_file: ./exp/train_asr_rnn_data_aug_debug/last.ckpt
258
+ beam_size: 1
259
+ ctc_weight: 0.3
260
+ tokenizer:
261
+ vocab_size: 30
262
+ character_coverage: 1.0
263
+ model_type: bpe
264
+ save_path: ./data/bpe_30
265
+
266
+ 2026-01-13 22:35:50 | INFO | espnet3 | Measure config content:
267
+ recipe_dir: .
268
+ data_dir: ./data
269
+ exp_tag: train_asr_rnn_data_aug_debug
270
+ exp_dir: ./exp/train_asr_rnn_data_aug_debug
271
+ stats_dir: ./exp/stats
272
+ decode_dir: ./exp/train_asr_rnn_data_aug_debug/decode
273
+ dataset_dir: ./data/mini_an4
274
+ dataset:
275
+ _target_: espnet3.components.data.data_organizer.DataOrganizer
276
+ test:
277
+ - name: test
278
+ dataset:
279
+ _target_: src.dataset.MiniAN4Dataset
280
+ manifest_path: ./data/mini_an4/manifest/test.tsv
281
+ metrics:
282
+ - metric:
283
+ _target_: espnet3.systems.asr.metrics.wer.WER
284
+ clean_types: null
285
+ - metric:
286
+ _target_: espnet3.systems.asr.metrics.cer.CER
287
+ clean_types: null
288
+
289
+ 2026-01-13 22:35:50 | INFO | espnet3 | === [START] stage: train_tokenizer ===
290
+ 2026-01-13 22:35:50 | INFO | espnet3.systems.asr.system | Tokenizer already exists. Skipping train_tokenizer().
291
+ 2026-01-13 22:35:50 | INFO | espnet3 | === [DONE] stage: train_tokenizer (0.00s) ===
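The log lines above follow a `timestamp | LEVEL | logger | message` layout, and each stage is bracketed by `[START]` and `[DONE]` markers, with the elapsed time in the `[DONE]` line. A small sketch of extracting per-stage durations from such lines follows. The regular expressions are inferred from the lines in this log only, and `stage_durations` is a hypothetical helper, not an espnet3 function.

```python
import re

# "2026-01-13 22:35:50 | INFO | espnet3 | message" (diff "+ " prefix removed).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \| (?P<level>\w+) \| "
    r"(?P<logger>[\w.]+) \| (?P<msg>.*)$"
)
# "=== [DONE] stage: train_tokenizer (0.00s) ==="
DONE_RE = re.compile(r"=== \[DONE\] stage: (?P<stage>\w+) \((?P<secs>[\d.]+)s\) ===")


def stage_durations(lines):
    """Map stage name to elapsed seconds, using the [DONE] marker lines."""
    durations = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # continuation lines (env vars, YAML dumps) are skipped
        done = DONE_RE.search(m["msg"])
        if done:
            durations[done["stage"]] = float(done["secs"])
    return durations


log = [
    "2026-01-13 22:35:50 | INFO | espnet3 | === [START] stage: train_tokenizer ===",
    "2026-01-13 22:35:50 | INFO | espnet3.systems.asr.system | "
    "Tokenizer already exists. Skipping train_tokenizer().",
    "2026-01-13 22:35:50 | INFO | espnet3 | === [DONE] stage: train_tokenizer (0.00s) ===",
]
print(stage_durations(log))  # {'train_tokenizer': 0.0}
```

The 0.00 s duration matches the "Tokenizer already exists" message: the stage was skipped rather than re-run.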
exp/config.yaml ADDED
@@ -0,0 +1,322 @@
1
+ _convert_: all
2
+ accum_grad: 1
3
+ adapter: lora
4
+ adapter_conf: {}
5
+ allow_multi_rates: false
6
+ allow_variable_data_keys: false
7
+ aux_ctc_tasks: []
8
+ batch_bins: 1000000
9
+ batch_size: 20
10
+ batch_type: folded
11
+ best_model_criterion:
12
+ - - valid/acc
13
+ - 1
14
+ - max
15
+ bpemodel: ./data/bpe_30/bpe.model
16
+ category_sample_size: 10
17
+ category_upsampling_factor: 0.5
18
+ chunk_default_fs: null
19
+ chunk_discard_short_samples: true
20
+ chunk_excluded_key_prefixes: []
21
+ chunk_length: 500
22
+ chunk_max_abs_length: null
23
+ chunk_shift_ratio: 0.5
24
+ cleaner: null
25
+ collect_stats: false
26
+ create_dataset:
27
+ archive_path: ./../../egs2/mini_an4/asr1/downloads.tar.gz
28
+ dataset_dir: ./data/mini_an4
29
+ func: src.create_dataset.create_dataset
30
+ create_graph_in_tensorboard: false
31
+ ctc_conf:
32
+ brctc_group_strategy: end
33
+ brctc_risk_factor: 0.0
34
+ brctc_risk_strategy: exp
35
+ ctc_type: builtin
36
+ dropout_rate: 0.0
37
+ ignore_nan_grad: null
38
+ reduce: true
39
+ zero_infinity: true
40
+ cudnn_benchmark: false
41
+ cudnn_deterministic: true
42
+ cudnn_enabled: true
43
+ data_aug_effects:
44
+ - - 0.1
45
+ - contrast
46
+ - enhancement_amount: 75.0
47
+ - - 0.1
48
+ - highpass
49
+ - Q: 0.707
50
+ cutoff_freq: 5000
51
+ - - 0.1
52
+ - equalization
53
+ - Q: 0.707
54
+ center_freq: 1000
55
+ gain: 0
56
+ - - 0.1
57
+ - - - 0.3
58
+ - speed_perturb
59
+ - factor: 0.9
60
+ - - 0.3
61
+ - speed_perturb
62
+ - factor: 1.1
63
+ - - 0.3
64
+ - speed_perturb
65
+ - factor: 1.3
66
+ data_aug_num:
67
+ - 1
68
+ - 4
69
+ data_aug_prob: 1.0
70
+ data_dir: ./data
71
+ dataloader:
72
+ collate_fn:
73
+ _target_: espnet2.train.collate_fn.CommonCollateFn
74
+ int_pad_value: -1
75
+ train:
76
+ iter_factory:
77
+ _target_: espnet2.iterators.sequence_iter_factory.SequenceIterFactory
78
+ batches:
79
+ batch_bins: 200000
80
+ batch_size: 2
81
+ shape_files:
82
+ - ./exp/stats/train/feats_shape
83
+ type: sorted
84
+ collate_fn:
85
+ _target_: espnet2.train.collate_fn.CommonCollateFn
86
+ int_pad_value: -1
87
+ num_workers: 0
88
+ shuffle: true
89
+ multiple_iterator: false
90
+ num_shards: 1
91
+ valid:
92
+ iter_factory:
93
+ _target_: espnet2.iterators.sequence_iter_factory.SequenceIterFactory
94
+ batches:
95
+ batch_bins: 200000
96
+ batch_size: 2
97
+ shape_files:
98
+ - ./exp/stats/valid/feats_shape
99
+ type: sorted
100
+ collate_fn:
101
+ _target_: espnet2.train.collate_fn.CommonCollateFn
102
+ int_pad_value: -1
103
+ shuffle: false
104
+ multiple_iterator: false
105
+ num_shards: 1
106
+ dataset:
107
+ _target_: espnet3.components.data.data_organizer.DataOrganizer
108
+ train:
109
+ - dataset:
110
+ _target_: src.dataset.MiniAN4Dataset
111
+ manifest_path: ./data/mini_an4/manifest/train_nodev.tsv
112
+ name: train_nodev
113
+ valid:
114
+ - dataset:
115
+ _target_: src.dataset.MiniAN4Dataset
116
+ manifest_path: ./data/mini_an4/manifest/train_dev.tsv
117
+ name: train_dev
118
+ dataset_dir: ./data/mini_an4
119
+ dataset_scaling_factor: 1.2
120
+ dataset_upsampling_factor: 0.5
121
+ ddp_comm_hook: null
122
+ decode_dir: ./exp/train_asr_rnn_data_aug_debug/decode
123
+ decoder: rnn
124
+ decoder_conf:
125
+ hidden_size: 2
126
+ deepspeed_config: null
127
+ detect_anomaly: false
128
+ dist_backend: nccl
129
+ dist_init_method: env://
130
+ dist_launcher: null
131
+ dist_master_addr: null
132
+ dist_master_port: null
133
+ dist_rank: null
134
+ dist_world_size: null
135
+ drop_last_iter: false
136
+ dry_run: false
137
+ early_stopping_criterion:
138
+ - valid
139
+ - loss
140
+ - min
141
+ encoder: vgg_rnn
142
+ encoder_conf:
143
+ hidden_size: 2
144
+ num_layers: 1
145
+ output_size: 2
146
+ exclude_weight_decay: false
147
+ exclude_weight_decay_conf: {}
148
+ exp_dir: ./exp/train_asr_rnn_data_aug_debug
149
+ exp_tag: train_asr_rnn_data_aug_debug
150
+ fold_length: []
151
+ freeze_param: []
152
+ frontend: default
153
+ frontend_conf:
154
+ hop_length: 160
155
+ n_fft: 512
156
+ win_length: 400
157
+ fs: 16000
158
+ g2p: null
159
+ grad_clip: 5.0
160
+ grad_clip_type: 2.0
161
+ grad_noise: false
162
+ gradient_as_bucket_view: true
163
+ ignore_init_mismatch: false
164
+ init: null
165
+ init_param: []
166
+ input_size: null
167
+ iterator_type: sequence
168
+ joint_net_conf: {}
169
+ keep_nbest_models:
170
+ - 10
171
+ local_rank: null
172
+ log_interval: null
173
+ log_level: INFO
174
+ max_batch_size: null
175
+ max_cache_fd: 32
176
+ max_cache_size: 0.0
177
+ max_epoch: 40
178
+ min_batch_size: 1
179
+ model: espnet
180
+ model_conf:
181
+ ctc_weight: 0.3
182
+ length_normalized_loss: false
183
+ lsm_weight: 0.1
184
+ multi_task_dataset: false
185
+ multiple_iterator: false
186
+ multiprocessing_distributed: false
187
+ nbest_averaging_interval: 0
188
+ no_forward_run: false
189
+ noise_apply_prob: 1.0
190
+ noise_db_range: '13_15'
191
+ noise_scp: null
192
+ non_linguistic_symbols: null
193
+ normalize: utterance_mvn
194
+ normalize_conf:
195
+ eps: 1.0e-20
196
+ norm_means: true
197
+ norm_vars: false
198
+ num_att_plot: 3
199
+ num_cache_chunks: 1024
200
+ num_device: 1
201
+ num_iters_per_epoch: null
202
+ num_nodes: 1
203
+ num_workers: 1
204
+ optim:
205
+ _target_: torch.optim.Adam
206
+ lr: 0.001
207
+ weight_decay: 0.0
208
+ optim_conf:
209
+ capturable: false
210
+ differentiable: false
211
+ eps: 1.0e-06
212
+ foreach: null
213
+ lr: 1.0
214
+ maximize: false
215
+ rho: 0.9
216
+ weight_decay: 0
217
+ output_dir: null
218
+ parallel:
219
+ env: local
220
+ n_workers: 1
221
+ options: {}
222
+ patience: null
223
+ postencoder: null
224
+ postencoder_conf: {}
225
+ preencoder: null
226
+ preencoder_conf: {}
227
+ preprocessor: default
228
+ preprocessor_conf:
229
+ audio_pad_value: 0.0
230
+ data_aug_effects: null
231
+ data_aug_num:
232
+ - 1
233
+ - 1
234
+ data_aug_prob: 0.0
235
+ delimiter: null
236
+ force_single_channel: false
237
+ fs: 0
238
+ min_sample_size: -1
239
+ nonsplit_symbol: null
240
+ space_symbol: <space>
241
+ speech_name: speech
242
+ text_name: text
243
+ unk_symbol: <unk>
244
+ whisper_language: null
245
+ whisper_task: null
246
+ pretrain_path: null
247
+ recipe_dir: .
248
+ resume: false
249
+ rir_apply_prob: 1.0
250
+ rir_scp: null
251
+ save_strategy: all
252
+ scheduler:
253
+ _target_: torch.optim.lr_scheduler.ReduceLROnPlateau
254
+ factor: 0.5
255
+ mode: min
256
+ patience: 1
257
+ scheduler_conf: {}
258
+ seed: 0
259
+ sharded_ddp: false
260
+ short_noise_thres: 0.5
261
+ shuffle_within_batch: false
262
+ sort_batch: descending
263
+ sort_in_batch: descending
264
+ specaug: null
265
+ specaug_conf: {}
266
+ speech_volume_normalize: null
267
+ stats_dir: ./exp/stats
268
+ task: espnet3.systems.asr.task.ASRTask
269
+ token_list: ./data/bpe_30/tokens.txt
270
+ token_type: bpe
271
+ tokenizer:
272
+ character_coverage: 1.0
273
+ model_type: bpe
274
+ save_path: ./data/bpe_30
275
+ text_builder:
276
+ func: src.tokenizer.gather_training_text
277
+ manifest_path: ./data/mini_an4/manifest/train_nodev.tsv
278
+ vocab_size: 30
279
+ train: true
280
+ train_data_path_and_name_and_type: []
281
+ train_dtype: float32
282
+ train_shape_file: []
283
+ trainer:
284
+ accumulate_grad_batches: 1
285
+ check_val_every_n_epoch: 1
286
+ devices: 1
287
+ gradient_clip_val: 1.0
288
+ limit_train_batches: 1
289
+ limit_val_batches: 1
290
+ log_every_n_steps: 1
291
+ max_epochs: 1
292
+ num_nodes: 1
293
+ precision: 32
294
+ reload_dataloaders_every_n_epochs: 1
295
+ use_distributed_sampler: false
296
+ unused_parameters: false
297
+ upsampling_factor: 0.5
298
+ use_adapter: false
299
+ use_amp: false
300
+ use_deepspeed: false
301
+ use_lang_prompt: false
302
+ use_matplotlib: true
303
+ use_nlp_prompt: false
304
+ use_preprocessor: true
305
+ use_tensorboard: true
306
+ use_tf32: false
307
+ use_wandb: false
308
+ val_scheduler_criterion: valid/loss
309
+ valid_batch_bins: null
310
+ valid_batch_size: null
311
+ valid_batch_type: null
312
+ valid_data_path_and_name_and_type: []
313
+ valid_iterator_type: null
314
+ valid_max_cache_size: null
315
+ valid_shape_file: []
316
+ vocab_size: 30
317
+ wandb_entity: null
318
+ wandb_id: null
319
+ wandb_model_log_interval: -1
320
+ wandb_name: null
321
+ wandb_project: null
322
+ write_collected_feats: false
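The `data_aug_effects` entry in this config is a nested list whose indentation is lost in the diff view. The sketch below restates it as a Python literal and normalizes the leading numbers. Treating those numbers as relative sampling weights, and the nested list as a group of mutually exclusive alternatives, is an assumption based on my reading of ESPnet2's augmentation documentation. The helper `normalized` is hypothetical and does not reproduce ESPnet's sampling code.

```python
# data_aug_effects from exp/config.yaml, written as [weight, effect, options].
data_aug_effects = [
    [0.1, "contrast", {"enhancement_amount": 75.0}],
    [0.1, "highpass", {"cutoff_freq": 5000, "Q": 0.707}],
    [0.1, "equalization", {"center_freq": 1000, "gain": 0, "Q": 0.707}],
    # Fourth entry: a weight plus a list of alternatives (assumed exclusive).
    [0.1, [
        [0.3, "speed_perturb", {"factor": 0.9}],
        [0.3, "speed_perturb", {"factor": 1.1}],
        [0.3, "speed_perturb", {"factor": 1.3}],
    ]],
]


def normalized(entries):
    """Turn the leading weights of a list of entries into probabilities."""
    total = sum(entry[0] for entry in entries)
    return [entry[0] / total for entry in entries]


top_level = normalized(data_aug_effects)    # four entries, equal weights
speed = normalized(data_aug_effects[3][1])  # three speed factors, equal weights
print(top_level, speed)
```

Under that assumption each top-level entry is equally likely to be picked, and within the speed-perturbation group each factor is equally likely. The same file also lists `data_aug_num: [1, 4]` and `data_aug_prob: 1.0` at the top level, while `preprocessor_conf` holds `data_aug_effects: null` and `data_aug_prob: 0.0`. The diff alone does not show which of the two is used at training time.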
exp/epoch0_step1_valid.acc.ckpt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:67070c9ed4d0f5927ce820f5487c844ca03a1695956a1253b50e3ee4e6867aca
+ size 1325498
exp/last.ckpt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:799006a04b83914bd19599b0f8ab43d5f43244f75057f92e6ceefc8de5d82560
+ size 3813993
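Since both checkpoints are stored as LFS pointers, a downloaded file can be checked against the `oid` and `size` recorded above. The sketch below does this with the standard library only. `matches_pointer` is a hypothetical helper, and the demonstration uses a temporary file with a locally computed digest, because the real checkpoints are not available here.

```python
import hashlib
import os
import tempfile


def matches_pointer(path, oid_sha256, size, chunk=1 << 20):
    """Return True if the file at `path` has the given byte size and SHA-256."""
    if os.path.getsize(path) != size:
        return False  # cheap check first, avoids hashing a truncated download
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so large checkpoints do not need to fit in memory.
        for block in iter(lambda: f.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest() == oid_sha256


# Self-contained demonstration on a throwaway file.
payload = b"demo payload"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
demo_path = tmp.name
demo_oid = hashlib.sha256(payload).hexdigest()

ok = matches_pointer(demo_path, demo_oid, len(payload))
wrong_size = matches_pointer(demo_path, demo_oid, len(payload) + 1)
os.remove(demo_path)
print(ok, wrong_size)
```

For the file in this commit the call would look like `matches_pointer("exp/last.ckpt", "799006a04b83914bd19599b0f8ab43d5f43244f75057f92e6ceefc8de5d82560", 3813993)`, assuming the checkpoint has been fetched to that local path.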
exp/pack_model.log ADDED
@@ -0,0 +1,245 @@
1
+ 2026-01-13 22:48:14 | INFO | espnet3 | === ESPnet3 run started: 2026-01-13T22:48:14.851894 ===
2
+ 2026-01-13 22:48:14 | INFO | espnet3 | Command: /data/user_data/msomeki/espnet3/.venv/bin/python run.py --stage pack_model upload_model --publish_config conf/publish.yaml --train_config conf/train.yaml
3
+ 2026-01-13 22:48:14 | INFO | espnet3 | Python: 3.11.13 (main, Aug 18 2025, 19:19:13) [Clang 20.1.4 ]
4
+ 2026-01-13 22:48:14 | INFO | espnet3 | Working directory: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr
5
+ 2026-01-13 22:48:14 | INFO | espnet3 | train config: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr/conf/train_asr_rnn_data_aug_debug.yaml
6
+ 2026-01-13 22:48:14 | INFO | espnet3 | publish config: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr/conf/publish.yaml
7
+ 2026-01-13 22:48:15 | INFO | espnet3 | Git: commit=8509faad9811b58d5024f29fb9d68ffb026b5e73, short_commit=8509faad9, branch=espnet3/recipe/asr_ls100, worktree=dirty
8
+ 2026-01-13 22:48:15 | INFO | espnet3 | Cluster env:
9
+ OMPI_MCA_plm_slurm_args=--external-launcher
10
+ SLURM_CLUSTER_NAME=babel
11
+ SLURM_CONF=/var/spool/slurmd/conf-cache/slurm.conf
12
+ SLURM_CPUS_ON_NODE=1
13
+ SLURM_CPUS_PER_TASK=1
14
+ SLURM_CPU_BIND=quiet,mask_cpu:0x0000000000010000
15
+ SLURM_CPU_BIND_LIST=0x0000000000010000
16
+ SLURM_CPU_BIND_TYPE=mask_cpu:
17
+ SLURM_CPU_BIND_VERBOSE=quiet
18
+ SLURM_DISTRIBUTION=cyclic,pack
19
+ SLURM_GTIDS=0
20
+ SLURM_JOBID=6122041
21
+ SLURM_JOB_ACCOUNT=swatanab
22
+ SLURM_JOB_CPUS_PER_NODE=1
23
+ SLURM_JOB_END_TIME=1768401875
24
+ SLURM_JOB_GID=2709140
25
+ SLURM_JOB_GROUP=msomeki
26
+ SLURM_JOB_ID=6122041
27
+ SLURM_JOB_NAME=bash
28
+ SLURM_JOB_NODELIST=babel-o9-16
29
+ SLURM_JOB_NUM_NODES=1
30
+ SLURM_JOB_PARTITION=debug
31
+ SLURM_JOB_QOS=debug_qos
32
+ SLURM_JOB_START_TIME=1768358675
33
+ SLURM_JOB_UID=2709140
34
+ SLURM_JOB_USER=msomeki
35
+ SLURM_LAUNCH_NODE_IPADDR=172.16.1.2
36
+ SLURM_LOCALID=0
37
+ SLURM_MEM_PER_NODE=4096
38
+ SLURM_NNODES=1
39
+ SLURM_NODEID=0
40
+ SLURM_NODELIST=babel-o9-16
41
+ SLURM_NPROCS=1
42
+ SLURM_NTASKS=1
43
+ SLURM_NTASKS_PER_NODE=1
44
+ SLURM_PRIO_PROCESS=0
45
+ SLURM_PROCID=0
46
+ SLURM_PTY_PORT=40465
47
+ SLURM_PTY_WIN_COL=112
48
+ SLURM_PTY_WIN_ROW=61
49
+ SLURM_SCRIPT_CONTEXT=prolog_task
50
+ SLURM_SRUN_COMM_HOST=172.16.1.2
51
+ SLURM_SRUN_COMM_PORT=33789
52
+ SLURM_STEPID=0
53
+ SLURM_STEP_ID=0
54
+ SLURM_STEP_LAUNCHER_PORT=33789
55
+ SLURM_STEP_NODELIST=babel-o9-16
56
+ SLURM_STEP_NUM_NODES=1
57
+ SLURM_STEP_NUM_TASKS=1
58
+ SLURM_STEP_TASKS_PER_NODE=1
59
+ SLURM_SUBMIT_DIR=/home/msomeki/00_systems/espnet3
60
+ SLURM_SUBMIT_HOST=login1
61
+ SLURM_TASKS_PER_NODE=1
62
+ SLURM_TASK_PID=3334910
63
+ SLURM_TOPOLOGY_ADDR=babel-o9-16
64
+ SLURM_TOPOLOGY_ADDR_PATTERN=node
65
+ SLURM_TRES_PER_TASK=cpu=1
66
+ SLURM_UMASK=0027
67
+ 2026-01-13 22:48:15 | INFO | espnet3 | Runtime env:
68
+ LD_LIBRARY_PATH=/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:
69
+ PATH=/home/msomeki/00_systems/espnet3/tools/ffmpeg-release:/home/msomeki/00_systems/espnet3/tools/festival/bin:/home/msomeki/00_systems/espnet3/tools/MBROLA/Bin:/home/msomeki/00_systems/espnet3/tools/espeak-ng/bin:/home/msomeki/00_systems/espnet3/tools/BeamformIt:/home/msomeki/00_systems/espnet3/tools/kenlm/build/bin:/home/msomeki/00_systems/espnet3/tools/PESQ/P862_annex_A_2005_CD/source:/home/msomeki/00_systems/espnet3/tools/nkf/nkf-2.1.4:/home/msomeki/00_systems/espnet3/tools/moses/scripts/tokenizer:/home/msomeki/00_systems/espnet3/tools/moses/scripts/generic:/home/msomeki/00_systems/espnet3/tools/tools/moses/scripts/recaser:/home/msomeki/00_systems/espnet3/tools/moses/scripts/training:/home/msomeki/00_systems/espnet3/tools/mwerSegmenter:/home/msomeki/00_systems/espnet3/tools/sctk/bin:/home/msomeki/00_systems/espnet3/tools/sph2pipe:/home/msomeki/00_systems/espnet3/tools/sentencepiece_commands:/data/user_data/msomeki/espnet3/.venv/bin:/home/msomeki/.pixi/bin:/home/msomeki/local/bin:/home/msomeki/utils:/usr/share/Modules/bin:/home/msomeki/00_systems/espnet3/tools/ffmpeg-release:/home/msomeki/00_systems/espnet3/tools/festival/bin:/home/msomeki/00_systems/espnet3/tools/MBROLA/Bin:/home/msomeki/00_systems/espnet3/tools/espeak-ng/bin:/home/msomeki/00_systems/espnet3/tools/BeamformIt:/home/msomeki/00_systems/espnet3/tools/kenlm/build/bin:/home/msomeki/00_systems/espnet3/tools/PESQ/P862_annex_A_2005_CD/source:/home/msomeki/00_systems/espnet3/tools/nkf/nkf-2.1.4:/home/msomeki/00_systems/espnet3/tools/moses/scripts/tokenizer:/home/msomeki/00_systems/espnet3/tools/moses/scripts/generic:/home/msomeki/00_systems/espnet3/tools/tools/moses/scripts/recaser:/home/msomeki/00_systems/espnet3/tools/moses/scripts/training:/home/msomeki/00_systems/espnet3/tools/mwerSegmenter:/home/msomeki/00_systems/espnet3/tools/sctk/bin:/home/msomeki/00_systems/espnet3/tools/sph2pipe:/home/msomeki/00_systems/espnet3/tools/sentencepiece_commands:/home/msomeki/00_systems/espnet3/tools/ffmpeg-release:
/home/msomeki/00_systems/espnet3/tools/festival/bin:/home/msomeki/00_systems/espnet3/tools/MBROLA/Bin:/home/msomeki/00_systems/espnet3/tools/espeak-ng/bin:/home/msomeki/00_systems/espnet3/tools/BeamformIt:/home/msomeki/00_systems/espnet3/tools/kenlm/build/bin:/home/msomeki/00_systems/espnet3/tools/PESQ/P862_annex_A_2005_CD/source:/home/msomeki/00_systems/espnet3/tools/nkf/nkf-2.1.4:/home/msomeki/00_systems/espnet3/tools/moses/scripts/tokenizer:/home/msomeki/00_systems/espnet3/tools/moses/scripts/generic:/home/msomeki/00_systems/espnet3/tools/tools/moses/scripts/recaser:/home/msomeki/00_systems/espnet3/tools/moses/scripts/training:/home/msomeki/00_systems/espnet3/tools/mwerSegmenter:/home/msomeki/00_systems/espnet3/tools/sctk/bin:/home/msomeki/00_systems/espnet3/tools/sph2pipe:/home/msomeki/00_systems/espnet3/tools/sentencepiece_commands:/home/msomeki/.pixi/bin:/home/msomeki/local/bin:/home/msomeki/utils:/home/msomeki/.local/bin:/home/msomeki/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
70
+ PYTHONPATH=/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:../../../:../../TEMPLATE/asr:/home/msomeki/00_systems/espnet3/egs3/mini_an4/asr:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:../../../:../../TEMPLATE/asr:/home/msomeki/00_systems/espnet3/egs3/mini_an4/asr:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:../../../:../../TEMPLATE/asr:/home/msomeki/00_systems/espnet3/egs3/mini_an4/asr:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:
71
+ 2026-01-13 22:48:15 | INFO | espnet3 | Train config content:
72
+ num_device: 1
73
+ num_nodes: 1
74
+ task: espnet3.systems.asr.task.ASRTask
75
+ recipe_dir: .
76
+ data_dir: ./data
77
+ exp_tag: train_asr_rnn_data_aug_debug
78
+ exp_dir: ./exp/train_asr_rnn_data_aug_debug
79
+ stats_dir: ./exp/stats
80
+ decode_dir: ./exp/train_asr_rnn_data_aug_debug/decode
81
+ dataset_dir: ./data/mini_an4
82
+ create_dataset:
83
+ func: src.create_dataset.create_dataset
84
+ dataset_dir: ./data/mini_an4
85
+ archive_path: ./../../egs2/mini_an4/asr1/downloads.tar.gz
86
+ dataset:
87
+ _target_: espnet3.components.data.data_organizer.DataOrganizer
88
+ train:
89
+ - name: train_nodev
90
+ dataset:
91
+ _target_: src.dataset.MiniAN4Dataset
92
+ manifest_path: ./data/mini_an4/manifest/train_nodev.tsv
93
+ valid:
94
+ - name: train_dev
95
+ dataset:
96
+ _target_: src.dataset.MiniAN4Dataset
97
+ manifest_path: ./data/mini_an4/manifest/train_dev.tsv
98
+ preprocessor:
99
+ _target_: espnet2.train.preprocessor.CommonPreprocessor
100
+ _convert_: all
101
+ fs: 16000
102
+ train: true
103
+ data_aug_effects:
104
+ - - 0.1
105
+ - contrast
106
+ - enhancement_amount: 75.0
107
+ - - 0.1
108
+ - highpass
109
+ - cutoff_freq: 5000
110
+ Q: 0.707
111
+ - - 0.1
112
+ - equalization
113
+ - center_freq: 1000
114
+ gain: 0
115
+ Q: 0.707
116
+ - - 0.1
117
+ - - - 0.3
118
+ - speed_perturb
119
+ - factor: 0.9
120
+ - - 0.3
121
+ - speed_perturb
122
+ - factor: 1.1
123
+ - - 0.3
124
+ - speed_perturb
125
+ - factor: 1.3
126
+ data_aug_num:
127
+ - 1
128
+ - 4
129
+ data_aug_prob: 1.0
130
+ token_type: bpe
131
+ token_list: ./data/bpe_30/tokens.txt
132
+ bpemodel: ./data/bpe_30/bpe.model
133
+ parallel:
134
+ env: local
135
+ n_workers: 1
136
+ dataloader:
137
+ collate_fn:
138
+ _target_: espnet2.train.collate_fn.CommonCollateFn
139
+ int_pad_value: -1
140
+ train:
141
+ multiple_iterator: false
142
+ num_shards: 1
143
+ iter_factory:
144
+ _target_: espnet2.iterators.sequence_iter_factory.SequenceIterFactory
145
+ shuffle: true
146
+ collate_fn:
147
+ _target_: espnet2.train.collate_fn.CommonCollateFn
148
+ int_pad_value: -1
149
+ num_workers: 0
150
+ batches:
151
+ type: sorted
152
+ shape_files:
153
+ - ./exp/stats/train/feats_shape
154
+ batch_size: 2
155
+ batch_bins: 200000
156
+ valid:
157
+ multiple_iterator: false
158
+ num_shards: 1
159
+ iter_factory:
160
+ _target_: espnet2.iterators.sequence_iter_factory.SequenceIterFactory
161
+ shuffle: false
162
+ collate_fn:
163
+ _target_: espnet2.train.collate_fn.CommonCollateFn
164
+ int_pad_value: -1
165
+ batches:
166
+ type: sorted
167
+ shape_files:
168
+ - ./exp/stats/valid/feats_shape
169
+ batch_size: 2
170
+ batch_bins: 200000
171
+ optim:
172
+ _target_: torch.optim.Adam
173
+ lr: 0.001
174
+ weight_decay: 0.0
175
+ scheduler:
176
+ _target_: torch.optim.lr_scheduler.ReduceLROnPlateau
177
+ mode: min
178
+ factor: 0.5
179
+ patience: 1
180
+ val_scheduler_criterion: valid/loss
181
+ best_model_criterion:
182
+ - - valid/acc
183
+ - 1
184
+ - max
185
+ trainer:
186
+ accelerator: auto
187
+ devices: 1
188
+ num_nodes: 1
189
+ accumulate_grad_batches: 1
190
+ check_val_every_n_epoch: 1
191
+ gradient_clip_val: 1.0
192
+ log_every_n_steps: 1
193
+ max_epochs: 1
194
+ limit_train_batches: 1
195
+ limit_val_batches: 1
196
+ precision: 32
197
+ logger:
198
+ - _target_: lightning.pytorch.loggers.TensorBoardLogger
199
+ save_dir: ./exp/train_asr_rnn_data_aug_debug/tensorboard
200
+ name: tb_logger
201
+ strategy: auto
202
+ tokenizer:
203
+ vocab_size: 30
204
+ character_coverage: 1.0
205
+ model_type: bpe
206
+ save_path: ./data/bpe_30
207
+ text_builder:
208
+ func: src.tokenizer.gather_training_text
209
+ manifest_path: ./data/mini_an4/manifest/train_nodev.tsv
210
+ model:
211
+ vocab_size: 30
212
+ token_list: ./data/bpe_30/tokens.txt
213
+ encoder: vgg_rnn
214
+ encoder_conf:
215
+ num_layers: 1
216
+ hidden_size: 2
217
+ output_size: 2
218
+ decoder: rnn
219
+ decoder_conf:
220
+ hidden_size: 2
221
+ normalize: utterance_mvn
222
+ normalize_conf: {}
223
+ model_conf:
224
+ ctc_weight: 0.3
225
+ lsm_weight: 0.1
226
+ length_normalized_loss: false
227
+ frontend: default
228
+ frontend_conf:
229
+ n_fft: 512
230
+ win_length: 400
231
+ hop_length: 160
232
+
+ 2026-01-13 22:48:15 | INFO | espnet3 | Publish config content:
+ pack_model:
+   out_dir: exp/model_pack
+   include: []
+   extra: []
+   exclude:
+   - '**/*.log'
+   - '**/tensorboard/**'
+   - '**/wandb/**'
+ upload_model:
+   hf_repo: yourname/your-model-repo
+
+ 2026-01-13 22:48:15 | INFO | espnet3 | === [START] stage: pack_model ===
exp/run.log ADDED
The diff for this file is too large to render. See raw diff
 
exp/step1.ckpt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:799006a04b83914bd19599b0f8ab43d5f43244f75057f92e6ceefc8de5d82560
+ size 3813993
exp/train.log ADDED
@@ -0,0 +1,371 @@
1
+ 2026-01-13 22:35:57 | INFO | espnet3 | === ESPnet3 run started: 2026-01-13T22:35:57.057168 ===
2
+ 2026-01-13 22:35:57 | INFO | espnet3 | Command: /data/user_data/msomeki/espnet3/.venv/bin/python3 run.py --stages create_dataset train_tokenizer collect_stats train infer measure --train_config conf/train.yaml --infer_config conf/infer.yaml --measure_config conf/measure.yaml
3
+ 2026-01-13 22:35:57 | INFO | espnet3 | Python: 3.11.13 (main, Aug 18 2025, 19:19:13) [Clang 20.1.4 ]
4
+ 2026-01-13 22:35:57 | INFO | espnet3 | Working directory: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr
5
+ 2026-01-13 22:35:57 | INFO | espnet3 | train config: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr/conf/train_asr_rnn_data_aug_debug.yaml
6
+ 2026-01-13 22:35:57 | INFO | espnet3 | infer config: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr/conf/infer.yaml
7
+ 2026-01-13 22:35:57 | INFO | espnet3 | measure config: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr/conf/measure.yaml
8
+ 2026-01-13 22:35:57 | INFO | espnet3 | Git: commit=8509faad9811b58d5024f29fb9d68ffb026b5e73, short_commit=8509faad9, branch=espnet3/recipe/asr_ls100, worktree=dirty
9
+ 2026-01-13 22:35:57 | INFO | espnet3 | Cluster env:
10
+ OMPI_MCA_plm_slurm_args=--external-launcher
11
+ SLURM_CLUSTER_NAME=babel
12
+ SLURM_CONF=/var/spool/slurmd/conf-cache/slurm.conf
13
+ SLURM_CPUS_ON_NODE=1
14
+ SLURM_CPUS_PER_TASK=1
15
+ SLURM_CPU_BIND=quiet,mask_cpu:0x0000000000010000
16
+ SLURM_CPU_BIND_LIST=0x0000000000010000
17
+ SLURM_CPU_BIND_TYPE=mask_cpu:
18
+ SLURM_CPU_BIND_VERBOSE=quiet
19
+ SLURM_DISTRIBUTION=cyclic,pack
20
+ SLURM_GTIDS=0
21
+ SLURM_JOBID=6122041
22
+ SLURM_JOB_ACCOUNT=swatanab
23
+ SLURM_JOB_CPUS_PER_NODE=1
24
+ SLURM_JOB_END_TIME=1768401875
25
+ SLURM_JOB_GID=2709140
26
+ SLURM_JOB_GROUP=msomeki
27
+ SLURM_JOB_ID=6122041
28
+ SLURM_JOB_NAME=bash
29
+ SLURM_JOB_NODELIST=babel-o9-16
30
+ SLURM_JOB_NUM_NODES=1
31
+ SLURM_JOB_PARTITION=debug
32
+ SLURM_JOB_QOS=debug_qos
33
+ SLURM_JOB_START_TIME=1768358675
34
+ SLURM_JOB_UID=2709140
35
+ SLURM_JOB_USER=msomeki
36
+ SLURM_LAUNCH_NODE_IPADDR=172.16.1.2
37
+ SLURM_LOCALID=0
38
+ SLURM_MEM_PER_NODE=4096
39
+ SLURM_NNODES=1
40
+ SLURM_NODEID=0
41
+ SLURM_NODELIST=babel-o9-16
42
+ SLURM_NPROCS=1
43
+ SLURM_NTASKS=1
44
+ SLURM_NTASKS_PER_NODE=1
45
+ SLURM_PRIO_PROCESS=0
46
+ SLURM_PROCID=0
47
+ SLURM_PTY_PORT=40465
48
+ SLURM_PTY_WIN_COL=112
49
+ SLURM_PTY_WIN_ROW=61
50
+ SLURM_SCRIPT_CONTEXT=prolog_task
51
+ SLURM_SRUN_COMM_HOST=172.16.1.2
52
+ SLURM_SRUN_COMM_PORT=33789
53
+ SLURM_STEPID=0
54
+ SLURM_STEP_ID=0
55
+ SLURM_STEP_LAUNCHER_PORT=33789
56
+ SLURM_STEP_NODELIST=babel-o9-16
57
+ SLURM_STEP_NUM_NODES=1
58
+ SLURM_STEP_NUM_TASKS=1
59
+ SLURM_STEP_TASKS_PER_NODE=1
60
+ SLURM_SUBMIT_DIR=/home/msomeki/00_systems/espnet3
61
+ SLURM_SUBMIT_HOST=login1
62
+ SLURM_TASKS_PER_NODE=1
63
+ SLURM_TASK_PID=3334910
64
+ SLURM_TOPOLOGY_ADDR=babel-o9-16
65
+ SLURM_TOPOLOGY_ADDR_PATTERN=node
66
+ SLURM_TRES_PER_TASK=cpu=1
67
+ SLURM_UMASK=0027
68
+ 2026-01-13 22:35:57 | INFO | espnet3 | Runtime env:
69
+ LD_LIBRARY_PATH=/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:
70
+ PATH=/home/msomeki/00_systems/espnet3/tools/ffmpeg-release:/home/msomeki/00_systems/espnet3/tools/festival/bin:/home/msomeki/00_systems/espnet3/tools/MBROLA/Bin:/home/msomeki/00_systems/espnet3/tools/espeak-ng/bin:/home/msomeki/00_systems/espnet3/tools/BeamformIt:/home/msomeki/00_systems/espnet3/tools/kenlm/build/bin:/home/msomeki/00_systems/espnet3/tools/PESQ/P862_annex_A_2005_CD/source:/home/msomeki/00_systems/espnet3/tools/nkf/nkf-2.1.4:/home/msomeki/00_systems/espnet3/tools/moses/scripts/tokenizer:/home/msomeki/00_systems/espnet3/tools/moses/scripts/generic:/home/msomeki/00_systems/espnet3/tools/tools/moses/scripts/recaser:/home/msomeki/00_systems/espnet3/tools/moses/scripts/training:/home/msomeki/00_systems/espnet3/tools/mwerSegmenter:/home/msomeki/00_systems/espnet3/tools/sctk/bin:/home/msomeki/00_systems/espnet3/tools/sph2pipe:/home/msomeki/00_systems/espnet3/tools/sentencepiece_commands:/data/user_data/msomeki/espnet3/.venv/bin:/home/msomeki/.pixi/bin:/home/msomeki/local/bin:/home/msomeki/utils:/usr/share/Modules/bin:/home/msomeki/00_systems/espnet3/tools/ffmpeg-release:/home/msomeki/00_systems/espnet3/tools/festival/bin:/home/msomeki/00_systems/espnet3/tools/MBROLA/Bin:/home/msomeki/00_systems/espnet3/tools/espeak-ng/bin:/home/msomeki/00_systems/espnet3/tools/BeamformIt:/home/msomeki/00_systems/espnet3/tools/kenlm/build/bin:/home/msomeki/00_systems/espnet3/tools/PESQ/P862_annex_A_2005_CD/source:/home/msomeki/00_systems/espnet3/tools/nkf/nkf-2.1.4:/home/msomeki/00_systems/espnet3/tools/moses/scripts/tokenizer:/home/msomeki/00_systems/espnet3/tools/moses/scripts/generic:/home/msomeki/00_systems/espnet3/tools/tools/moses/scripts/recaser:/home/msomeki/00_systems/espnet3/tools/moses/scripts/training:/home/msomeki/00_systems/espnet3/tools/mwerSegmenter:/home/msomeki/00_systems/espnet3/tools/sctk/bin:/home/msomeki/00_systems/espnet3/tools/sph2pipe:/home/msomeki/00_systems/espnet3/tools/sentencepiece_commands:/home/msomeki/00_systems/espnet3/tools/ffmpeg-release:
/home/msomeki/00_systems/espnet3/tools/festival/bin:/home/msomeki/00_systems/espnet3/tools/MBROLA/Bin:/home/msomeki/00_systems/espnet3/tools/espeak-ng/bin:/home/msomeki/00_systems/espnet3/tools/BeamformIt:/home/msomeki/00_systems/espnet3/tools/kenlm/build/bin:/home/msomeki/00_systems/espnet3/tools/PESQ/P862_annex_A_2005_CD/source:/home/msomeki/00_systems/espnet3/tools/nkf/nkf-2.1.4:/home/msomeki/00_systems/espnet3/tools/moses/scripts/tokenizer:/home/msomeki/00_systems/espnet3/tools/moses/scripts/generic:/home/msomeki/00_systems/espnet3/tools/tools/moses/scripts/recaser:/home/msomeki/00_systems/espnet3/tools/moses/scripts/training:/home/msomeki/00_systems/espnet3/tools/mwerSegmenter:/home/msomeki/00_systems/espnet3/tools/sctk/bin:/home/msomeki/00_systems/espnet3/tools/sph2pipe:/home/msomeki/00_systems/espnet3/tools/sentencepiece_commands:/home/msomeki/.pixi/bin:/home/msomeki/local/bin:/home/msomeki/utils:/home/msomeki/.local/bin:/home/msomeki/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
71
+ PYTHONPATH=/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:../../../:../../TEMPLATE/asr:/home/msomeki/00_systems/espnet3/egs3/mini_an4/asr:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:../../../:../../TEMPLATE/asr:/home/msomeki/00_systems/espnet3/egs3/mini_an4/asr:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:../../../:../../TEMPLATE/asr:/home/msomeki/00_systems/espnet3/egs3/mini_an4/asr:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:
72
+ 2026-01-13 22:35:57 | INFO | espnet3 | Train config content:
73
+ num_device: 1
74
+ num_nodes: 1
75
+ task: espnet3.systems.asr.task.ASRTask
76
+ recipe_dir: .
77
+ data_dir: ./data
78
+ exp_tag: train_asr_rnn_data_aug_debug
79
+ exp_dir: ./exp/train_asr_rnn_data_aug_debug
80
+ stats_dir: ./exp/stats
81
+ decode_dir: ./exp/train_asr_rnn_data_aug_debug/decode
82
+ dataset_dir: ./data/mini_an4
83
+ create_dataset:
84
+ func: src.create_dataset.create_dataset
85
+ dataset_dir: ./data/mini_an4
86
+ archive_path: ./../../egs2/mini_an4/asr1/downloads.tar.gz
87
+ dataset:
88
+ _target_: espnet3.components.data.data_organizer.DataOrganizer
89
+ train:
90
+ - name: train_nodev
91
+ dataset:
92
+ _target_: src.dataset.MiniAN4Dataset
93
+ manifest_path: ./data/mini_an4/manifest/train_nodev.tsv
94
+ valid:
95
+ - name: train_dev
96
+ dataset:
97
+ _target_: src.dataset.MiniAN4Dataset
98
+ manifest_path: ./data/mini_an4/manifest/train_dev.tsv
99
+ preprocessor:
100
+ _target_: espnet2.train.preprocessor.CommonPreprocessor
101
+ _convert_: all
102
+ fs: 16000
103
+ train: true
104
+ data_aug_effects:
105
+ - - 0.1
106
+ - contrast
107
+ - enhancement_amount: 75.0
108
+ - - 0.1
109
+ - highpass
110
+ - cutoff_freq: 5000
111
+ Q: 0.707
112
+ - - 0.1
113
+ - equalization
114
+ - center_freq: 1000
115
+ gain: 0
116
+ Q: 0.707
117
+ - - 0.1
118
+ - - - 0.3
119
+ - speed_perturb
120
+ - factor: 0.9
121
+ - - 0.3
122
+ - speed_perturb
123
+ - factor: 1.1
124
+ - - 0.3
125
+ - speed_perturb
126
+ - factor: 1.3
127
+ data_aug_num:
128
+ - 1
129
+ - 4
130
+ data_aug_prob: 1.0
131
+ token_type: bpe
132
+ token_list: ./data/bpe_30/tokens.txt
133
+ bpemodel: ./data/bpe_30/bpe.model
134
+ parallel:
135
+ env: local
136
+ n_workers: 1
137
+ options: {}
138
+ dataloader:
139
+ collate_fn:
140
+ _target_: espnet2.train.collate_fn.CommonCollateFn
141
+ int_pad_value: -1
142
+ train:
143
+ multiple_iterator: false
144
+ num_shards: 1
145
+ iter_factory:
146
+ _target_: espnet2.iterators.sequence_iter_factory.SequenceIterFactory
147
+ shuffle: true
148
+ collate_fn:
149
+ _target_: espnet2.train.collate_fn.CommonCollateFn
150
+ int_pad_value: -1
151
+ num_workers: 0
152
+ batches:
153
+ type: sorted
154
+ shape_files:
155
+ - ./exp/stats/train/feats_shape
156
+ batch_size: 2
157
+ batch_bins: 200000
158
+ valid:
159
+ multiple_iterator: false
160
+ num_shards: 1
161
+ iter_factory:
162
+ _target_: espnet2.iterators.sequence_iter_factory.SequenceIterFactory
163
+ shuffle: false
164
+ collate_fn:
165
+ _target_: espnet2.train.collate_fn.CommonCollateFn
166
+ int_pad_value: -1
167
+ batches:
168
+ type: sorted
169
+ shape_files:
170
+ - ./exp/stats/valid/feats_shape
171
+ batch_size: 2
172
+ batch_bins: 200000
173
+ optim:
174
+ _target_: torch.optim.Adam
175
+ lr: 0.001
176
+ weight_decay: 0.0
177
+ scheduler:
178
+ _target_: torch.optim.lr_scheduler.ReduceLROnPlateau
179
+ mode: min
180
+ factor: 0.5
181
+ patience: 1
182
+ val_scheduler_criterion: valid/loss
183
+ best_model_criterion:
184
+ - - valid/acc
185
+ - 1
186
+ - max
187
+ trainer:
188
+ devices: 1
189
+ num_nodes: 1
190
+ accumulate_grad_batches: 1
191
+ check_val_every_n_epoch: 1
192
+ gradient_clip_val: 1.0
193
+ log_every_n_steps: 1
194
+ max_epochs: 1
195
+ limit_train_batches: 1
196
+ limit_val_batches: 1
197
+ precision: 32
198
+ reload_dataloaders_every_n_epochs: 1
199
+ use_distributed_sampler: false
200
+ tokenizer:
201
+ vocab_size: 30
202
+ character_coverage: 1.0
203
+ model_type: bpe
204
+ save_path: ./data/bpe_30
205
+ text_builder:
206
+ func: src.tokenizer.gather_training_text
207
+ manifest_path: ./data/mini_an4/manifest/train_nodev.tsv
208
+ model:
209
+ vocab_size: 30
210
+ token_list: ./data/bpe_30/tokens.txt
211
+ encoder: vgg_rnn
212
+ encoder_conf:
213
+ num_layers: 1
214
+ hidden_size: 2
215
+ output_size: 2
216
+ decoder: rnn
217
+ decoder_conf:
218
+ hidden_size: 2
219
+ model_conf:
220
+ ctc_weight: 0.3
221
+ lsm_weight: 0.1
222
+ length_normalized_loss: false
223
+ frontend: default
224
+ frontend_conf:
225
+ n_fft: 512
226
+ win_length: 400
227
+ hop_length: 160
228
+
229
+ 2026-01-13 22:35:57 | INFO | espnet3 | Infer config content:
230
+ num_device: 1
231
+ num_nodes: 1
232
+ recipe_dir: .
233
+ data_dir: ./data
234
+ exp_tag: train_asr_rnn_data_aug_debug
235
+ exp_dir: ./exp/train_asr_rnn_data_aug_debug
236
+ stats_dir: ./exp/stats
237
+ decode_dir: ./exp/train_asr_rnn_data_aug_debug/decode
238
+ dataset_dir: ./data/mini_an4
239
+ dataset:
240
+ _target_: espnet3.components.data.data_organizer.DataOrganizer
241
+ test:
242
+ - name: test
243
+ dataset:
244
+ _target_: src.dataset.MiniAN4Dataset
245
+ manifest_path: ./data/mini_an4/manifest/test.tsv
246
+ parallel:
247
+ env: local
248
+ n_workers: 1
249
+ model:
250
+ _target_: espnet2.bin.asr_inference.Speech2Text
251
+ asr_train_config: ./exp/train_asr_rnn_data_aug_debug/config.yaml
252
+ asr_model_file: ./exp/train_asr_rnn_data_aug_debug/last.ckpt
253
+ beam_size: 1
254
+ ctc_weight: 0.3
255
+ tokenizer:
256
+ vocab_size: 30
257
+ character_coverage: 1.0
258
+ model_type: bpe
259
+ save_path: ./data/bpe_30
260
+
261
+ 2026-01-13 22:35:57 | INFO | espnet3 | Measure config content:
262
+ recipe_dir: .
263
+ data_dir: ./data
264
+ exp_tag: train_asr_rnn_data_aug_debug
265
+ exp_dir: ./exp/train_asr_rnn_data_aug_debug
266
+ stats_dir: ./exp/stats
267
+ decode_dir: ./exp/train_asr_rnn_data_aug_debug/decode
268
+ dataset_dir: ./data/mini_an4
269
+ dataset:
270
+ _target_: espnet3.components.data.data_organizer.DataOrganizer
271
+ test:
272
+ - name: test
273
+ dataset:
274
+ _target_: src.dataset.MiniAN4Dataset
275
+ manifest_path: ./data/mini_an4/manifest/test.tsv
276
+ metrics:
277
+ - metric:
278
+ _target_: espnet3.systems.asr.metrics.wer.WER
279
+ clean_types: null
280
+ - metric:
281
+ _target_: espnet3.systems.asr.metrics.cer.CER
282
+ clean_types: null
283
+
284
+ 2026-01-13 22:35:57 | INFO | espnet3 | === [START] stage: train ===
285
+ 2026-01-13 22:35:57 | INFO | espnet3.systems.asr.system | ASRSystem.train(): starting training process
286
+ 2026-01-13 22:35:57 | INFO | espnet3.systems.base.system | Training start | exp_dir=./exp/train_asr_rnn_data_aug_debug model=<unknown>
287
+ 2026-01-13 22:35:57 | INFO | root | Vocabulary size: 30
288
+ 2026-01-13 22:35:57 | INFO | espnet3.systems.base.train | Model:
289
+ ESPnetASRModel(
290
+ (frontend): DefaultFrontend(
291
+ (stft): Stft(n_fft=512, win_length=400, hop_length=160, center=True, normalized=False, onesided=True)
292
+ (frontend): Frontend()
293
+ (logmel): LogMel(sr=16000, n_fft=512, n_mels=80, fmin=0, fmax=8000.0, htk=False)
294
+ )
295
+ (normalize): UtteranceMVN(norm_means=True, norm_vars=False)
296
+ (encoder): VGGRNNEncoder(
297
+ (enc): ModuleList(
298
+ (0): VGG2L(
299
+ (conv1_1): Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
300
+ (conv1_2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
301
+ (conv2_1): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
302
+ (conv2_2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
303
+ )
304
+ (1): RNNP(
305
+ (birnn0): LSTM(2560, 2, batch_first=True, bidirectional=True)
306
+ (bt0): Linear(in_features=4, out_features=2, bias=True)
307
+ )
308
+ )
309
+ )
310
+ (decoder): RNNDecoder(
311
+ (embed): Embedding(30, 2)
312
+ (dropout_emb): Dropout(p=0.0, inplace=False)
313
+ (decoder): ModuleList(
314
+ (0): LSTMCell(4, 2)
315
+ )
316
+ (dropout_dec): ModuleList(
317
+ (0): Dropout(p=0.0, inplace=False)
318
+ )
319
+ (output): Linear(in_features=2, out_features=30, bias=True)
320
+ (att_list): ModuleList(
321
+ (0): AttLoc(
322
+ (mlp_enc): Linear(in_features=2, out_features=320, bias=True)
323
+ (mlp_dec): Linear(in_features=2, out_features=320, bias=False)
324
+ (mlp_att): Linear(in_features=10, out_features=320, bias=False)
325
+ (loc_conv): Conv2d(1, 10, kernel_size=(1, 201), stride=(1, 1), padding=(0, 100), bias=False)
326
+ (gvec): Linear(in_features=320, out_features=1, bias=True)
327
+ )
328
+ )
329
+ )
330
+ (criterion_att): LabelSmoothingLoss(
331
+ (criterion): KLDivLoss()
332
+ )
333
+ (ctc): CTC(
334
+ (ctc_lo): Linear(in_features=2, out_features=30, bias=True)
335
+ (ctc_loss): CTCLoss()
336
+ )
337
+ )
338
+ 2026-01-13 22:35:57 | WARNING | py.warnings | /data/user_data/msomeki/espnet3/.venv/lib/python3.11/site-packages/lightning/fabric/plugins/environments/slurm.py:204: The `srun` command is available on your system but is not used. HINT: If your intention is to run Lightning on SLURM, prepend your python command with `srun` like so: srun python3 run.py --stages create_dataset train_tokenizer coll ...
339
+
340
+ 2026-01-13 22:35:57 | INFO | lightning.pytorch.utilities.rank_zero | GPU available: False, used: False
341
+ 2026-01-13 22:35:57 | INFO | lightning.pytorch.utilities.rank_zero | TPU available: False, using: 0 TPU cores
342
+ 2026-01-13 22:35:57 | INFO | lightning.pytorch.utilities.rank_zero | `Trainer(limit_train_batches=1)` was configured so 1 batch per epoch will be used.
343
+ 2026-01-13 22:35:57 | INFO | lightning.pytorch.utilities.rank_zero | `Trainer(limit_val_batches=1)` was configured so 1 batch will be used.
344
+ 2026-01-13 22:35:58 | WARNING | py.warnings | /data/user_data/msomeki/espnet3/.venv/lib/python3.11/site-packages/lightning/pytorch/callbacks/model_checkpoint.py:881: Checkpoint directory /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr/exp/train_asr_rnn_data_aug_debug exists and is not empty.
345
+
+ 2026-01-13 22:35:58 | INFO | lightning.pytorch.callbacks.model_summary |
+   | Name  | Type           | Params | Mode  | FLOPs
+ ---------------------------------------------------------
+ 0 | model | ESPnetASRModel | 307 K  | train | 0
+ ---------------------------------------------------------
+ 307 K     Trainable params
+ 0         Non-trainable params
+ 307 K     Total params
+ 1.230     Total estimated model params size (MB)
+ 35        Modules in train mode
+ 1         Modules in eval mode
+ 0         Total Flops
358
+ 2026-01-13 22:35:58 | WARNING | py.warnings | /home/msomeki/00_systems/espnet3/espnet2/asr/espnet_model.py:402: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
359
+ with autocast(self.autocast_frontend, dtype=autocast_type):
360
+
361
+ 2026-01-13 22:35:58 | WARNING | root | Using make_pad_mask with a list of lengths is not tracable. If you try to trace this function with type(lengths) == list, please change the type of lengths to torch.LongTensor.
362
+ 2026-01-13 22:35:58 | WARNING | root | Using make_pad_mask with a list of lengths is not tracable. If you try to trace this function with type(lengths) == list, please change the type of lengths to torch.LongTensor.
363
+ 2026-01-13 22:35:58 | WARNING | py.warnings | /data/user_data/msomeki/espnet3/.venv/lib/python3.11/site-packages/lightning/pytorch/loops/fit_loop.py:534: Found 1 module(s) in eval mode at the start of training. This may lead to unexpected behavior during training. If this is intentional, you can ignore this warning.
364
+
365
+ 2026-01-13 22:35:58 | WARNING | root | Using make_pad_mask with a list of lengths is not tracable. If you try to trace this function with type(lengths) == list, please change the type of lengths to torch.LongTensor.
366
+ 2026-01-13 22:35:58 | WARNING | root | Using make_pad_mask with a list of lengths is not tracable. If you try to trace this function with type(lengths) == list, please change the type of lengths to torch.LongTensor.
367
+ 2026-01-13 22:35:58 | WARNING | root | Using make_pad_mask with a list of lengths is not tracable. If you try to trace this function with type(lengths) == list, please change the type of lengths to torch.LongTensor.
368
+ 2026-01-13 22:35:58 | WARNING | root | Using make_pad_mask with a list of lengths is not tracable. If you try to trace this function with type(lengths) == list, please change the type of lengths to torch.LongTensor.
369
+ 2026-01-13 22:35:58 | INFO | lightning.pytorch.utilities.rank_zero | `Trainer.fit` stopped: `max_epochs=1` reached.
370
+ 2026-01-13 22:35:58 | INFO | espnet3.systems.base.train | Training finished in 1.12s | exp_dir=./exp/train_asr_rnn_data_aug_debug model=None
371
+ 2026-01-13 22:35:58 | INFO | espnet3 | === [DONE] stage: train (1.13s) ===
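The "307 K" parameter count and the `LSTM(2560, 2, ...)` input width reported in train.log above can be cross-checked by hand from the printed module shapes. The following is a minimal sketch in plain Python, not code from the recipe. The helper names (`conv2d`, `linear`, `lstm`) are hypothetical, and the two 2x max-pooling steps inside `VGG2L` are an assumption, since the module printout does not show them.

```python
def conv2d(c_in, c_out, kh, kw, bias=True):
    """Parameter count of a Conv2d layer."""
    return c_out * c_in * kh * kw + (c_out if bias else 0)


def linear(n_in, n_out, bias=True):
    """Parameter count of a Linear layer."""
    return n_in * n_out + (n_out if bias else 0)


def lstm(n_in, hidden, bidirectional=False):
    """Parameter count of a one-layer LSTM / LSTMCell (PyTorch layout:
    4 gates, separate input-side and hidden-side bias vectors)."""
    per_direction = 4 * hidden * (n_in + hidden) + 2 * 4 * hidden
    return per_direction * (2 if bidirectional else 1)


# Assumption: VGG2L halves the 80 mel bins twice, leaving 20 bins x 128 channels.
lstm_input = (80 // 2 // 2) * 128

encoder = (
    conv2d(1, 64, 3, 3) + conv2d(64, 64, 3, 3)
    + conv2d(64, 128, 3, 3) + conv2d(128, 128, 3, 3)
    + lstm(lstm_input, 2, bidirectional=True)   # birnn0
    + linear(4, 2)                              # bt0
)
attention = (
    linear(2, 320) + linear(2, 320, bias=False) + linear(10, 320, bias=False)
    + conv2d(1, 10, 1, 201, bias=False)         # loc_conv
    + linear(320, 1)                            # gvec
)
decoder = 30 * 2 + lstm(4, 2) + linear(2, 30) + attention  # embed, LSTMCell, output
ctc = linear(2, 30)                                        # ctc_lo

total = encoder + decoder + ctc
size_mb = total * 4 / 1e6  # float32, matching `precision: 32`

print(lstm_input, total, round(size_mb, 3))
```

Under these assumptions the total is 307477 parameters, which agrees with the "307 K" and "1.230" MB figures in the summary. Almost all of them sit in the four VGG convolutions, not in the size-2 RNN layers.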
scores.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "espnet3.systems.asr.metrics.wer.WER": {
+     "test": {
+       "WER": 100.0
+     }
+   },
+   "espnet3.systems.asr.metrics.cer.CER": {
+     "test": {
+       "CER": 355.22
+     }
+   }
+ }
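A CER of 355.22 in scores.json may look like an error, but an edit-distance-based error rate is not capped at 100%: insertions count as errors while the denominator is the reference length only. The sketch below illustrates this with the standard Levenshtein definition. It is not the implementation behind `espnet3.systems.asr.metrics.cer.CER`, which is not shown in this diff; the function names and the example strings are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance (substitutions, deletions, insertions) between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a reference symbol
                curr[j - 1] + 1,         # insert a hypothesis symbol
                prev[j - 1] + (r != h),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def char_error_rate(ref, hyp):
    """Character error rate in percent, normalised by the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# A short reference paired with a long, unrelated hypothesis gives a rate above 100%.
print(char_error_rate("GO", "EEEEEEE"))
```

Here the two-character reference needs two substitutions and five insertions, so the rate is 350%. A model trained for a single batch with hidden size 2, as in this debug run, could plausibly emit long repetitive hypotheses that produce scores of this kind.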
stats/collect_stats.log ADDED
@@ -0,0 +1,351 @@
1
+ 2026-01-13 22:35:50 | INFO | espnet3 | === ESPnet3 run started: 2026-01-13T22:35:50.223210 ===
2
+ 2026-01-13 22:35:50 | INFO | espnet3 | Command: /data/user_data/msomeki/espnet3/.venv/bin/python3 run.py --stages create_dataset train_tokenizer collect_stats train infer measure --train_config conf/train.yaml --infer_config conf/infer.yaml --measure_config conf/measure.yaml
3
+ 2026-01-13 22:35:50 | INFO | espnet3 | Python: 3.11.13 (main, Aug 18 2025, 19:19:13) [Clang 20.1.4 ]
4
+ 2026-01-13 22:35:50 | INFO | espnet3 | Working directory: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr
5
+ 2026-01-13 22:35:50 | INFO | espnet3 | train config: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr/conf/train_asr_rnn_data_aug_debug.yaml
6
+ 2026-01-13 22:35:50 | INFO | espnet3 | infer config: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr/conf/infer.yaml
7
+ 2026-01-13 22:35:50 | INFO | espnet3 | measure config: /home/msomeki/00_systems/espnet3/egs3/mini_an4/asr/conf/measure.yaml
8
+ 2026-01-13 22:35:50 | INFO | espnet3 | Git: commit=8509faad9811b58d5024f29fb9d68ffb026b5e73, short_commit=8509faad9, branch=espnet3/recipe/asr_ls100, worktree=dirty
9
+ 2026-01-13 22:35:50 | INFO | espnet3 | Cluster env:
10
+ OMPI_MCA_plm_slurm_args=--external-launcher
11
+ SLURM_CLUSTER_NAME=babel
12
+ SLURM_CONF=/var/spool/slurmd/conf-cache/slurm.conf
13
+ SLURM_CPUS_ON_NODE=1
14
+ SLURM_CPUS_PER_TASK=1
15
+ SLURM_CPU_BIND=quiet,mask_cpu:0x0000000000010000
16
+ SLURM_CPU_BIND_LIST=0x0000000000010000
17
+ SLURM_CPU_BIND_TYPE=mask_cpu:
18
+ SLURM_CPU_BIND_VERBOSE=quiet
19
+ SLURM_DISTRIBUTION=cyclic,pack
20
+ SLURM_GTIDS=0
21
+ SLURM_JOBID=6122041
22
+ SLURM_JOB_ACCOUNT=swatanab
23
+ SLURM_JOB_CPUS_PER_NODE=1
24
+ SLURM_JOB_END_TIME=1768401875
25
+ SLURM_JOB_GID=2709140
26
+ SLURM_JOB_GROUP=msomeki
27
+ SLURM_JOB_ID=6122041
28
+ SLURM_JOB_NAME=bash
29
+ SLURM_JOB_NODELIST=babel-o9-16
30
+ SLURM_JOB_NUM_NODES=1
31
+ SLURM_JOB_PARTITION=debug
32
+ SLURM_JOB_QOS=debug_qos
33
+ SLURM_JOB_START_TIME=1768358675
34
+ SLURM_JOB_UID=2709140
35
+ SLURM_JOB_USER=msomeki
36
+ SLURM_LAUNCH_NODE_IPADDR=172.16.1.2
37
+ SLURM_LOCALID=0
38
+ SLURM_MEM_PER_NODE=4096
39
+ SLURM_NNODES=1
40
+ SLURM_NODEID=0
41
+ SLURM_NODELIST=babel-o9-16
42
+ SLURM_NPROCS=1
43
+ SLURM_NTASKS=1
44
+ SLURM_NTASKS_PER_NODE=1
45
+ SLURM_PRIO_PROCESS=0
46
+ SLURM_PROCID=0
47
+ SLURM_PTY_PORT=40465
48
+ SLURM_PTY_WIN_COL=112
49
+ SLURM_PTY_WIN_ROW=61
50
+ SLURM_SCRIPT_CONTEXT=prolog_task
51
+ SLURM_SRUN_COMM_HOST=172.16.1.2
52
+ SLURM_SRUN_COMM_PORT=33789
53
+ SLURM_STEPID=0
54
+ SLURM_STEP_ID=0
55
+ SLURM_STEP_LAUNCHER_PORT=33789
56
+ SLURM_STEP_NODELIST=babel-o9-16
57
+ SLURM_STEP_NUM_NODES=1
58
+ SLURM_STEP_NUM_TASKS=1
59
+ SLURM_STEP_TASKS_PER_NODE=1
60
+ SLURM_SUBMIT_DIR=/home/msomeki/00_systems/espnet3
61
+ SLURM_SUBMIT_HOST=login1
62
+ SLURM_TASKS_PER_NODE=1
63
+ SLURM_TASK_PID=3334910
64
+ SLURM_TOPOLOGY_ADDR=babel-o9-16
65
+ SLURM_TOPOLOGY_ADDR_PATTERN=node
66
+ SLURM_TRES_PER_TASK=cpu=1
67
+ SLURM_UMASK=0027
68
+ 2026-01-13 22:35:50 | INFO | espnet3 | Runtime env:
69
+ LD_LIBRARY_PATH=/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:/home/msomeki/00_systems/espnet3/tools/espeak-ng/lib:/home/msomeki/00_systems/espnet3/tools/lib:/home/msomeki/00_systems/espnet3/tools/lib64:
70
+ PATH=/home/msomeki/00_systems/espnet3/tools/ffmpeg-release:/home/msomeki/00_systems/espnet3/tools/festival/bin:/home/msomeki/00_systems/espnet3/tools/MBROLA/Bin:/home/msomeki/00_systems/espnet3/tools/espeak-ng/bin:/home/msomeki/00_systems/espnet3/tools/BeamformIt:/home/msomeki/00_systems/espnet3/tools/kenlm/build/bin:/home/msomeki/00_systems/espnet3/tools/PESQ/P862_annex_A_2005_CD/source:/home/msomeki/00_systems/espnet3/tools/nkf/nkf-2.1.4:/home/msomeki/00_systems/espnet3/tools/moses/scripts/tokenizer:/home/msomeki/00_systems/espnet3/tools/moses/scripts/generic:/home/msomeki/00_systems/espnet3/tools/tools/moses/scripts/recaser:/home/msomeki/00_systems/espnet3/tools/moses/scripts/training:/home/msomeki/00_systems/espnet3/tools/mwerSegmenter:/home/msomeki/00_systems/espnet3/tools/sctk/bin:/home/msomeki/00_systems/espnet3/tools/sph2pipe:/home/msomeki/00_systems/espnet3/tools/sentencepiece_commands:/data/user_data/msomeki/espnet3/.venv/bin:/home/msomeki/.pixi/bin:/home/msomeki/local/bin:/home/msomeki/utils:/usr/share/Modules/bin:/home/msomeki/00_systems/espnet3/tools/ffmpeg-release:/home/msomeki/00_systems/espnet3/tools/festival/bin:/home/msomeki/00_systems/espnet3/tools/MBROLA/Bin:/home/msomeki/00_systems/espnet3/tools/espeak-ng/bin:/home/msomeki/00_systems/espnet3/tools/BeamformIt:/home/msomeki/00_systems/espnet3/tools/kenlm/build/bin:/home/msomeki/00_systems/espnet3/tools/PESQ/P862_annex_A_2005_CD/source:/home/msomeki/00_systems/espnet3/tools/nkf/nkf-2.1.4:/home/msomeki/00_systems/espnet3/tools/moses/scripts/tokenizer:/home/msomeki/00_systems/espnet3/tools/moses/scripts/generic:/home/msomeki/00_systems/espnet3/tools/tools/moses/scripts/recaser:/home/msomeki/00_systems/espnet3/tools/moses/scripts/training:/home/msomeki/00_systems/espnet3/tools/mwerSegmenter:/home/msomeki/00_systems/espnet3/tools/sctk/bin:/home/msomeki/00_systems/espnet3/tools/sph2pipe:/home/msomeki/00_systems/espnet3/tools/sentencepiece_commands:/home/msomeki/00_systems/espnet3/tools/ffmpeg-release:
/home/msomeki/00_systems/espnet3/tools/festival/bin:/home/msomeki/00_systems/espnet3/tools/MBROLA/Bin:/home/msomeki/00_systems/espnet3/tools/espeak-ng/bin:/home/msomeki/00_systems/espnet3/tools/BeamformIt:/home/msomeki/00_systems/espnet3/tools/kenlm/build/bin:/home/msomeki/00_systems/espnet3/tools/PESQ/P862_annex_A_2005_CD/source:/home/msomeki/00_systems/espnet3/tools/nkf/nkf-2.1.4:/home/msomeki/00_systems/espnet3/tools/moses/scripts/tokenizer:/home/msomeki/00_systems/espnet3/tools/moses/scripts/generic:/home/msomeki/00_systems/espnet3/tools/tools/moses/scripts/recaser:/home/msomeki/00_systems/espnet3/tools/moses/scripts/training:/home/msomeki/00_systems/espnet3/tools/mwerSegmenter:/home/msomeki/00_systems/espnet3/tools/sctk/bin:/home/msomeki/00_systems/espnet3/tools/sph2pipe:/home/msomeki/00_systems/espnet3/tools/sentencepiece_commands:/home/msomeki/.pixi/bin:/home/msomeki/local/bin:/home/msomeki/utils:/home/msomeki/.local/bin:/home/msomeki/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
71
+ PYTHONPATH=/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:../../../:../../TEMPLATE/asr:/home/msomeki/00_systems/espnet3/egs3/mini_an4/asr:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:../../../:../../TEMPLATE/asr:/home/msomeki/00_systems/espnet3/egs3/mini_an4/asr:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:../../../:../../TEMPLATE/asr:/home/msomeki/00_systems/espnet3/egs3/mini_an4/asr:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3:/home/msomeki/00_systems/espnet3/tools/RawNet/python/RawNet3/models:
72
+ 2026-01-13 22:35:50 | INFO | espnet3 | Train config content:
73
+ num_device: 1
74
+ num_nodes: 1
75
+ task: espnet3.systems.asr.task.ASRTask
76
+ recipe_dir: .
77
+ data_dir: ./data
78
+ exp_tag: train_asr_rnn_data_aug_debug
79
+ exp_dir: ./exp/train_asr_rnn_data_aug_debug
80
+ stats_dir: ./exp/stats
81
+ decode_dir: ./exp/train_asr_rnn_data_aug_debug/decode
82
+ dataset_dir: ./data/mini_an4
83
+ create_dataset:
84
+ func: src.create_dataset.create_dataset
85
+ dataset_dir: ./data/mini_an4
86
+ archive_path: ./../../egs2/mini_an4/asr1/downloads.tar.gz
87
+ dataset:
88
+ _target_: espnet3.components.data.data_organizer.DataOrganizer
89
+ train:
90
+ - name: train_nodev
91
+ dataset:
92
+ _target_: src.dataset.MiniAN4Dataset
93
+ manifest_path: ./data/mini_an4/manifest/train_nodev.tsv
94
+ valid:
95
+ - name: train_dev
96
+ dataset:
97
+ _target_: src.dataset.MiniAN4Dataset
98
+ manifest_path: ./data/mini_an4/manifest/train_dev.tsv
99
+ preprocessor:
100
+ _target_: espnet2.train.preprocessor.CommonPreprocessor
101
+ _convert_: all
102
+ fs: 16000
103
+ train: true
104
+ data_aug_effects:
105
+ - - 0.1
106
+ - contrast
107
+ - enhancement_amount: 75.0
108
+ - - 0.1
109
+ - highpass
110
+ - cutoff_freq: 5000
111
+ Q: 0.707
112
+ - - 0.1
113
+ - equalization
114
+ - center_freq: 1000
115
+ gain: 0
116
+ Q: 0.707
117
+ - - 0.1
118
+ - - - 0.3
119
+ - speed_perturb
120
+ - factor: 0.9
121
+ - - 0.3
122
+ - speed_perturb
123
+ - factor: 1.1
124
+ - - 0.3
125
+ - speed_perturb
126
+ - factor: 1.3
127
+ data_aug_num:
128
+ - 1
129
+ - 4
130
+ data_aug_prob: 1.0
131
+ token_type: bpe
132
+ token_list: ./data/bpe_30/tokens.txt
133
+ bpemodel: ./data/bpe_30/bpe.model
134
+ parallel:
135
+ env: local
136
+ n_workers: 1
137
+ dataloader:
138
+ collate_fn:
139
+ _target_: espnet2.train.collate_fn.CommonCollateFn
140
+ int_pad_value: -1
141
+ train:
142
+ multiple_iterator: false
143
+ num_shards: 1
144
+ iter_factory:
145
+ _target_: espnet2.iterators.sequence_iter_factory.SequenceIterFactory
146
+ shuffle: true
147
+ collate_fn:
148
+ _target_: espnet2.train.collate_fn.CommonCollateFn
149
+ int_pad_value: -1
150
+ num_workers: 0
151
+ batches:
152
+ type: sorted
153
+ shape_files:
154
+ - ./exp/stats/train/feats_shape
155
+ batch_size: 2
156
+ batch_bins: 200000
157
+ valid:
158
+ multiple_iterator: false
159
+ num_shards: 1
160
+ iter_factory:
161
+ _target_: espnet2.iterators.sequence_iter_factory.SequenceIterFactory
162
+ shuffle: false
163
+ collate_fn:
164
+ _target_: espnet2.train.collate_fn.CommonCollateFn
165
+ int_pad_value: -1
166
+ batches:
167
+ type: sorted
168
+ shape_files:
169
+ - ./exp/stats/valid/feats_shape
170
+ batch_size: 2
171
+ batch_bins: 200000
172
+ optim:
173
+ _target_: torch.optim.Adam
174
+ lr: 0.001
175
+ weight_decay: 0.0
176
+ scheduler:
177
+ _target_: torch.optim.lr_scheduler.ReduceLROnPlateau
178
+ mode: min
179
+ factor: 0.5
180
+ patience: 1
181
+ val_scheduler_criterion: valid/loss
182
+ best_model_criterion:
183
+ - - valid/acc
184
+ - 1
185
+ - max
186
+ trainer:
187
+ accelerator: auto
188
+ devices: 1
189
+ num_nodes: 1
190
+ accumulate_grad_batches: 1
191
+ check_val_every_n_epoch: 1
192
+ gradient_clip_val: 1.0
193
+ log_every_n_steps: 1
194
+ max_epochs: 1
195
+ limit_train_batches: 1
196
+ limit_val_batches: 1
197
+ precision: 32
198
+ logger:
199
+ - _target_: lightning.pytorch.loggers.TensorBoardLogger
200
+ save_dir: ./exp/train_asr_rnn_data_aug_debug/tensorboard
201
+ name: tb_logger
202
+ strategy: auto
203
+ tokenizer:
204
+ vocab_size: 30
205
+ character_coverage: 1.0
206
+ model_type: bpe
207
+ save_path: ./data/bpe_30
208
+ text_builder:
209
+ func: src.tokenizer.gather_training_text
210
+ manifest_path: ./data/mini_an4/manifest/train_nodev.tsv
211
+ model:
212
+ vocab_size: 30
213
+ token_list: ./data/bpe_30/tokens.txt
214
+ encoder: vgg_rnn
215
+ encoder_conf:
216
+ num_layers: 1
217
+ hidden_size: 2
218
+ output_size: 2
219
+ decoder: rnn
220
+ decoder_conf:
221
+ hidden_size: 2
222
+ normalize: utterance_mvn
223
+ normalize_conf: {}
224
+ model_conf:
225
+ ctc_weight: 0.3
226
+ lsm_weight: 0.1
227
+ length_normalized_loss: false
228
+ frontend: default
229
+ frontend_conf:
230
+ n_fft: 512
231
+ win_length: 400
232
+ hop_length: 160
233
+
234
+ 2026-01-13 22:35:50 | INFO | espnet3 | Infer config content:
235
+ num_device: 1
236
+ num_nodes: 1
237
+ recipe_dir: .
238
+ data_dir: ./data
239
+ exp_tag: train_asr_rnn_data_aug_debug
240
+ exp_dir: ./exp/train_asr_rnn_data_aug_debug
241
+ stats_dir: ./exp/stats
242
+ decode_dir: ./exp/train_asr_rnn_data_aug_debug/decode
243
+ dataset_dir: ./data/mini_an4
244
+ dataset:
245
+ _target_: espnet3.components.data.data_organizer.DataOrganizer
246
+ test:
247
+ - name: test
248
+ dataset:
249
+ _target_: src.dataset.MiniAN4Dataset
250
+ manifest_path: ./data/mini_an4/manifest/test.tsv
251
+ parallel:
252
+ env: local
253
+ n_workers: 1
254
+ model:
255
+ _target_: espnet2.bin.asr_inference.Speech2Text
256
+ asr_train_config: ./exp/train_asr_rnn_data_aug_debug/config.yaml
257
+ asr_model_file: ./exp/train_asr_rnn_data_aug_debug/last.ckpt
258
+ beam_size: 1
259
+ ctc_weight: 0.3
260
+ tokenizer:
261
+ vocab_size: 30
262
+ character_coverage: 1.0
263
+ model_type: bpe
264
+ save_path: ./data/bpe_30
265
+
266
+ 2026-01-13 22:35:50 | INFO | espnet3 | Measure config content:
267
+ recipe_dir: .
268
+ data_dir: ./data
269
+ exp_tag: train_asr_rnn_data_aug_debug
270
+ exp_dir: ./exp/train_asr_rnn_data_aug_debug
271
+ stats_dir: ./exp/stats
272
+ decode_dir: ./exp/train_asr_rnn_data_aug_debug/decode
273
+ dataset_dir: ./data/mini_an4
274
+ dataset:
275
+ _target_: espnet3.components.data.data_organizer.DataOrganizer
276
+ test:
277
+ - name: test
278
+ dataset:
279
+ _target_: src.dataset.MiniAN4Dataset
280
+ manifest_path: ./data/mini_an4/manifest/test.tsv
281
+ metrics:
282
+ - metric:
283
+ _target_: espnet3.systems.asr.metrics.wer.WER
284
+ clean_types: null
285
+ - metric:
286
+ _target_: espnet3.systems.asr.metrics.cer.CER
287
+ clean_types: null
288
+
289
+ 2026-01-13 22:35:50 | INFO | espnet3 | === [START] stage: collect_stats ===
290
+ 2026-01-13 22:35:50 | INFO | espnet3.systems.base.system | Collecting stats | exp_dir=./exp/train_asr_rnn_data_aug_debug stats_dir=./exp/stats
291
+ 2026-01-13 22:35:55 | INFO | root | Vocabulary size: 30
292
+ 2026-01-13 22:35:56 | INFO | espnet3.systems.base.train | Model:
293
+ ESPnetASRModel(
294
+ (frontend): DefaultFrontend(
295
+ (stft): Stft(n_fft=512, win_length=400, hop_length=160, center=True, normalized=False, onesided=True)
296
+ (frontend): Frontend()
297
+ (logmel): LogMel(sr=16000, n_fft=512, n_mels=80, fmin=0, fmax=8000.0, htk=False)
298
+ )
299
+ (normalize): UtteranceMVN(norm_means=True, norm_vars=False)
300
+ (encoder): VGGRNNEncoder(
301
+ (enc): ModuleList(
302
+ (0): VGG2L(
303
+ (conv1_1): Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
304
+ (conv1_2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
305
+ (conv2_1): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
306
+ (conv2_2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
307
+ )
308
+ (1): RNNP(
309
+ (birnn0): LSTM(2560, 2, batch_first=True, bidirectional=True)
310
+ (bt0): Linear(in_features=4, out_features=2, bias=True)
311
+ )
312
+ )
313
+ )
314
+ (decoder): RNNDecoder(
315
+ (embed): Embedding(30, 2)
316
+ (dropout_emb): Dropout(p=0.0, inplace=False)
317
+ (decoder): ModuleList(
318
+ (0): LSTMCell(4, 2)
319
+ )
320
+ (dropout_dec): ModuleList(
321
+ (0): Dropout(p=0.0, inplace=False)
322
+ )
323
+ (output): Linear(in_features=2, out_features=30, bias=True)
324
+ (att_list): ModuleList(
325
+ (0): AttLoc(
326
+ (mlp_enc): Linear(in_features=2, out_features=320, bias=True)
327
+ (mlp_dec): Linear(in_features=2, out_features=320, bias=False)
328
+ (mlp_att): Linear(in_features=10, out_features=320, bias=False)
329
+ (loc_conv): Conv2d(1, 10, kernel_size=(1, 201), stride=(1, 1), padding=(0, 100), bias=False)
330
+ (gvec): Linear(in_features=320, out_features=1, bias=True)
331
+ )
332
+ )
333
+ )
334
+ (criterion_att): LabelSmoothingLoss(
335
+ (criterion): KLDivLoss()
336
+ )
337
+ (ctc): CTC(
338
+ (ctc_lo): Linear(in_features=2, out_features=30, bias=True)
339
+ (ctc_loss): CTCLoss()
340
+ )
341
+ )
342
+ 2026-01-13 22:35:56 | WARNING | py.warnings | /data/user_data/msomeki/espnet3/.venv/lib/python3.11/site-packages/lightning/fabric/plugins/environments/slurm.py:204: The `srun` command is available on your system but is not used. HINT: If your intention is to run Lightning on SLURM, prepend your python command with `srun` like so: srun python3 run.py --stages create_dataset train_tokenizer coll ...
343
+
344
+ 2026-01-13 22:35:56 | INFO | lightning.pytorch.utilities.rank_zero | GPU available: False, used: False
345
+ 2026-01-13 22:35:56 | INFO | lightning.pytorch.utilities.rank_zero | TPU available: False, using: 0 TPU cores
346
+ 2026-01-13 22:35:56 | INFO | lightning.pytorch.utilities.rank_zero | `Trainer(limit_train_batches=1)` was configured so 1 batch per epoch will be used.
347
+ 2026-01-13 22:35:56 | INFO | lightning.pytorch.utilities.rank_zero | `Trainer(limit_val_batches=1)` was configured so 1 batch will be used.
348
+ 2026-01-13 22:35:56 | INFO | root | Vocabulary size: 30
349
+ 2026-01-13 22:35:57 | INFO | root | Vocabulary size: 30
350
+ 2026-01-13 22:35:57 | INFO | espnet3.systems.base.train | Collect stats finished in 6.28s | exp_dir=./exp/train_asr_rnn_data_aug_debug stats_dir=./exp/stats
351
+ 2026-01-13 22:35:57 | INFO | espnet3 | === [DONE] stage: collect_stats (6.28s) ===
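The `LSTM(2560, 2, ...)` input size in the model summary above can be traced back to the 80-bin log-mel frontend and the `VGG2L` block. The printed module lists only the convolutions, so the pooling is an assumption here: if two 2x max-pooling stages each halve the frequency axis (rounding up), 80 bins become 20, and 20 bins times 128 output channels gives 2560. A minimal sketch of that arithmetic (the function name `vgg2l_output_dim` is hypothetical and not taken from ESPnet):

```python
import math


def vgg2l_output_dim(n_mels: int, out_channels: int = 128, n_pools: int = 2) -> int:
    """Flattened feature size per frame after a VGG2L-style block.

    Assumes each pooling stage halves the frequency axis (rounding up)
    and that the channel and frequency axes are flattened together.
    """
    freq = n_mels
    for _ in range(n_pools):
        freq = math.ceil(freq / 2)
    return freq * out_channels


print(vgg2l_output_dim(80))  # 80 -> 40 -> 20 bins, times 128 channels
```

The same reasoning accounts for `Linear(in_features=4, out_features=2)` after the LSTM: a bidirectional LSTM with `hidden_size: 2` emits 2 x 2 = 4 features per frame.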
stats/train/feats_lengths_shape ADDED
@@ -0,0 +1,4 @@
1
+ 0 1
2
+ 1 1
3
+ 2 1
4
+ 3 1
stats/train/feats_lengths_stats.npz ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a4eab3dc1c93ba1f9671895af7f707f245a49e46d85276682b70db5277ffe22e
3
+ size 778
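The `.npz` entries in this commit are Git LFS pointer files rather than the arrays themselves: three `key value` lines giving the spec version, the SHA-256 object ID, and the object size in bytes. A minimal sketch of reading one, assuming the pointer text is already in memory (`parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid is written as "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a4eab3dc1c93ba1f9671895af7f707f245a49e46d85276682b70db5277ffe22e
size 778
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

The real statistics have to be fetched through Git LFS before they can be loaded as NumPy archives.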
stats/train/feats_shape ADDED
@@ -0,0 +1,4 @@
1
+ 0 64,80
2
+ 1 281,80
3
+ 2 101,80
4
+ 3 201,80
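Each row of `feats_shape` reads `<utterance index> <frames>,<feature dim>`; the feature dimension of 80 matches `n_mels=80` in the logged frontend. With `hop_length=160` at 16 kHz and `center=True`, a centered STFT yields `1 + n_samples // hop_length` frames, so 101 frames correspond to about one second of audio and 201 frames to about two. A minimal sketch, assuming that centered-STFT framing is what produced these shapes (the helper names are hypothetical):

```python
HOP_LENGTH = 160  # from frontend_conf in the logged train config
SAMPLE_RATE = 16000  # from preprocessor.fs in the logged train config


def n_frames(n_samples: int, hop_length: int = HOP_LENGTH) -> int:
    """Frame count of a centered STFT (input padded by n_fft // 2 on each side)."""
    return 1 + n_samples // hop_length


def approx_duration_s(frames: int, hop_length: int = HOP_LENGTH,
                      sample_rate: int = SAMPLE_RATE) -> float:
    """Shortest audio duration that would produce `frames` frames."""
    return (frames - 1) * hop_length / sample_rate


for frames in (64, 281, 101, 201):
    print(frames, approx_duration_s(frames))
```

Because the frame count uses integer division, each value maps to a range of `hop_length` possible sample counts; the helper reports the lower end of that range.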
stats/train/feats_stats.npz ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:521d8a7eae978a400b0981eb5c95d11bf354aee36f33c6cd3cc2ac4d4650a27f
3
+ size 1402
stats/train/stats_keys ADDED
@@ -0,0 +1,2 @@
1
+ feats
2
+ feats_lengths
stats/valid/feats_lengths_shape ADDED
@@ -0,0 +1 @@
1
+ 0 1
stats/valid/feats_lengths_stats.npz ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:fb64b8383da5856dc2664e11faf0345352b351775726cde7f18efb3876d04990
3
+ size 778
stats/valid/feats_shape ADDED
@@ -0,0 +1 @@
1
+ 0 101,80
stats/valid/feats_stats.npz ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:14b0a8ef9b16caa293a71b7f0f79f69e305ad4f8015f0d7f97974bdeccb006df
3
+ size 1402
stats/valid/stats_keys ADDED
@@ -0,0 +1,2 @@
1
+ feats
2
+ feats_lengths