# ESPnet3 ASR model

Packed model bundle generated from `egs3/mini_an4/asr`.
## Model

- Repository: `ms180/CI_mini_an4_training_asr_transformer`
- Recipe: `egs3/mini_an4/asr`
- Task: asr
- System: `espnet3.systems.asr.task.ASRTask`
- Creator: masao
- Created: 2026-05-11T19:28:10
- Branch: `espnet3/publish_stage`
- Commit: `a0087239784f92b800ee9f12878af6cdb0e10c63` (`a008723978`)
- Worktree: dirty
- Origin: `git@github.com:Masao-Someki/espnet.git`
## Usage

```python
from espnet3.publication import InferenceModel

model = InferenceModel.from_pretrained(
    "ms180/CI_mini_an4_training_asr_transformer", trust_user_code=True
)
result = model(sample)  # `sample` is an input utterance, e.g. a 16 kHz waveform (fs: 16000 in the config)
```
## Packaging

- Bundle: model_pack
- Exp dir: `./exp/training_asr_transformer`
- Strategy: copy experiment outputs; include extra recipe assets; register named artifact files; apply exclude filters
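The "apply exclude filters" step can be pictured as glob matching over bundle-relative paths. A minimal sketch with hypothetical patterns (the real exclude list lives in the packing configuration, not here):

```python
from fnmatch import fnmatch

# Hypothetical exclude globs for illustration only.
EXCLUDE = ["*.ckpt.tmp", "tensorboard/*", "*.log"]

def is_excluded(relpath: str) -> bool:
    """Return True if any exclude glob matches the bundle-relative path."""
    return any(fnmatch(relpath, pat) for pat in EXCLUDE)

print(is_excluded("tensorboard/events.out"))  # True: matched by "tensorboard/*"
print(is_excluded("config.yaml"))             # False: no pattern matches
```

Files for which `is_excluded` returns `True` would simply be skipped while copying the experiment directory into the bundle.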
## Results

| dataset | CER (%) | WER (%) |
|---|---|---|
| test | 213.43 | 100.0 |
| valid | 933.33 | 100.0 |
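CER values above 100% are possible because the error rate counts substitutions, deletions, and insertions against the reference length, so a hypothesis with many insertions can push the rate well past 100. A minimal sketch of the standard metric definition (not ESPnet's scorer itself):

```python
def edit_distance(ref, hyp):
    # Classic dynamic-programming Levenshtein distance over token sequences.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (r != h)))  # substitution / match
        prev = cur
    return prev[-1]

def error_rate(ref_tokens, hyp_tokens):
    # Edits as a percentage of reference length; unbounded above 100%.
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)

# WER is computed over words, CER over characters (toy example):
ref, hyp = "a b c", "a x c d"
print(error_rate(ref.split(), hyp.split()))  # 2 edits / 3 ref words, about 66.7
```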
## Training config

<details><summary>expand</summary>

```yaml
num_device: 1
num_nodes: 1
task: espnet3.systems.asr.task.ASRTask
recipe_dir: .
data_dir: ./data
exp_tag: training_asr_transformer
exp_dir: ./exp/training_asr_transformer
stats_dir: ./exp/stats
dataset_dir: /path/to/your/dataset
create_dataset:
  func: src.creating_dataset.create_dataset
  dataset_dir: /path/to/your/dataset
  recipe_dir: .
dataset:
  _target_: espnet3.components.data.data_organizer.DataOrganizer
  recipe_dir: .
  train:
  - data_src_args:
      split: train
  valid:
  - data_src_args:
      split: valid
  test: null
  preprocessor:
    _target_: espnet2.train.preprocessor.CommonPreprocessor
    fs: 16000
    train: true
    data_aug_effects:
    - - 0.1
      - contrast
      - enhancement_amount: 75.0
    - - 0.1
      - highpass
      - cutoff_freq: 5000
        Q: 0.707
    - - 0.1
      - equalization
      - center_freq: 1000
        gain: 0
        Q: 0.707
    - - 0.1
      - - - 0.3
          - speed_perturb
          - factor: 0.9
        - - 0.3
          - speed_perturb
          - factor: 1.1
        - - 0.3
          - speed_perturb
          - factor: 1.3
    data_aug_num:
    - 1
    - 4
    data_aug_prob: 1.0
    token_type: bpe
    token_list: ./data/bpe_30/tokens.txt
    bpemodel: ./data/bpe_30/bpe.model
    _convert_: all
  _convert_: all
tokenizer:
  vocab_size: 30
  character_coverage: 1.0
  model_type: bpe
  save_path: ./data/bpe_30
  text_builder:
    func: src.tokenizer.gather_training_text
    manifest_path: ./data/manifest/train.tsv
model:
  vocab_size: 30
  token_list: ./data/bpe_30/tokens.txt
  encoder: transformer
  encoder_conf:
    output_size: 2
    attention_heads: 2
    linear_units: 2
    num_blocks: 2
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    attention_dropout_rate: 0.0
    input_layer: conv1d2
    normalize_before: true
  decoder: transformer
  decoder_conf:
    attention_heads: 2
    linear_units: 2
    num_blocks: 2
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    self_attention_dropout_rate: 0.0
    src_attention_dropout_rate: 0.0
  model: espnet
  model_conf:
    ctc_weight: 0.3
    lsm_weight: 0.1
    length_normalized_loss: false
  frontend: default
  frontend_conf:
    n_fft: 512
    win_length: 400
    hop_length: 160
optimizer:
  _target_: torch.optim.Adam
  lr: 0.005
  weight_decay: 1.0e-06
  _convert_: all
scheduler:
  _target_: espnet2.schedulers.warmup_lr.WarmupLR
  warmup_steps: 100
  _convert_: all
scheduler_interval: step
scheduler_monitor: null
best_model_criterion:
- - valid/loss
  - 10
  - min
seed: null
init: xavier_uniform
parallel:
  env: local
  n_workers: 1
  options: {}
dataloader:
  collate_fn:
    _target_: espnet2.train.collate_fn.CommonCollateFn
    int_pad_value: -1
    _convert_: all
  train:
    total_shards: 1
    dist_world_size: 1
    iter_factory:
      _target_: espnet2.iterators.sequence_iter_factory.SequenceIterFactory
      shuffle: true
      collate_fn:
        _target_: espnet2.train.collate_fn.CommonCollateFn
        int_pad_value: -1
        _convert_: all
      batches:
        type: unsorted
        shape_files:
        - ./exp/stats/train/feats_shape
        batch_size: 2
        batch_bins: 4000000
      _convert_: all
  valid:
    total_shards: 1
    dist_world_size: 1
    iter_factory:
      _target_: espnet2.iterators.sequence_iter_factory.SequenceIterFactory
      shuffle: false
      collate_fn:
        _target_: espnet2.train.collate_fn.CommonCollateFn
        int_pad_value: -1
        _convert_: all
      batches:
        type: unsorted
        shape_files:
        - ./exp/stats/valid/feats_shape
        batch_size: 2
        batch_bins: 4000000
      _convert_: all
trainer:
  accelerator: auto
  devices: 1
  num_nodes: 1
  accumulate_grad_batches: 1
  check_val_every_n_epoch: 1
  gradient_clip_val: 1.0
  log_every_n_steps: 100
  max_epochs: 1
  logger:
  - _target_: lightning.pytorch.loggers.TensorBoardLogger
    save_dir: ./exp/training_asr_transformer/tensorboard
    name: tb_logger
    _convert_: all
  strategy: auto
  limit_train_batches: 1
  limit_val_batches: 1
  reload_dataloaders_every_n_epochs: 1
  use_distributed_sampler: false
fit: {}
```

</details>
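The optimizer is Adam at a base learning rate of 0.005, scheduled per step by `espnet2.schedulers.warmup_lr.WarmupLR` with `warmup_steps: 100`. A rough sketch of that schedule, assuming the usual Noam-style form (a linear-ish ramp followed by inverse-square-root decay) rather than the class itself:

```python
def warmup_lr(step, base_lr=0.005, warmup_steps=100):
    # Noam-style warmup sketch: ramps up until `warmup_steps`, then
    # decays proportionally to 1/sqrt(step). Peaks at exactly `base_lr`.
    step = max(step, 1)
    return base_lr * warmup_steps**0.5 * min(step**-0.5, step * warmup_steps**-1.5)

print(warmup_lr(100))  # 0.005 -- the peak, reached at step == warmup_steps
print(warmup_lr(400))  # 0.0025 -- halved after 4x the warmup steps
```

With `scheduler_interval: step`, this rate would be recomputed on every optimizer step rather than once per epoch.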
## Citing ESPnet

```bibtex
@inproceedings{watanabe2018espnet,
  author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and
          Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner
          and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
  title={{ESPnet}: End-to-End Speech Processing Toolkit},
  year={2018},
  booktitle={Proceedings of Interspeech},
  pages={2207--2211},
  doi={10.21437/Interspeech.2018-1456}
}
```