| title: "CTC-DRO MMS-based ASR model - set 2" | |
| language: multilingual | |
| tags: | |
| - asr | |
| - ctc-dro | |
| - MMS | |
| license: cc-by-nc-4.0 | |
# CTC-Baseline MMS-based ASR model - set 2

This repository contains a CTC-baseline MMS-based automatic speech recognition (ASR) model trained with ESPnet on balanced training data from set 2.

## Intended Use

This model is intended for ASR. Users can run inference with the provided checkpoint (`valid.loss.best.pth`) and configuration file (`config.yaml`):
```python
import soundfile as sf
from espnet2.bin.asr_inference import Speech2Text

asr_train_config = "ctc-baseline_mms_set_2/config.yaml"
asr_model_file = "ctc-baseline_mms_set_2/valid.loss.best.pth"

# Build the recognizer from the local config and checkpoint.
model = Speech2Text.from_pretrained(
    asr_train_config=asr_train_config,
    asr_model_file=asr_model_file
)

# Decode a waveform; model(speech) returns an n-best list whose
# entries start with the recognized text.
speech, _ = sf.read("input.wav")
text, *_ = model(speech)[0]
print("Recognized text:", text)
```
## How to Use

1. Clone this repository.
2. Use ESPnet’s inference scripts with the provided `config.yaml` and checkpoint file.
3. Ensure any external resources referenced in `config.yaml` are available at the indicated relative paths.
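Before running the inference snippet above, input audio generally needs to match the sample rate the model was trained on. Assuming this model expects 16 kHz mono speech (as the MMS backbone does; check `config.yaml` to confirm), a minimal preprocessing sketch with `scipy` could look like this. The `to_16k_mono` helper is hypothetical, not part of this repository:

```python
# Sketch: downmix and resample audio to 16 kHz mono before inference.
# Assumption: the model expects 16 kHz input, as MMS models typically do.
from math import gcd

import numpy as np
from scipy.signal import resample_poly

def to_16k_mono(speech: np.ndarray, rate: int) -> np.ndarray:
    """Downmix a (samples,) or (samples, channels) array to 16 kHz mono."""
    if speech.ndim == 2:  # soundfile returns (samples, channels) for stereo
        speech = speech.mean(axis=1)
    if rate != 16000:
        g = gcd(rate, 16000)  # reduce the resampling ratio
        speech = resample_poly(speech, 16000 // g, rate // g)
    return speech.astype(np.float32)

# Example: one second of 44.1 kHz audio becomes 16000 samples.
audio = np.zeros(44100, dtype=np.float32)
print(to_16k_mono(audio, 44100).shape)  # (16000,)
```

The resampled array can then be passed directly to `model(speech)` in place of the raw `sf.read` output.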