sheldonrobinson
/

Kroko-ASR

Automatic Speech Recognition


	---
	license: other
	license_name: test
	license_link: LICENSE
	language:
	- en
	- fr
	- de
	- es
	- pt
	metrics:
	- accuracy
	- cer
	pipeline_tag: automatic-speech-recognition
	---
	# Model Card for Kroko-ASR


	> (Update, August 2025: CC-BY models are coming soon.)

	## Overview
	This is a family of low-latency streaming speech recognition models designed to run on edge devices.
	Goal: faster inference or higher transcription quality than similarly sized Whisper and other models.

	- Languages: English, French, German (7 more languages coming).

	## Demos
	- [Browser Demo (CPU)](https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm): runs entirely in the browser on the CPU.
	- [Gradio / Python Demo](https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Python)

	## License
	The license has not been finalized (likely Coqui). The models are intended to be dual-licensed:
	- Free for non-commercial use.
	- Affordable license for commercial use.



	## Training and Inference
	- The models are trained with a modified k2/Icefall pipeline.
	- Inference works with the standard Sherpa project.
	- Padding the audio with silence and normalizing its volume may improve results.
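
The preprocessing tip above can be sketched in a few lines of dependency-free Python. This is an illustration, not the authors' pipeline: the function names, the 0.9 target peak, the 0.5 s of trailing silence, and the 16 kHz sample rate in the usage line are all assumptions chosen for the example, and peak normalization stands in for whichever volume normalization works best for your audio. Samples are assumed to be mono floats in the range [-1.0, 1.0].

```python
from typing import List, Sequence


def normalize_peak(samples: Sequence[float], target_peak: float = 0.9) -> List[float]:
    """Scale the signal so its largest absolute sample equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Empty or all-silent input: there is nothing to scale.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


def pad_with_silence(
    samples: Sequence[float],
    sample_rate: int,
    lead_seconds: float = 0.0,
    tail_seconds: float = 0.5,
) -> List[float]:
    """Add zero-valued samples before and after the signal."""
    lead = [0.0] * int(round(lead_seconds * sample_rate))
    tail = [0.0] * int(round(tail_seconds * sample_rate))
    return lead + list(samples) + tail


def preprocess(
    samples: Sequence[float],
    sample_rate: int,
    target_peak: float = 0.9,
    lead_seconds: float = 0.0,
    tail_seconds: float = 0.5,
) -> List[float]:
    """Normalize the volume first, then pad, so the padding stays exactly zero."""
    normalized = normalize_peak(samples, target_peak)
    return pad_with_silence(normalized, sample_rate, lead_seconds, tail_seconds)


# Usage: three samples of (hypothetical) 16 kHz audio.
processed = preprocess([0.1, -0.2, 0.05], sample_rate=16000)
```

The result can then be passed to whichever recognizer you use. For long recordings, the same two steps are more efficient on NumPy arrays than on Python lists.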

	## Acknowledgements
	Special thanks to the [Lhotse](https://github.com/lhotse-speech/lhotse), [Sherpa](https://github.com/k2-fsa/sherpa), [k2](https://github.com/k2-fsa/k2), and [Icefall](https://github.com/k2-fsa/icefall) teams for their support and tools.