Banafo committed on
Commit d613389 · verified · 1 Parent(s): 4ade228

Update README.md

Files changed (1)
  1. README.md +66 -20
README.md CHANGED
@@ -13,36 +13,82 @@ metrics:
  - cer
  pipeline_tag: automatic-speech-recognition
  ---
- # Model Card for Model ID
 
- <!-- Provide a quick summary of what the model is/does. -->
 
  >
- >
- **(Update, September 2025: CC-BY-SA models were just uploaded. The new ones (with the .data extension) are CC-BY-SA licensed; the .onnx models are still non-commercial only. GitHub and README updates coming soon.)**
 
- ## Overview
- This is a family of low-latency streaming models designed for use on edge devices.
- **Goal**: Provide faster or higher-quality performance compared to similarly sized Whisper and other models.
 
- - **Languages**: English, French, German (7 more languages coming).
 
  ## Demos
- - [**Browser Demo (CPU)**](https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm)
- *(Runs entirely in the browser using CPU.)*
- - [**Gradio / Python Demo**](https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Python)
 
  ## License
- The license is still under consideration (likely Coqui). The model is intended to be **dual-licensed**:
- - **Free for non-commercial use**.
- - **Affordable license for commercial use**.
 
 
 
- ## Training
- - Training is done with a modified k2/Icefall pipeline.
- - Inference can be performed with the standard Sherpa project.
- - Silence padding and volume normalization may help produce better results.
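A minimal sketch of what the silence-padding and volume-normalization tip could look like in practice, assuming the audio is already decoded to mono float samples in the range [-1.0, 1.0]. The function names, the 0.5 s pad length, and the 0.9 target peak are illustrative assumptions, not values taken from this README or from Sherpa. Peak normalization is used here as one simple form of volume normalization.

```python
def pad_with_silence(samples, sample_rate, pad_seconds=0.5):
    """Return a new list with zero-valued samples added before and after the audio.

    pad_seconds is an assumed default; tune it for your recordings.
    """
    pad = [0.0] * int(sample_rate * pad_seconds)
    return pad + list(samples) + pad


def peak_normalize(samples, target_peak=0.9):
    """Scale the samples so the loudest one has absolute value target_peak.

    All-zero input is returned unchanged to avoid dividing by zero.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]
```

A typical order would be to normalize first and pad afterwards, for example `pad_with_silence(peak_normalize(samples), 16000)`, and then hand the result to whichever recognizer you use.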
 
- ## Acknowledgements
- Special thanks to the [Lhotse](https://github.com/lhotse-speech/lhotse), [Sherpa](https://github.com/k2-fsa/sherpa), [k2](https://github.com/k2-fsa/k2), and [Icefall](https://github.com/k2-fsa/icefall) teams for their support and tools.

  - cer
  pipeline_tag: automatic-speech-recognition
  ---
 
+ # Welcome to Kroko 👋
 
+ ## **Open-source speech recognition built for developers.**
  >
+ > Our engine is fully open-source, and you choose how to deploy models: use our **CC-BY-SA licensed community models** or upgrade to **commercial models** with premium performance. We focus on building **fast, high-quality production models** and providing **examples that take the guesswork out** of integration.
 
+ ## Why Kroko ASR?
 
+ - ⚡ **Fast & lightweight** – optimized Zipformer models (Whisper- and Parakeet-style models coming).
+ - 🧩 **Flexible licensing** – use **fully open-source CC-BY-SA community models** or integrate **commercial/OEM models** for premium accuracy.
+ - 🌍 **Runs anywhere** – cross-platform, with support for many programming languages.
+ - 📱 **Mobile & web ready** – works on Android (iOS coming soon), in the browser via WASM, and with WebSockets for streaming.
+ - 🧰 **Production focus** – we prioritize real-world performance, stability, and examples.
+ - 🤝 **Customizable** – bring your own model, fine-tune for domain-specific vocabularies, or commission us.
+
+ > Our mission: **fast, high-quality ASR with licensing that works for both open-source and closed-source projects.**
 
  ## Demos
+
+ ### ▶️ Android App
+ Run speech recognition **natively on your phone** using ONNX Runtime.
+ - [Kroko ASR Model Explorer](https://play.google.com/store/apps/details?id=com.krokoasr.demo&hl=en)
+
+ ### 🌐 Browser (WASM)
+ Experience transcription **directly in your browser**, no server required.
+ - [Hugging Face Spaces Demo](https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm)
+
+ ## Models
+
+ Kroko ASR follows a **unique dual-model strategy**:
+
+ ### 1. Community Models (free, open-source)
+
+ - Licensed under **CC-BY-SA**.
+ - Low-latency, lightweight models.
+ - Perfect for hobby projects, research, or free tiers.
+ - Faster and smaller than Whisper/Parakeet in many scenarios.
+
+ ### 2. Commercial & OEM Models
+
+ - Premium accuracy and robustness.
+ - Licensed for professional and production products.
+ - Designed for SaaS, dev tools, and enterprise integration.
+
+ ### 3. Bring, Train, or Commission Your Own
+
+ - **DIY:** Use our training guides to build and distribute your own models.
+ - **Professional services:** Work with us to create fine-tuned models for accents, jargon, or specialized domains.
+
+ > This gives you **full freedom**: start free, scale commercially, or roll your own.
+
+ ## Our Community
+
+ Join the Kroko community to learn, share, and contribute:
+
+ - 💬 **[Discord](https://discord.gg/JT7wdtnK79)** – chat with developers, ask questions, and share projects.
+ - 📢 **[Reddit](https://www.reddit.com/r/kroko_ai/)** – join discussions, showcase your integrations, and follow updates.
+ - 🤗 **[Hugging Face](https://huggingface.co/Banafo/Kroko-ASR)** – explore our models, try live demos, and contribute feedback.
+
+ ## Contributing
+
+ PRs welcome! Run `ruff`, `black`, and `pytest` before submitting.
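One way to run the three tools named above in a single step is a small helper script. This is only a sketch: the README names the tools but not their arguments, so the exact invocations (`ruff check .`, `black --check .`, plain `pytest`) and the `run_checks` helper are assumptions, not part of the project.

```python
import subprocess

# Assumed invocations; the README only names the tools, not their flags.
CHECKS = [
    ["ruff", "check", "."],
    ["black", "--check", "."],
    ["pytest"],
]


def run_checks(checks=CHECKS, runner=subprocess.run):
    """Run each command in order and stop at the first failure.

    Returns the first non-zero exit code, or 0 if every check passes.
    The runner argument exists so the function can be exercised without
    the tools installed; by default it is subprocess.run.
    """
    for cmd in checks:
        print("$ " + " ".join(cmd))
        result = runner(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling `sys.exit(run_checks())` from the repository root would then give a single pass/fail exit code, which also makes the helper usable as a pre-push hook.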
+
+ ---
 
  ## License
 
+ The engine is licensed under Apache-2.0. Models are licensed separately (CC-BY-SA community models or commercial/OEM models).
+
+ ---
 
+ ## Credits
 
+ Kroko ASR is built on top of [**Sherpa-ONNX**](https://k2-fsa.github.io/sherpa/).
 
+ ⚠️ **Note:** Kroko ASR is an independent project and is **not affiliated with Sherpa-ONNX**. We build on their excellent open-source engine, but our models, demos, and packaging are developed and maintained separately.
+
+ ---