---
license: apache-2.0
language:
- ru
tags:
- automatic-speech-recognition
- speaker-diarization
- onnx
- russian
- asr
- gigaam
- 3d-speaker
- camplus
- eres2net
- mobile
- offline
library_name: onnx
---

# ProtocolVoice ASR Models

ONNX models for offline Russian speech recognition and speaker diarization, packaged for the [ProtocolVoice](https://github.com/protocolvoice) Android app.

## Contents

| File | Size | Purpose | Original source | Original license |
|---|---|---|---|---|
| `gigaam_v3_e2e_ctc_int8.onnx` | 305 MB | Russian ASR with built-in punctuation | [Sber/SaluteDevices GigaAM](https://github.com/salute-developers/GigaAM) (v3, e2e CTC, int8-quantized) | MIT |
| `speaker_embedding_camplus.onnx` | 27 MB | Speaker embedding (CAM++) | [modelscope/3D-Speaker](https://github.com/modelscope/3D-Speaker) | Apache-2.0 |
| `speaker_embedding.onnx` | 111 MB | Speaker embedding (ERes2Net) | [modelscope/3D-Speaker](https://github.com/modelscope/3D-Speaker) | Apache-2.0 |
| `speaker_embedding_v2.onnx` | 68 MB | Speaker embedding (ERes2NetV2) | [modelscope/3D-Speaker](https://github.com/modelscope/3D-Speaker) | Apache-2.0 |
| `manifest.json` | < 1 KB | SHA-256 hashes of all models | this repo | Apache-2.0 |

## Important

These are NOT new models: this repository **redistributes existing models** in ONNX format for convenient mobile delivery. The original authors retain all credit and copyright. We did not train, fine-tune, or modify the model weights.

**Please cite the original projects, not this redistribution:**

- **GigaAM-v3** (ASR): Sber AI, SaluteDevices (https://github.com/salute-developers/GigaAM)
- **3D-Speaker** (CAM++, ERes2Net, ERes2NetV2): ModelScope, Alibaba (https://github.com/modelscope/3D-Speaker)

The ONNX conversions and runtime were prepared via [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) (Apache-2.0).

## Why this redistribution

The ProtocolVoice mobile app needs to download these models on first run from a mirror that:

- supports files larger than 100 MB without git-lfs limits,
- has a fast CDN reachable from Russia,
- is the conventional hosting platform for ML models.

All redistributed files retain their original licenses. This README serves as the required attribution under those licenses.

## How to use

Each model is loaded by [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) on the device. The ProtocolVoice app:

1. Downloads each `.onnx` file over HTTPS from `https://huggingface.co/protocolvoice/asr-models/resolve/main/{filename}`,
2. Verifies its SHA-256 against `manifest.json`,
3. Loads it via sherpa-onnx for offline inference.

You can also use these files directly with sherpa-onnx in any project that respects the original licenses.

## Verifying integrity

```python
import hashlib

# Print the SHA-256 digest of the downloaded ASR model.
with open("gigaam_v3_e2e_ctc_int8.onnx", "rb") as f:
    print(hashlib.sha256(f.read()).hexdigest())
# expected: 0aacb41f70f0f5aaac4b45dd430337b9e16b180f22c72af04db8516e7609c3c0
```

Hashes for all files are in `manifest.json`.

## License

This repository's metadata, README, and packaging scripts are released under **Apache-2.0**. Each model file remains under its original license (see the table above). By using a model, you accept its original license, not just this repository's.

## Removal request

If you are an author of one of the upstream projects and have any concerns about this redistribution (attribution, hosting, anything else), please open a discussion on this Hugging Face repo or email the maintainers; the files will be amended or removed as requested.
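
## Example: download and verify

The download and SHA-256 check described under "How to use" can be sketched in Python roughly as follows. This is an illustrative sketch, not the ProtocolVoice app's actual code. The helper names (`model_url`, `sha256_of_file`, `verify`, `download_and_verify`) are hypothetical, and the `expected` mapping of file name to hex digest is an assumption: the real layout of `manifest.json` is not described here, so adapt the lookup to the actual file.

```python
import hashlib
import urllib.request
from pathlib import Path
from typing import Mapping

# URL template quoted from the "How to use" section.
BASE_URL = "https://huggingface.co/protocolvoice/asr-models/resolve/main/{filename}"


def model_url(filename: str) -> str:
    """Build the download URL for one file in this repository."""
    return BASE_URL.format(filename=filename)


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so a large model is never fully held in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected: Mapping[str, str]) -> bool:
    """Return True if the file's SHA-256 matches the expected digest for its name.

    ASSUMPTION: `expected` maps file name -> hex digest. This is a hypothetical
    shape, not the documented schema of manifest.json.
    """
    want = expected.get(path.name)
    return want is not None and sha256_of_file(path) == want.lower()


def download_and_verify(filename: str, expected: Mapping[str, str], dest_dir: Path) -> Path:
    """Download one file; delete it and raise if its hash does not match."""
    dest = dest_dir / filename
    urllib.request.urlretrieve(model_url(filename), dest)
    if not verify(dest, expected):
        dest.unlink(missing_ok=True)
        raise ValueError(f"SHA-256 mismatch for {filename}")
    return dest
```

For example, calling `download_and_verify("gigaam_v3_e2e_ctc_int8.onnx", expected, Path("."))` with `expected` containing the digest listed under "Verifying integrity" would fetch the ASR model into the current directory and keep it only if the hash matches. Deleting the file on mismatch is a design choice of this sketch, so that a corrupted download is never handed to the inference runtime.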