@@ -54,6 +54,35 @@ factor (higher is faster than realtime).
 - `bf16` GGUFs are ~12 % smaller with identical quality and speed; pick `f32`
   unless download size matters.
 ## Files in this repository
 | File | Size | Model |
@@ -61,6 +90,8 @@ factor (higher is faster than realtime).
 | `localvqe-v1.4-aec-200K-f32.gguf` | 3 MB | v1.4-AEC (echo only) |
 | `localvqe-v1.4-aec-200K-bf16.gguf` | 2.6 MB | v1.4-AEC, conv weights in BF16 |
 | `localvqe-v1.4-aec-2.7K-f32.gguf` | 17 KB | v1.4-AEC front-end only (adaptive filter, no mask) |
 | `localvqe-v1.3-4.8M-f32.gguf` | 19 MB | v1.3 joint — GGUF the engine loads |
 | `localvqe-v1.3-4.8M.pt` | 55 MB | v1.3 joint — PyTorch checkpoint (research) |
 | `localvqe-v1.2-1.3M-f32.gguf` | 5 MB | v1.2 joint — GGUF |
@@ -176,6 +207,21 @@ button produces APA / BibTeX), and the upstream DeepVQE paper:
 }
 ```
 ## Dataset attribution
 Weights are trained on the

 - `bf16` GGUFs are ~12 % smaller with identical quality and speed; pick `f32`
   unless download size matters.
+### Compact line — GTCRN-AEC (for lower-power CPUs)
+A second, much smaller line of models targets lower-power CPUs. It pairs a
+~49 K-parameter **GTCRN-AEC** network, a distinct architecture based on
+[GTCRN](https://github.com/Xiaobin-Rong/gtcrn) (Rong et al., ICASSP 2024),
+with the project's DSP echo-cancellation front-end. The GGUFs are
+self-contained, so they run with the same single command as every other model.
+Two variants share the architecture:
+| Model | Does | Params |
+|---|---|---:|
+| **localvqe-pi-v1-49k** | AEC + NS + dereverb (full enhance) | 49 K |
+| **localvqe-pi-aec-v1-49k** | echo only — keeps noise + room | 49 K |
+Whole-clip real-time factor (RTF, processing time divided by audio duration),
+measured on the real ggml graph with `test_gtcrn --bench` on a Raspberry Pi 5
+(Cortex-A76, Ubuntu 24.04), one example of a low-power target. On-device
+output is parity-verified against the PyTorch reference to within ~1e-6.
+RTF is identical for both variants:
+| Threads | Time for 8 s clip | RTF (lower is faster) | RT factor (higher is faster) |
+|--:|--:|--:|--:|
+| 1 | 388 ms | 0.048 | ~21× |
+| 2 | 219 ms | 0.027 | ~37× |
+| 4 | 163 ms | 0.020 | ~49× |
+Single-threaded, that is ~0.78 ms of compute per 16 ms hop. The models run on any
+CPU; for single-board ARM, cross-compile for aarch64 with
+`ggml/docker/Dockerfile.arm64` (docker buildx + qemu). Only `f32` GGUFs are
+listed for this line; `f16`/`q8` quantizations will be added if and when they are released.
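As a quick sanity check, the relationship between the columns of the benchmark table and the ~0.78 ms per-hop figure can be reproduced in a few lines of Python. This is only an illustrative sketch: the helper names `rtf` and `ms_per_hop` are made up here and are not part of LocalVQE, and the timings are copied by hand from the table above.

```python
# Timings copied by hand from the benchmark table (8 s clip, Raspberry Pi 5).
CLIP_SECONDS = 8.0
HOP_MS = 16.0  # hop size quoted in the text

# threads -> wall-clock time to process the whole clip, in milliseconds
timings_ms = {1: 388.0, 2: 219.0, 4: 163.0}


def rtf(processing_ms, clip_seconds=CLIP_SECONDS):
    """Real-time factor: processing time divided by audio duration."""
    return processing_ms / (clip_seconds * 1000.0)


def ms_per_hop(processing_ms, clip_seconds=CLIP_SECONDS, hop_ms=HOP_MS):
    """Average compute time spent on each hop of audio."""
    hops = clip_seconds * 1000.0 / hop_ms
    return processing_ms / hops


for threads, ms in timings_ms.items():
    r = rtf(ms)
    # 1 / RTF is the "RT factor" column: how many times faster than real time.
    print(f"{threads} thread(s): RTF {r:.4f}, "
          f"{1.0 / r:.1f}x real time, {ms_per_hop(ms):.3f} ms per hop")
```

An RTF below 1.0 means the clip is processed faster than it plays back. Per-hop time is the more useful number for streaming use, since each hop has to finish well inside its 16 ms budget.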
 ## Files in this repository
 | File | Size | Model |
 | `localvqe-v1.4-aec-200K-f32.gguf` | 3 MB | v1.4-AEC (echo only) |
 | `localvqe-v1.4-aec-200K-bf16.gguf` | 2.6 MB | v1.4-AEC, conv weights in BF16 |
 | `localvqe-v1.4-aec-2.7K-f32.gguf` | 17 KB | v1.4-AEC front-end only (adaptive filter, no mask) |
+| `localvqe-pi-v1-49k-f32.gguf` | 2.3 MB | Compact line — GTCRN-AEC full enhance (echo + NS + dereverb) |
+| `localvqe-pi-aec-v1-49k-f32.gguf` | 2.3 MB | Compact line — GTCRN-AEC echo-only (keeps noise + room) |
 | `localvqe-v1.3-4.8M-f32.gguf` | 19 MB | v1.3 joint — GGUF the engine loads |
 | `localvqe-v1.3-4.8M.pt` | 55 MB | v1.3 joint — PyTorch checkpoint (research) |
 | `localvqe-v1.2-1.3M-f32.gguf` | 5 MB | v1.2 joint — GGUF |
 }
 ```
+The compact GTCRN-AEC line is based on **GTCRN** — please also cite:
+```bibtex
+@inproceedings{rong2024gtcrn,
+  title     = {GTCRN: A Speech Enhancement Model Requiring Ultralow
+               Computational Resources},
+  author    = {Rong, Xiaobin and Sun, Tianchi and Zhang, Xu and Hu, Yuxiang
+               and Zhu, Changbao and Lu, Jing},
+  booktitle = {ICASSP 2024 - 2024 IEEE International Conference on Acoustics,
+               Speech and Signal Processing (ICASSP)},
+  pages     = {971--975},
+  year      = {2024},
+  doi       = {10.1109/ICASSP48485.2024.10448310}
+}
+```
 ## Dataset attribution
 Weights are trained on the