---
license: apache-2.0
tags:
- eeg
- ecg
- emg
- biosignal
- tokenizer
- neuroscience
- bci
- brain-computer-interface
- vector-quantization
- rvq
- safetensors
- burn
- rust
language:
- en
library_name: neurorvq-rs
pipeline_tag: feature-extraction
---

# NeuroRVQ – Safetensors Weights

Pre-converted [safetensors](https://github.com/huggingface/safetensors) weights for the [NeuroRVQ](https://github.com/KonstantinosBarmpas/NeuroRVQ) multi-scale biosignal tokenizer, ready for use with **[neurorvq-rs](https://github.com/eugenehp/neurorvq-rs)** (pure-Rust inference on [Burn 0.20](https://burn.dev)) or any framework that supports safetensors.

Weights are converted from the official PyTorch `.pt` checkpoints published at [ntinosbarmpas/NeuroRVQ](https://huggingface.co/ntinosbarmpas/NeuroRVQ).

## Model Files

### Tokenizers (encoder → RVQ → decoder)

| File | Modality | Params | Size | Embed | Patch | RVQ |
|------|----------|--------|------|-------|-------|-----|
| [`NeuroRVQ_EEG_tokenizer_v1.safetensors`](NeuroRVQ_EEG_tokenizer_v1.safetensors) | **EEG** | 76.0 M | 304 MB | 200 | 200 | 8 levels |
| [`NeuroRVQ_ECG_tokenizer_v1.safetensors`](NeuroRVQ_ECG_tokenizer_v1.safetensors) | **ECG** | 68.1 M | 272 MB | 40 | 40 | 8 levels |
| [`NeuroRVQ_EMG_tokenizer_v1.safetensors`](NeuroRVQ_EMG_tokenizer_v1.safetensors) | **EMG** | 143.6 M | 574 MB | 200 | 200 | 16 levels |

### Foundation Models (encoder only)

| File | Modality | Params | Size | Depth |
|------|----------|--------|------|-------|
| [`NeuroRVQ_EEG_foundation_model_v1.safetensors`](NeuroRVQ_EEG_foundation_model_v1.safetensors) | **EEG** | 58.6 M | 234 MB | 12 blocks |
| [`NeuroRVQ_EMG_foundation_model_v1.safetensors`](NeuroRVQ_EMG_foundation_model_v1.safetensors) | **EMG** | 111.2 M | 445 MB | 12 blocks |

### Config Flags

| File | Description |
|------|-------------|
| [`flags/NeuroRVQ_EEG_v1.yml`](flags/NeuroRVQ_EEG_v1.yml) | EEG – 103 channels, patch=200, embed=200 |
| [`flags/NeuroRVQ_ECG_v1.yml`](flags/NeuroRVQ_ECG_v1.yml) | ECG – 15 channels, patch=40, embed=40 |
| [`flags/NeuroRVQ_EMG_v1.yml`](flags/NeuroRVQ_EMG_v1.yml) | EMG – 16 channels, patch=200, embed=200 |

## Quick Start – Rust

```bash
# Install
cargo add neurorvq-rs

# Download weights + config
huggingface-cli download eugenehp/NeuroRVQ \
  NeuroRVQ_EEG_tokenizer_v1.safetensors \
  flags/NeuroRVQ_EEG_v1.yml \
  --local-dir weights/

# Run tokenization
cargo run --release --bin infer -- \
  --config weights/flags/NeuroRVQ_EEG_v1.yml \
  --weights weights/NeuroRVQ_EEG_tokenizer_v1.safetensors
```

### Library API

```rust
use neurorvq_rs::{NeuroRVQEncoder, Modality, data, channels};
use std::path::Path;

// `B` is any Burn backend; `device` and `batch` are prepared by the caller.
let (model, _ms) = NeuroRVQEncoder::<B>::load_with_modality(
    Path::new("flags/NeuroRVQ_EEG_v1.yml"),
    Path::new("NeuroRVQ_EEG_tokenizer_v1.safetensors"),
    Modality::EEG,
    device,
)?;

// Tokenize → 4 branches × 8 RVQ levels of discrete indices
let tokens = model.tokenize(&batch)?;
for (br, levels) in tokens.branch_tokens.iter().enumerate() {
    for (lv, indices) in levels.iter().enumerate() {
        println!("Branch {} Level {}: {} tokens", br, lv, indices.len());
    }
}
```

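One discrete index is emitted per patch per RVQ level, so the counts printed by the loop above are easy to predict from the config. A minimal sketch of that bookkeeping (the one-token-per-patch-per-level accounting is our assumption; the patch sizes and level counts come from the tables above):

```python
def token_counts(n_channels: int, n_samples: int, patch: int, levels: int) -> dict:
    """Patch/token counts for one branch, assuming one token per patch per RVQ level."""
    assert n_samples % patch == 0, "signal length must be a multiple of the patch size"
    patches = n_channels * (n_samples // patch)   # tokens per RVQ level
    return {"patches": patches, "tokens_per_branch": patches * levels}

# EEG: patch=200, 8 RVQ levels; 4 channels of 64 patches each
print(token_counts(n_channels=4, n_samples=64 * 200, patch=200, levels=8))
# {'patches': 256, 'tokens_per_branch': 2048}
```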
### Foundation Model API

```rust
use neurorvq_rs::{NeuroRVQFoundationModel, Modality};
use std::path::Path;

let (fm, _ms) = NeuroRVQFoundationModel::<B>::load(
    Path::new("flags/NeuroRVQ_EEG_v1.yml"),
    Path::new("NeuroRVQ_EEG_foundation_model_v1.safetensors"),
    Modality::EEG,
    device,
)?;

let features = fm.encode(&batch)?;       // 4 branch feature vectors
let pooled = fm.encode_pooled(&batch)?;  // Mean-pooled for classification
```

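`encode_pooled` reduces per-patch features to one vector per input by averaging over patches; a pure-Python sketch of that reduction (shapes are illustrative, not the crate's actual types):

```python
def mean_pool(features: list[list[float]]) -> list[float]:
    """Average a [patches][dim] feature matrix into a single [dim] vector."""
    n = len(features)
    return [sum(f[d] for f in features) / n for d in range(len(features[0]))]

print(mean_pool([[1.0, 2.0], [3.0, 4.0]]))  # [2.0, 3.0]
```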
## Quick Start – Python

```python
from safetensors.torch import load_file

# `model` is a NeuroRVQ instance built from the reference PyTorch implementation
state_dict = load_file("NeuroRVQ_EEG_tokenizer_v1.safetensors")
model.load_state_dict(state_dict, strict=False)
```

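Before loading, it can help to inspect tensor names and shapes without instantiating anything: a safetensors file begins with an 8-byte little-endian length followed by a JSON header mapping tensor names to dtype, shape, and byte offsets. A stdlib-only sketch (the tensor name and demo file here are made up for illustration):

```python
import json
import struct

def read_safetensors_header(path: str) -> dict:
    """Parse the JSON header of a .safetensors file (no torch required)."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # 8-byte LE length prefix
        return json.loads(f.read(header_len))

# Build a tiny demo file: one F32 tensor of shape [2, 2] (16 bytes of zeros)
header = {"encoder.weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]}}
blob = json.dumps(header).encode("utf-8")
with open("demo.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(blob)) + blob + b"\x00" * 16)

print(read_safetensors_header("demo.safetensors")["encoder.weight"]["shape"])  # [2, 2]
```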
## Architecture

```
Raw Signal [B, N, T]
                   │
                   ▼
┌──────────────────────────────────────┐
│ Multi-Scale Temporal Conv            │  4 parallel branches
│   EEG/ECG: k=21,15,9,5               │  modality-specific kernels
│   EMG:     k=51,17,8,5               │
└──────────────────────────────────────┘
                   │ ×4 branches
                   ▼
┌──────────────────────────────────────┐
│ Transformer Encoder                  │  12 blocks, 10 heads
│ + spatial / temporal pos. embeds     │  shared weights across branches
└──────────────────────────────────────┘
                   │ ×4 branches
                   ▼
┌──────────────────────────────────────┐
│ Encode Heads                         │  Linear → Tanh → Linear
│ embed_dim → code_dim (128)           │
└──────────────────────────────────────┘
                   │ ×4 branches
                   ▼
┌──────────────────────────────────────┐
│ Residual Vector Quantization         │  8 levels (EEG/ECG)
│ L2-norm codebook lookup              │  16 levels (EMG)
│ codebook: 8192 × 128                 │
└──────────────────────────────────────┘
                   │ ×4 branches → discrete token indices
                   ▼
┌──────────────────────────────────────┐
│ Transformer Decoder                  │  3 blocks
│ per-branch PatchEmbed (1×1 conv)     │
└──────────────────────────────────────┘
                   │ concat 4 branches
                   ▼
┌──────────────────────────────────────┐
│ Decode Heads                         │  Amplitude (GELU)
│ 4×embed_dim → decoder_out_dim        │  Sin/Cos phase (Tanh)
└──────────────────────────────────────┘
                   │
                   ▼
Inverse FFT → Reconstructed Signal
```

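The RVQ stage can be sketched in a few lines: each level quantizes the residual left over by the previous level against its own codebook, so later levels add progressively finer detail. A toy pure-Python version (tiny 4 × 2 codebooks and plain L2 distance; the real model uses 8192 × 128 codebooks with an L2-normalized lookup):

```python
def nearest(codebook, v):
    """Index of the codebook row closest to v in squared L2 distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)))

def rvq_encode(codebooks, v):
    """Quantize v level by level; each codebook encodes the previous residual."""
    residual = list(v)
    indices = []
    for cb in codebooks:                      # one codebook per RVQ level
        i = nearest(cb, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, cb[i])]
    return indices, residual                  # residual shrinks as levels are added

codebooks = [
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],      # level 1: coarse
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [0.25, 0.25]],  # level 2: finer
]
indices, residual = rvq_encode(codebooks, [1.2, 0.3])
print(indices)  # [1, 3]
```

Decoding sums the selected codewords across levels, which is why each extra level tightens the reconstruction.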
## Numerical Parity (Rust vs Python)

Verified against the official PyTorch reference implementation:

| Layer | Max Abs Error | Notes |
|-------|:---:|-------|
| Encoder features | < 8 × 10⁻³ | 12 transformer layers, f32 accumulation |
| Encode heads | < 2 × 10⁻³ | After Tanh squashing |
| RVQ quantized vectors | ≈ 0 ¹ | Exact with random-init codebooks |
| Token indices | **99.3%** exact ² | Pretrained weights |
| Decode outputs | < 8 × 10⁻¹ ¹ | Dominated by ≤0.7% boundary tokens |

¹ Differences stem from the ≤0.7% of tokens near codebook decision boundaries – a natural consequence of f32 arithmetic differences between frameworks.

² With random-init weights: **100%** match (all "mismatches" resolve to identical codebook vectors, i.e., ties).

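The boundary effect in footnote ¹ is easy to reproduce: a vector near the midpoint of two codewords flips its nearest-neighbor index under a perturbation on the order of f32 rounding (toy numbers below, not the actual codebooks):

```python
def nearest(codebook, v):
    """Index of the codebook row closest to v in squared L2 distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)))

codebook = [[0.0, 0.0], [1.0, 0.0]]
v = [0.5 + 1e-7, 0.0]                          # just past the decision boundary at 0.5
print(nearest(codebook, v))                    # 1
print(nearest(codebook, [v[0] - 2e-7, v[1]]))  # 0: a rounding-scale shift flips the token
```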
## Benchmarks

**Platform:** Apple M4 Pro, 64 GB RAM, macOS 15 (arm64)

### Tokenize Latency – All Backends

| Configuration | Modality | PyTorch CPU | Rust NdArray | Rust wgpu (GPU) |
|---|:---:|---:|---:|---:|
| EEG 4ch × 64t | EEG | **179 ms** | 661 ms | 51 ms |
| EEG 8ch × 32t | EEG | **180 ms** | 662 ms | 60 ms |
| EEG 16ch × 16t | EEG | **180 ms** | 664 ms | 62 ms |
| EEG 32ch × 8t | EEG | **178 ms** | 664 ms | 65 ms |
| EEG 64ch × 4t | EEG | **179 ms** | 664 ms | 68 ms |
| ECG 4ch × 150t | ECG | **272 ms** | 1881 ms | 92 ms |
| ECG 8ch × 75t | ECG | **273 ms** | 1874 ms | 92 ms |
| ECG 12ch × 50t | ECG | **272 ms** | 1877 ms | 93 ms |
| ECG 15ch × 40t | ECG | **272 ms** | 1878 ms | 93 ms |
| EMG 4ch × 64t | EMG | **255 ms** | 998 ms | 90 ms |
| EMG 8ch × 32t | EMG | **255 ms** | 998 ms | 88 ms |
| EMG 16ch × 16t | EMG | **254 ms** | 1001 ms | 90 ms |

### Tokenize Latency: NdArray vs wgpu vs PyTorch

![tokenize NdArray vs wgpu vs PyTorch](assets/bench_tokenize_3backends.png)

### Encode Latency: NdArray vs wgpu vs PyTorch

![encode NdArray vs wgpu vs PyTorch](assets/bench_encode_3backends.png)

### Rust – Tokenize Latency by Configuration

![tokenize latency](assets/bench_tokenize.png)

### Rust – EEG Scaling by Channel Count

![EEG scaling](assets/bench_eeg_scaling.png)

### Rust – Model Construction Time

![construction time](assets/bench_construction.png)

### Backend Comparison Summary

| Comparison | Result |
|---|---|
| **wgpu vs NdArray** | wgpu is **~12× faster** (GPU acceleration) |
| **wgpu vs PyTorch CPU** | wgpu is **~3× faster** for EEG/EMG/ECG |
| **NdArray vs PyTorch CPU** | PyTorch is **~3.7× faster** (optimized BLAS) |

### Key Observations

- **wgpu (GPU) is the fastest backend**: 51–93 ms across all configurations
- **PyTorch CPU** uses Apple Accelerate/AMX BLAS and fused operators, making it faster than Rust NdArray on CPU
- **Latency scales with total patch count**, not the channel/time decomposition: EEG (256 patches) < EMG (256 patches, 16 RVQ levels) < ECG (600 patches)
- **Construction time** is ~13 ms (warm) / ~54 ms (cold start for EMG with larger kernels)
- **Standard deviation < 1%**: highly stable inference latency

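The patch-count ordering above follows directly from the benchmark shapes (channels × time patches):

```python
# Total patches per configuration = channels × time patches (from the latency table)
patches = {
    "EEG 4ch x 64t": 4 * 64,     # 256
    "EMG 16ch x 16t": 16 * 16,   # 256 (but 16 RVQ levels instead of 8)
    "ECG 4ch x 150t": 4 * 150,   # 600
}
print(sorted(patches.items(), key=lambda kv: kv[1]))
```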
### Why Rust?

| | Python + PyTorch | Rust + Burn |
|---|---|---|
| Dependencies | pip, torch, numpy, einops, ... | Zero (single static binary) |
| GPU support | CUDA, MPS | wgpu (Metal, Vulkan, WebGPU) |
| Deployment | Interpreter + venv | Single binary, WASM, embedded |
| Memory | GC pauses | Deterministic, no GC |
| Latency (GPU) | – | **51–93 ms** (wgpu Metal) |

## Conversion

These weights were converted from the official `.pt` files:

```python
import torch
from safetensors.torch import save_file

# Load the original checkpoint and keep only floating-point tensors
state_dict = torch.load("model.pt", map_location="cpu")
converted = {k: v.float().contiguous() for k, v in state_dict.items()
             if isinstance(v, torch.Tensor) and v.is_floating_point()}
save_file(converted, "model.safetensors")
```

Or use the included script:

```bash
python scripts/convert_pt_to_safetensors.py \
  --input NeuroRVQ_EEG_tokenizer_v1.pt \
  --output NeuroRVQ_EEG_tokenizer_v1.safetensors
```

## Citation

```bibtex
@article{barmpas2024neurorvq,
  title={NeuroRVQ: Joint Neurophysiological Multi-Scale Temporal Tokenization and Reconstruction},
  author={Barmpas, Konstantinos and others},
  year={2024}
}
```

## License

Apache-2.0 – the same as the original NeuroRVQ release.

## Links

| | |
|---|---|
| **Rust crate** | [github.com/eugenehp/neurorvq-rs](https://github.com/eugenehp/neurorvq-rs) |
| **Original weights** | [huggingface.co/ntinosbarmpas/NeuroRVQ](https://huggingface.co/ntinosbarmpas/NeuroRVQ) |
| **Paper / Code** | [github.com/KonstantinosBarmpas/NeuroRVQ](https://github.com/KonstantinosBarmpas/NeuroRVQ) |
| **Burn framework** | [burn.dev](https://burn.dev) |