---
license: apache-2.0
tags:
- eeg
- ecg
- emg
- biosignal
- tokenizer
- neuroscience
- bci
- brain-computer-interface
- vector-quantization
- rvq
- safetensors
- burn
- rust
language:
- en
library_name: neurorvq-rs
pipeline_tag: feature-extraction
---
# NeuroRVQ Safetensors Weights
Pre-converted [safetensors](https://github.com/huggingface/safetensors) weights for the [NeuroRVQ](https://github.com/KonstantinosBarmpas/NeuroRVQ) multi-scale biosignal tokenizer, ready for use with **[neurorvq-rs](https://github.com/eugenehp/neurorvq-rs)** (pure-Rust inference on [Burn 0.20](https://burn.dev)) or any framework that supports safetensors.
Weights are converted from the official PyTorch `.pt` checkpoints published at [ntinosbarmpas/NeuroRVQ](https://huggingface.co/ntinosbarmpas/NeuroRVQ).
## Model Files
### Tokenizers (encoder → RVQ → decoder)
| File | Modality | Params | Size | Embed dim | Patch size | RVQ |
|------|----------|--------|------|-----------|------------|-----|
| [`NeuroRVQ_EEG_tokenizer_v1.safetensors`](NeuroRVQ_EEG_tokenizer_v1.safetensors) | **EEG** | 76.0 M | 304 MB | 200 | 200 | 8 levels |
| [`NeuroRVQ_ECG_tokenizer_v1.safetensors`](NeuroRVQ_ECG_tokenizer_v1.safetensors) | **ECG** | 68.1 M | 272 MB | 40 | 40 | 8 levels |
| [`NeuroRVQ_EMG_tokenizer_v1.safetensors`](NeuroRVQ_EMG_tokenizer_v1.safetensors) | **EMG** | 143.6 M | 574 MB | 200 | 200 | 16 levels |
### Foundation Models (encoder only)
| File | Modality | Params | Size | Depth |
|------|----------|--------|------|-------|
| [`NeuroRVQ_EEG_foundation_model_v1.safetensors`](NeuroRVQ_EEG_foundation_model_v1.safetensors) | **EEG** | 58.6 M | 234 MB | 12 blocks |
| [`NeuroRVQ_EMG_foundation_model_v1.safetensors`](NeuroRVQ_EMG_foundation_model_v1.safetensors) | **EMG** | 111.2 M | 445 MB | 12 blocks |
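As a sanity check, the listed file sizes follow directly from f32 storage: bytes ≈ params × 4. A quick sketch, using the figures from the tables above:

```python
# File size follows from f32 storage: one parameter = 4 bytes,
# so M params -> MB (decimal megabytes) is a plain factor of 4.
params_millions = {
    "EEG tokenizer": 76.0, "ECG tokenizer": 68.1, "EMG tokenizer": 143.6,
    "EEG foundation": 58.6, "EMG foundation": 111.2,
}
sizes_mb = {name: p * 4 for name, p in params_millions.items()}
print(sizes_mb["EEG tokenizer"])  # 304.0, matching the table
```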
### Config Flags
| File | Description |
|------|-------------|
| [`flags/NeuroRVQ_EEG_v1.yml`](flags/NeuroRVQ_EEG_v1.yml) | EEG – 103 channels, patch=200, embed=200 |
| [`flags/NeuroRVQ_ECG_v1.yml`](flags/NeuroRVQ_ECG_v1.yml) | ECG – 15 channels, patch=40, embed=40 |
| [`flags/NeuroRVQ_EMG_v1.yml`](flags/NeuroRVQ_EMG_v1.yml) | EMG – 16 channels, patch=200, embed=200 |
## Quick Start – Rust
```bash
# Install
cargo add neurorvq-rs

# Download weights + config
huggingface-cli download eugenehp/NeuroRVQ \
  NeuroRVQ_EEG_tokenizer_v1.safetensors \
  flags/NeuroRVQ_EEG_v1.yml \
  --local-dir weights/

# Run tokenization
cargo run --release --bin infer -- \
  --config weights/flags/NeuroRVQ_EEG_v1.yml \
  --weights weights/NeuroRVQ_EEG_tokenizer_v1.safetensors
```
### Library API
```rust
use neurorvq_rs::{NeuroRVQEncoder, Modality, data, channels};
use std::path::Path;

// `B` is any Burn backend (e.g. NdArray or wgpu); `device` is its device handle.
let (model, _ms) = NeuroRVQEncoder::<B>::load_with_modality(
    Path::new("flags/NeuroRVQ_EEG_v1.yml"),
    Path::new("NeuroRVQ_EEG_tokenizer_v1.safetensors"),
    Modality::EEG,
    device,
)?;

// Tokenize → 4 branches × 8 RVQ levels of discrete indices
let tokens = model.tokenize(&batch)?;
for (br, levels) in tokens.branch_tokens.iter().enumerate() {
    for (lv, indices) in levels.iter().enumerate() {
        println!("Branch {} Level {}: {} tokens", br, lv, indices.len());
    }
}
```
### Foundation Model API
```rust
use neurorvq_rs::{NeuroRVQFoundationModel, Modality};
use std::path::Path;

let (fm, _ms) = NeuroRVQFoundationModel::<B>::load(
    Path::new("flags/NeuroRVQ_EEG_v1.yml"),
    Path::new("NeuroRVQ_EEG_foundation_model_v1.safetensors"),
    Modality::EEG,
    device,
)?;

let features = fm.encode(&batch)?;       // 4 branch feature vectors
let pooled = fm.encode_pooled(&batch)?;  // Mean-pooled for classification
```
## Quick Start – Python
```python
from safetensors.torch import load_file

# `model` is an instantiated NeuroRVQ PyTorch module from the original repo;
# strict=False tolerates non-tensor entries dropped during conversion.
state_dict = load_file("NeuroRVQ_EEG_tokenizer_v1.safetensors")
model.load_state_dict(state_dict, strict=False)
```
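The safetensors container itself is simple enough to inspect without torch: an 8-byte little-endian header length, a JSON header mapping tensor names to dtype/shape/offsets, then the raw tensor bytes. A minimal stdlib-only sketch, demonstrated on a hand-built one-tensor file rather than the real weights:

```python
import json
import struct

def read_header(buf: bytes) -> dict:
    # First 8 bytes: little-endian u64 header length; then that many JSON bytes.
    (n,) = struct.unpack("<Q", buf[:8])
    return json.loads(buf[8:8 + n])

# Build a minimal one-tensor safetensors blob by hand (two f32 values).
data = struct.pack("<2f", 1.0, 2.0)
header = json.dumps({"w": {"dtype": "F32", "shape": [2],
                           "data_offsets": [0, len(data)]}}).encode()
blob = struct.pack("<Q", len(header)) + header + data
print(read_header(blob))  # {'w': {'dtype': 'F32', 'shape': [2], ...}}
```

The same `read_header` works on the full weight files here, which is a quick way to list tensor names and shapes before loading.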
## Architecture
```
Raw Signal [B, N, T]
        │
        ▼
┌────────────────────────────────────────┐
│ Multi-Scale Temporal Conv              │ 4 parallel branches
│ EEG/ECG: k=21,15,9,5                   │ modality-specific kernels
│ EMG: k=51,17,8,5                       │
└────────────────────────────────────────┘
        │ ×4 branches
        ▼
┌────────────────────────────────────────┐
│ Transformer Encoder                    │ 12 blocks, 10 heads
│ + spatial / temporal pos. embeds       │ shared weights across branches
└────────────────────────────────────────┘
        │ ×4 branches
        ▼
┌────────────────────────────────────────┐
│ Encode Heads                           │ Linear → Tanh → Linear
│ embed_dim → code_dim (128)             │
└────────────────────────────────────────┘
        │ ×4 branches
        ▼
┌────────────────────────────────────────┐
│ Residual Vector Quantization           │ 8 levels (EEG/ECG)
│ L2-norm codebook lookup                │ 16 levels (EMG)
│ codebook: 8192 × 128                   │
└────────────────────────────────────────┘
        │ ×4 branches → discrete token indices
        ▼
┌────────────────────────────────────────┐
│ Transformer Decoder                    │ 3 blocks
│ per-branch PatchEmbed (1×1 conv)       │
└────────────────────────────────────────┘
        │ concat 4 branches
        ▼
┌────────────────────────────────────────┐
│ Decode Heads                           │ Amplitude (GELU)
│ 4×embed_dim → decoder_out_dim          │ Sin/Cos phase (Tanh)
└────────────────────────────────────────┘
        │
        ▼
Inverse FFT → Reconstructed Signal
```
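The RVQ stage above can be illustrated in a few lines: each level quantizes the residual left by the previous level, so the sum of the chosen codebook vectors approximates the encoder output ever more closely. A pure-Python toy sketch (not the crate's implementation; the codebook values are made up):

```python
def nearest(codebook, v):
    # Index of the codebook row with smallest squared L2 distance to v.
    return min(range(len(codebook)),
               key=lambda i: sum((c - x) ** 2 for c, x in zip(codebook[i], v)))

def rvq_encode(codebooks, v):
    # One codebook per RVQ level; each level quantizes the running residual.
    residual = list(v)
    indices = []
    for cb in codebooks:
        idx = nearest(cb, residual)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, cb[idx])]
    return indices, residual  # residual shrinks with each level

# Toy 2-level example with 1-D "vectors":
codebooks = [[[0.0], [1.0], [2.0]], [[-0.25], [0.0], [0.25]]]
indices, residual = rvq_encode(codebooks, [1.3])
# level 0 picks 1.0, level 1 picks 0.25, leaving a residual of ~0.05
```

Decoding is just the reverse: sum the looked-up codebook vectors for each level's index.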
## Numerical Parity (Rust vs Python)
Verified against the official PyTorch reference implementation:
| Layer | Max Abs Error | Notes |
|-------|:---:|-------|
| Encoder features | < 8 × 10⁻³ | 12 transformer layers, f32 accumulation |
| Encode heads | < 2 × 10⁻³ | After Tanh squashing |
| RVQ quantized vectors | ≈ 0 ¹ | Exact with random-init codebooks |
| Token indices | **99.3%** exact ² | Pretrained weights |
| Decode outputs | < 8 × 10⁻¹ ¹ | Dominated by ≤0.7% boundary tokens |

¹ Differences stem from the ≤0.7% of tokens near codebook decision boundaries, a natural consequence of f32 arithmetic differences between frameworks.

² With random-init weights: **100%** match (all "mismatches" resolve to identical codebook vectors, i.e., ties).
## Benchmarks
**Platform:** Apple M4 Pro, 64 GB RAM, macOS 15 (arm64)
### Tokenize Latency – All Backends
| Configuration | Modality | PyTorch CPU | Rust NdArray | Rust wgpu (GPU) |
|---|:---:|---:|---:|---:|
| EEG 4ch × 64t | EEG | **179 ms** | 661 ms | 51 ms |
| EEG 8ch × 32t | EEG | **180 ms** | 662 ms | 60 ms |
| EEG 16ch × 16t | EEG | **180 ms** | 664 ms | 62 ms |
| EEG 32ch × 8t | EEG | **178 ms** | 664 ms | 65 ms |
| EEG 64ch × 4t | EEG | **179 ms** | 664 ms | 68 ms |
| ECG 4ch × 150t | ECG | **272 ms** | 1881 ms | 92 ms |
| ECG 8ch × 75t | ECG | **273 ms** | 1874 ms | 92 ms |
| ECG 12ch × 50t | ECG | **272 ms** | 1877 ms | 93 ms |
| ECG 15ch × 40t | ECG | **272 ms** | 1878 ms | 93 ms |
| EMG 4ch × 64t | EMG | **255 ms** | 998 ms | 90 ms |
| EMG 8ch × 32t | EMG | **255 ms** | 998 ms | 88 ms |
| EMG 16ch × 16t | EMG | **254 ms** | 1001 ms | 90 ms |
### Tokenize Latency: NdArray vs wgpu vs PyTorch

### Encode Latency: NdArray vs wgpu vs PyTorch

### Rust – Tokenize Latency by Configuration

### Rust – EEG Scaling by Channel Count

### Rust – Model Construction Time

### Backend Comparison Summary
| Comparison | Result |
|---|---|
| **wgpu vs NdArray** | wgpu is **~12× faster** (GPU acceleration) |
| **wgpu vs PyTorch CPU** | wgpu is **~3× faster** for EEG/EMG/ECG |
| **NdArray vs PyTorch CPU** | PyTorch is **~3.7× faster** (optimized BLAS) |
### Key Observations
- **wgpu (GPU) is the fastest backend** – 51–93 ms across all configurations
- **PyTorch CPU** uses Apple Accelerate/AMX BLAS and fused operators, making it faster than Rust NdArray on CPU
- **Latency scales with total patch count**, not the channel/time decomposition: EEG (256 patches) < EMG (256 patches, 16 RVQ) < ECG (600 patches)
- **Construction time** is ~13 ms (warm) / ~54 ms (cold start for EMG with larger kernels)
- **Standard deviation < 1%** – highly stable inference latency
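The patch-count observation can be checked directly: every row of the latency table holds channels × time-patches fixed per modality, which is why latency is flat within each modality block:

```python
# Channel/time pairs from the latency table; the product (total patch count)
# is constant per modality, matching the flat latencies within each block.
eeg = [(4, 64), (8, 32), (16, 16), (32, 8), (64, 4)]
ecg = [(4, 150), (8, 75), (12, 50), (15, 40)]
emg = [(4, 64), (8, 32), (16, 16)]
for name, cfgs in [("EEG", eeg), ("ECG", ecg), ("EMG", emg)]:
    print(name, sorted({ch * t for ch, t in cfgs}))  # EEG/EMG: 256, ECG: 600
```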
### Why Rust?
| | Python + PyTorch | Rust + Burn |
|---|---|---|
| Dependencies | pip, torch, numpy, einops, ... | Zero (single static binary) |
| GPU support | CUDA, MPS | wgpu (Metal, Vulkan, WebGPU) |
| Deployment | Interpreter + venv | Single binary, WASM, embedded |
| Memory | GC pauses | Deterministic, no GC |
| Latency (GPU) | – | **51–93 ms** (wgpu Metal) |
## Conversion
These weights were converted from the official `.pt` files:
```python
import torch
from safetensors.torch import save_file
state_dict = torch.load("model.pt", map_location="cpu")
converted = {k: v.float().contiguous()
             for k, v in state_dict.items()
             if isinstance(v, torch.Tensor) and v.is_floating_point()}
save_file(converted, "model.safetensors")
```
Or use the included script:
```bash
python scripts/convert_pt_to_safetensors.py \
  --input NeuroRVQ_EEG_tokenizer_v1.pt \
  --output NeuroRVQ_EEG_tokenizer_v1.safetensors
```
## Citation
```bibtex
@article{barmpas2024neurorvq,
  title={NeuroRVQ: Joint Neurophysiological Multi-Scale Temporal Tokenization and Reconstruction},
  author={Barmpas, Konstantinos and others},
  year={2024}
}
```
## License
Apache-2.0 – same as the original NeuroRVQ release.
## Links
| | |
|---|---|
| **Rust crate** | [github.com/eugenehp/neurorvq-rs](https://github.com/eugenehp/neurorvq-rs) |
| **Original weights** | [huggingface.co/ntinosbarmpas/NeuroRVQ](https://huggingface.co/ntinosbarmpas/NeuroRVQ) |
| **Paper / Code** | [github.com/KonstantinosBarmpas/NeuroRVQ](https://github.com/KonstantinosBarmpas/NeuroRVQ) |
| **Burn framework** | [burn.dev](https://burn.dev) |