---
license: apache-2.0
tags:
  - eeg
  - ecg
  - emg
  - biosignal
  - tokenizer
  - neuroscience
  - bci
  - brain-computer-interface
  - vector-quantization
  - rvq
  - safetensors
  - burn
  - rust
language:
  - en
library_name: neurorvq-rs
pipeline_tag: feature-extraction
---

# NeuroRVQ — Safetensors Weights

Pre-converted [safetensors](https://github.com/huggingface/safetensors) weights for the [NeuroRVQ](https://github.com/KonstantinosBarmpas/NeuroRVQ) multi-scale biosignal tokenizer, ready for use with **[neurorvq-rs](https://github.com/eugenehp/neurorvq-rs)** (pure-Rust inference on [Burn 0.20](https://burn.dev)) or any framework that supports safetensors.

Weights are converted from the official PyTorch `.pt` checkpoints published at [ntinosbarmpas/NeuroRVQ](https://huggingface.co/ntinosbarmpas/NeuroRVQ).

## Model Files

### Tokenizers (encoder → RVQ → decoder)

| File | Modality | Params | Size | Embed dim | Patch size | RVQ |
|------|----------|--------|------|-----------|------------|-----|
| [`NeuroRVQ_EEG_tokenizer_v1.safetensors`](NeuroRVQ_EEG_tokenizer_v1.safetensors) | **EEG** | 76.0 M | 304 MB | 200 | 200 | 8 levels |
| [`NeuroRVQ_ECG_tokenizer_v1.safetensors`](NeuroRVQ_ECG_tokenizer_v1.safetensors) | **ECG** | 68.1 M | 272 MB | 40 | 40 | 8 levels |
| [`NeuroRVQ_EMG_tokenizer_v1.safetensors`](NeuroRVQ_EMG_tokenizer_v1.safetensors) | **EMG** | 143.6 M | 574 MB | 200 | 200 | 16 levels |

### Foundation Models (encoder only)

| File | Modality | Params | Size | Depth |
|------|----------|--------|------|-------|
| [`NeuroRVQ_EEG_foundation_model_v1.safetensors`](NeuroRVQ_EEG_foundation_model_v1.safetensors) | **EEG** | 58.6 M | 234 MB | 12 blocks |
| [`NeuroRVQ_EMG_foundation_model_v1.safetensors`](NeuroRVQ_EMG_foundation_model_v1.safetensors) | **EMG** | 111.2 M | 445 MB | 12 blocks |
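
The Size column in both tables is consistent with every parameter being stored as a 4-byte f32 value. A quick check of that arithmetic is sketched below; the parameter counts are copied from the tables, and decimal megabytes (1 MB = 10^6 bytes) plus a negligible header size are assumptions.

```python
# Parameter counts (in millions) copied from the two tables above.
PARAMS_M = {
    "EEG tokenizer": 76.0,
    "ECG tokenizer": 68.1,
    "EMG tokenizer": 143.6,
    "EEG foundation model": 58.6,
    "EMG foundation model": 111.2,
}

BYTES_PER_F32 = 4  # assumption: every tensor is stored as float32


def estimated_size_mb(params_millions: float) -> float:
    """Approximate file size in decimal MB, ignoring the small file header."""
    return params_millions * 1e6 * BYTES_PER_F32 / 1e6


for name, params in PARAMS_M.items():
    print(f"{name}: ~{estimated_size_mb(params):.1f} MB")
```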

### Config Flags

| File | Description |
|------|-------------|
| [`flags/NeuroRVQ_EEG_v1.yml`](flags/NeuroRVQ_EEG_v1.yml) | EEG — 103 channels, patch=200, embed=200 |
| [`flags/NeuroRVQ_ECG_v1.yml`](flags/NeuroRVQ_ECG_v1.yml) | ECG — 15 channels, patch=40, embed=40 |
| [`flags/NeuroRVQ_EMG_v1.yml`](flags/NeuroRVQ_EMG_v1.yml) | EMG — 16 channels, patch=200, embed=200 |

## Quick Start — Rust

```bash
# Install
cargo add neurorvq-rs

# Download weights + config
huggingface-cli download eugenehp/NeuroRVQ \
    NeuroRVQ_EEG_tokenizer_v1.safetensors \
    flags/NeuroRVQ_EEG_v1.yml \
    --local-dir weights/

# Run tokenization
cargo run --release --bin infer -- \
    --config weights/flags/NeuroRVQ_EEG_v1.yml \
    --weights weights/NeuroRVQ_EEG_tokenizer_v1.safetensors
```

### Library API

```rust
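use neurorvq_rs::{NeuroRVQEncoder, Modality, data, channels};
use std::path::Path;

// `B` is assumed to be the Burn backend type (e.g. NdArray or wgpu) and
// `device` its device; `batch` is an input batch prepared for the modality.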
let (model, _ms) = NeuroRVQEncoder::<B>::load_with_modality(
    Path::new("flags/NeuroRVQ_EEG_v1.yml"),
    Path::new("NeuroRVQ_EEG_tokenizer_v1.safetensors"),
    Modality::EEG,
    device,
)?;

// Tokenize → 4 branches × 8 RVQ levels of discrete indices
let tokens = model.tokenize(&batch)?;
for (br, levels) in tokens.branch_tokens.iter().enumerate() {
    for (lv, indices) in levels.iter().enumerate() {
        println!("Branch {} Level {}: {} tokens", br, lv, indices.len());
    }
}
```

### Foundation Model API

```rust
use neurorvq_rs::{NeuroRVQFoundationModel, Modality};
use std::path::Path;

let (fm, _ms) = NeuroRVQFoundationModel::<B>::load(
    Path::new("flags/NeuroRVQ_EEG_v1.yml"),
    Path::new("NeuroRVQ_EEG_foundation_model_v1.safetensors"),
    Modality::EEG,
    device,
)?;

let features = fm.encode(&batch)?;       // 4 branch feature vectors
let pooled = fm.encode_pooled(&batch)?;   // Mean-pooled for classification
```

## Quick Start — Python

```python
from safetensors.torch import load_file

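# `model` is an instance of the original PyTorch NeuroRVQ model class (not shown here).
state_dict = load_file("NeuroRVQ_EEG_tokenizer_v1.safetensors")
# strict=False tolerates key mismatches; inspect the result to see what was skipped.
result = model.load_state_dict(state_dict, strict=False)
print(result.missing_keys, result.unexpected_keys)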
```
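
Because `strict=False` tolerates key mismatches, it can help to look at the tensor names and shapes stored in a file before loading it. The sketch below reads only the safetensors header (an 8-byte little-endian length followed by a JSON table) with the standard library. `read_safetensors_header` and `summarize` are hypothetical helper names, not part of the `safetensors` package.

```python
import json
import struct


def read_safetensors_header(path):
    """Return {tensor_name: (dtype, shape)} without loading any tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return {name: (info["dtype"], info["shape"]) for name, info in header.items()}


def summarize(path, limit=10):
    """Print the first few tensor names (sorted) with dtype and shape."""
    tensors = read_safetensors_header(path)
    for name in sorted(tensors)[:limit]:
        dtype, shape = tensors[name]
        print(f"{name}: {dtype} {shape}")
    print(f"{len(tensors)} tensors total")
```

Calling `summarize("NeuroRVQ_EEG_tokenizer_v1.safetensors")` after downloading the file lists the stored tensors, which can then be compared against `model.state_dict().keys()`.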

## Architecture

```
Raw Signal [B, N, T]
    │
    ▼
┌──────────────────────────────────────┐
│  Multi-Scale Temporal Conv           │  4 parallel branches
│  EEG/ECG: k=21,15,9,5                │  modality-specific kernels
│  EMG:     k=51,17,8,5                │
└──────────────────────────────────────┘
    │ ×4 branches
    ▼
┌──────────────────────────────────────┐
│  Transformer Encoder                 │  12 blocks, 10 heads
│  + spatial / temporal pos. embeds    │  shared weights across branches
└──────────────────────────────────────┘
    │ ×4 branches
    ▼
┌──────────────────────────────────────┐
│  Encode Heads                        │  Linear → Tanh → Linear
│  embed_dim → code_dim (128)          │
└──────────────────────────────────────┘
    │ ×4 branches
    ▼
┌──────────────────────────────────────┐
│  Residual Vector Quantization        │  8 levels (EEG/ECG)
│  L2-norm codebook lookup             │  16 levels (EMG)
│  codebook: 8192 × 128                │
└──────────────────────────────────────┘
    │ ×4 branches        ← discrete token indices
    ▼
┌──────────────────────────────────────┐
│  Transformer Decoder                 │  3 blocks
│  per-branch PatchEmbed (1×1 conv)    │
└──────────────────────────────────────┘
    │ concat 4 branches
    ▼
┌──────────────────────────────────────┐
│  Decode Heads                        │  Amplitude (GELU)
│  4×embed_dim → decoder_out_dim       │  Sin/Cos phase (Tanh)
└──────────────────────────────────────┘
    │
    ▼
  Inverse FFT → Reconstructed Signal
```
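
To make the residual vector quantization stage concrete, here is a minimal pure-Python sketch of the general technique: each level picks the nearest code for the current residual, then passes the remainder on to the next level. It is not the NeuroRVQ implementation. It uses plain squared Euclidean distance and leaves out the L2 normalization the diagram mentions, and the toy codebooks are made up for illustration (the real ones are 8192 × 128).

```python
def nearest_code(codebook, vector):
    """Index of the codebook row with the smallest squared Euclidean distance."""
    def sq_dist(row):
        return sum((r - v) ** 2 for r, v in zip(row, vector))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def rvq_encode(vector, codebooks):
    """Quantize level by level; each level encodes the previous level's residual."""
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        idx = nearest_code(codebook, residual)
        indices.append(idx)
        # Subtract the chosen code so the next level sees what is left over.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(indices, codebooks):
    """Sum the selected code vectors across levels to rebuild the approximation."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Toy setup (hypothetical): 2 levels, 2 codes per level, 2-dimensional vectors.
toy_codebooks = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 0.5], [0.5, 0.0]],
]
toy_indices = rvq_encode([1.0, 0.5], toy_codebooks)
toy_recon = rvq_decode(toy_indices, toy_codebooks)
print(toy_indices, toy_recon)
```

In the tokenizer this runs once per branch, which is why an EEG input yields 4 branches × 8 levels of indices per patch.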

## Numerical Parity (Rust vs Python)

Verified against the official PyTorch reference implementation:

| Layer | Max Abs Error | Notes |
|-------|:---:|-------|
| Encoder features | < 8 × 10⁻³ | 12 transformer layers, f32 accumulation |
| Encode heads | < 2 × 10⁻³ | After Tanh squashing |
| RVQ quantized vectors | ≈ 0 ¹ | Exact with random-init codebooks |
| Token indices | **99.3%** exact ² | Pretrained weights |
| Decode outputs | < 8 × 10⁻¹ ¹ | Dominated by ≤0.7% boundary tokens |

¹ Differences stem from the ≤0.7% of tokens near codebook decision boundaries — a natural consequence of f32 arithmetic differences between frameworks.

² With random-init weights: **100%** match (all "mismatches" resolve to identical codebook vectors, i.e., ties).
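
For readers who want to reproduce this kind of comparison, the metrics in the table can be computed roughly as follows. This is a sketch on flat Python lists, not the project's test harness; the function names are hypothetical, and the tie-aware variant reflects the note above that differing indices may point to identical codebook vectors.

```python
def max_abs_error(reference, candidate):
    """Largest element-wise absolute difference between two flat sequences."""
    return max(abs(r - c) for r, c in zip(reference, candidate))


def token_match_rate(ref_tokens, cand_tokens):
    """Fraction of positions where both implementations chose the same index."""
    matches = sum(1 for r, c in zip(ref_tokens, cand_tokens) if r == c)
    return matches / len(ref_tokens)


def tie_aware_match_rate(ref_tokens, cand_tokens, codebook):
    """Like token_match_rate, but differing indices count as a match when
    they select identical codebook vectors (a tie)."""
    matches = sum(
        1
        for r, c in zip(ref_tokens, cand_tokens)
        if r == c or codebook[r] == codebook[c]
    )
    return matches / len(ref_tokens)


# Toy data (made up): rows 0 and 3 of the codebook are identical.
toy_codebook = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
ref = [0, 1, 2, 3]
cand = [0, 1, 2, 0]
print(token_match_rate(ref, cand), tie_aware_match_rate(ref, cand, toy_codebook))
```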

## Benchmarks

**Platform:** Apple M4 Pro, 64 GB RAM, macOS 15 (arm64)

### Tokenize Latency — All Backends

| Configuration | Modality | PyTorch CPU | Rust NdArray | Rust wgpu (GPU) |
|---|:---:|---:|---:|---:|
| EEG 4ch × 64t | EEG | **179 ms** | 661 ms | 51 ms |
| EEG 8ch × 32t | EEG | **180 ms** | 662 ms | 60 ms |
| EEG 16ch × 16t | EEG | **180 ms** | 664 ms | 62 ms |
| EEG 32ch × 8t | EEG | **178 ms** | 664 ms | 65 ms |
| EEG 64ch × 4t | EEG | **179 ms** | 664 ms | 68 ms |
| ECG 4ch × 150t | ECG | **272 ms** | 1881 ms | 92 ms |
| ECG 8ch × 75t | ECG | **273 ms** | 1874 ms | 92 ms |
| ECG 12ch × 50t | ECG | **272 ms** | 1877 ms | 93 ms |
| ECG 15ch × 40t | ECG | **272 ms** | 1878 ms | 93 ms |
| EMG 4ch × 64t | EMG | **255 ms** | 998 ms | 90 ms |
| EMG 8ch × 32t | EMG | **255 ms** | 998 ms | 88 ms |
| EMG 16ch × 16t | EMG | **254 ms** | 1001 ms | 90 ms |

### Tokenize Latency: NdArray vs wgpu vs PyTorch

![Tokenize Comparison](figures/compare_all_tokenize.svg)

### Encode Latency: NdArray vs wgpu vs PyTorch

![Encode Comparison](figures/compare_all_encode.svg)

### Rust — Tokenize Latency by Configuration

![Tokenize Latency](figures/tokenize_latency.svg)

### Rust — EEG Scaling by Channel Count

![EEG Scaling](figures/eeg_scaling.svg)

### Rust — Model Construction Time

![Construction Time](figures/construction_time.svg)

### Backend Comparison Summary

| Comparison | Result |
|---|---|
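| **wgpu vs NdArray** | wgpu is **~10–20× faster** (GPU acceleration; ~10–13× for EEG, ~11× for EMG, ~20× for ECG) |
| **wgpu vs PyTorch CPU** | wgpu is **~3× faster** (2.6–3.5×) across EEG, ECG, and EMG |
| **NdArray vs PyTorch CPU** | PyTorch is **~3.7–6.9× faster** (optimized BLAS; ~3.7× for EEG, ~3.9× for EMG, ~6.9× for ECG) |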
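
The ratios in this summary can be recomputed from the tokenize latency table. The sketch below copies the table values verbatim and reports the minimum and maximum speedup for each backend pair; `speedup_range` is a hypothetical helper.

```python
# (configuration, PyTorch CPU ms, Rust NdArray ms, Rust wgpu ms),
# copied from the tokenize latency table.
LATENCIES = [
    ("EEG 4ch x 64t", 179, 661, 51),
    ("EEG 8ch x 32t", 180, 662, 60),
    ("EEG 16ch x 16t", 180, 664, 62),
    ("EEG 32ch x 8t", 178, 664, 65),
    ("EEG 64ch x 4t", 179, 664, 68),
    ("ECG 4ch x 150t", 272, 1881, 92),
    ("ECG 8ch x 75t", 273, 1874, 92),
    ("ECG 12ch x 50t", 272, 1877, 93),
    ("ECG 15ch x 40t", 272, 1878, 93),
    ("EMG 4ch x 64t", 255, 998, 90),
    ("EMG 8ch x 32t", 255, 998, 88),
    ("EMG 16ch x 16t", 254, 1001, 90),
]

PYTORCH, NDARRAY, WGPU = 1, 2, 3  # column positions in each row


def speedup_range(rows, slow_col, fast_col):
    """Min and max of slow/fast latency ratios across all configurations."""
    ratios = [row[slow_col] / row[fast_col] for row in rows]
    return min(ratios), max(ratios)


for label, slow, fast in [
    ("wgpu vs NdArray", NDARRAY, WGPU),
    ("wgpu vs PyTorch CPU", PYTORCH, WGPU),
    ("PyTorch CPU vs NdArray", NDARRAY, PYTORCH),
]:
    lo, hi = speedup_range(LATENCIES, slow, fast)
    print(f"{label}: {lo:.1f}x to {hi:.1f}x")
```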

### Key Observations

- **wgpu (GPU) is the fastest backend** — 51–93 ms across all configurations
- **PyTorch CPU** uses Apple Accelerate/AMX BLAS and fused operators, making it faster than Rust NdArray on CPU
- **Latency scales with total patch count**, not the channel/time decomposition — EEG (256 patches) < EMG (256 patches, 16 RVQ) < ECG (600 patches)
- **Construction time** is ~13 ms (warm) / ~54 ms (cold start for EMG with larger kernels)
- **Standard deviation < 1%** of mean latency, so inference timing is highly stable
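
The patch counts quoted above follow from multiplying channels by time patches. The sketch below assumes that a label such as `4ch × 64t` means 4 channels with 64 time patches each, which matches the 256 and 600 figures in the list.

```python
# (channels, time patches) for each benchmark configuration in the latency table.
CONFIGS = {
    "EEG": [(4, 64), (8, 32), (16, 16), (32, 8), (64, 4)],
    "ECG": [(4, 150), (8, 75), (12, 50), (15, 40)],
    "EMG": [(4, 64), (8, 32), (16, 16)],
}


def total_patches(channels: int, time_patches: int) -> int:
    """Number of patches the encoder sees for one input."""
    return channels * time_patches


# Collect the distinct patch totals per modality.
PATCH_TOTALS = {
    modality: sorted({total_patches(c, t) for c, t in shapes})
    for modality, shapes in CONFIGS.items()
}

for modality, totals in PATCH_TOTALS.items():
    print(modality, totals)
```

Every configuration within a modality lands on the same total, which is why the latency rows for a given modality are nearly identical.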

### Why Rust?

| | Python + PyTorch | Rust + Burn |
|---|---|---|
| Dependencies | pip, torch, numpy, einops, ... | Zero (single static binary) |
| GPU support | CUDA, MPS | wgpu (Metal, Vulkan, WebGPU) |
| Deployment | Interpreter + venv | Single binary, WASM, embedded |
| Memory | GC pauses | Deterministic, no GC |
| Latency (GPU) | — | **51–93 ms** (wgpu Metal) |

## Conversion

These weights were converted from the official `.pt` files:

```python
import torch
from safetensors.torch import save_file

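# Load on CPU; this assumes the checkpoint is a flat state dict of tensors.
state_dict = torch.load("model.pt", map_location="cpu")
# Keep floating-point tensors only, cast them to float32, and make them
# contiguous (safetensors requires contiguous tensors).
converted = {k: v.float().contiguous() for k, v in state_dict.items()
             if isinstance(v, torch.Tensor) and v.is_floating_point()}
save_file(converted, "model.safetensors")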
```

Or use the included script:

```bash
python scripts/convert_pt_to_safetensors.py \
    --input NeuroRVQ_EEG_tokenizer_v1.pt \
    --output NeuroRVQ_EEG_tokenizer_v1.safetensors
```

## Citation

```bibtex
@article{barmpas2024neurorvq,
  title={NeuroRVQ: Joint Neurophysiological Multi-Scale Temporal Tokenization and Reconstruction},
  author={Barmpas, Konstantinos and others},
  year={2024}
}
```

## License

Apache-2.0 — same as the original NeuroRVQ release.

## Links

| | |
|---|---|
| **Rust crate** | [github.com/eugenehp/neurorvq-rs](https://github.com/eugenehp/neurorvq-rs) |
| **Original weights** | [huggingface.co/ntinosbarmpas/NeuroRVQ](https://huggingface.co/ntinosbarmpas/NeuroRVQ) |
| **Paper / Code** | [github.com/KonstantinosBarmpas/NeuroRVQ](https://github.com/KonstantinosBarmpas/NeuroRVQ) |
| **Burn framework** | [burn.dev](https://burn.dev) |