---
license: cc-by-nd-4.0
language:
- en
model-index:
- name: TinyMyo
  results:
  # -------------------------
  # Hand Gesture Classification
  # -------------------------
  - task:
      type: gesture-classification
    dataset:
      type: ninapro_db5
      name: Ninapro DB5
    metrics:
    - name: acc@1
      type: acc@1
      value: 0.8941
      verified: false
    - name: f1
      type: f1
      value: 0.7797
      verified: false
  - task:
      type: gesture-classification
    dataset:
      type: epn612
      name: EPN-612
    metrics:
    - name: acc@1
      type: acc@1
      value: 0.9674
      verified: false
    - name: f1
      type: f1
      value: 0.9674
      verified: false
  - task:
      type: gesture-classification
    dataset:
      type: uci_emg
      name: UCI-EMG
    metrics:
    - name: acc@1
      type: acc@1
      value: 0.9756
      verified: false
    - name: f1
      type: f1
      value: 0.9755
      verified: false
  # -------------------------
  # Generic Neuromotor Interface (Meta RL)
  # -------------------------
  - task:
      type: gesture-classification
    dataset:
      type: gni_meta_rl
      name: Generic Neuromotor Interface (Discrete Gesture)
    metrics:
    - name: CLER
      type: classification-error-rate
      value: 0.153
      verified: false
  # -------------------------
  # Hand Kinematic Regression
  # -------------------------
  - task:
      type: kinematic-regression
    dataset:
      type: ninapro_db8
      name: Ninapro DB8
    metrics:
    - name: MAE
      type: mean-absolute-error
      value: 8.77
      verified: false
    - name: RMSE
      type: root-mean-square-error
      value: 13.35
      verified: false
    - name: R2
      type: r2
      value: 0.62
      verified: false
  # -------------------------
  # Silent Speech Synthesis
  # -------------------------
  - task:
      type: speech-synthesis
    dataset:
      type: gaddy_silent_speech
      name: Gaddy Silent Speech (MFCC to Audio)
    metrics:
    - name: WER
      type: word-error-rate
      value: 0.3354
      verified: false
  # -------------------------
  # Silent Speech Recognition
  # -------------------------
  - task:
      type: speech-recognition
    dataset:
      type: gaddy_silent_speech
      name: Gaddy Silent Speech (EMG to Text)
    metrics:
    - name: WER
      type: word-error-rate
      value: 0.3395
      verified: false
---
TinyMyo Logo

# TinyMyo: a Tiny Foundation Model for Flexible EMG Signal Processing at the Edge


**TinyMyo** is a **3.6M-parameter** Transformer-based **foundation model for surface EMG (sEMG)**. It is pretrained on >480 GB of EMG data and optimized for **ultra-low-power, real-time deployment**, including **microcontrollers (GAP9)**, where it achieves an inference time of **0.785 s**, an energy cost of **44.91 mJ**, and a power envelope of **57.18 mW**. TinyMyo is built for **broad generalization** across datasets, sensor configurations, movement tasks, subjects, and domains (gesture, kinematics, speech).

---

# 🔒 License & Usage (Model Weights)

The released TinyMyo weights are licensed under **CC BY-ND 4.0**. This summary is not legal advice; please read the full license.

### ✅ You may

* **Use** and **redistribute** the **unmodified** TinyMyo weights (including commercially) **with attribution**.
* **Fine-tune/modify internally** for research or production without redistributing modified weights.
* **Publish code, configs, evaluations, and papers** using TinyMyo.

### 🚫 You may not

* **Share or host modified weights** in any form (including LoRA/adapter deltas, pruned/quantized models).
* **Claim endorsement** from the TinyMyo authors without permission.
* **Use the TinyMyo name** for derivative models.

### 🤝 Contributing Improvements

To upstream improvements, submit a **PR** to the **[BioFoundation repository](https://github.com/pulp-bio/BioFoundation)** with:

1. Full reproducibility artifacts (configs, logs, seeds, environment).
2. Evaluation on standard protocols (e.g., DB5, EPN-612, UCI EMG, DB8, Silent Speech).
3. Comparison to TinyMyo's reported metrics.

Models from approved PRs will be retrained and released as **official TinyMyo** checkpoints under CC BY-ND.

---

# 🔎 1. Default Input & Preprocessing

Unless specified otherwise, TinyMyo expects:

* **Channels:** 16
* **Sampling rate:** 2000 Hz
* **Segment length:** 1000 samples (0.5 s)
* **Windowing:** 50% overlap (pretraining)
* **Preprocessing:**
  * 4th-order **20–450 Hz bandpass**
  * **50 Hz notch filter**
  * **Min–max normalization** (pretraining)
  * **Z-score normalization** (downstream)

Datasets with <16 channels are **zero-padded** (pretraining only).

---

# 🔬 2. Pretraining Overview

TinyMyo is pretrained via masked reconstruction on **three large-scale EMG datasets**:

| Dataset     | Subjects | fs      | Channels | Size    |
| ----------- | -------- | ------- | -------- | ------- |
| Ninapro DB6 | 10       | 2000 Hz | 14       | 20.3 GB |
| Ninapro DB7 | 22       | 2000 Hz | 12       | 30.9 GB |
| EMG2Pose    | 192      | 2000 Hz | 16       | 431 GB  |

## Tokenization: Channel-Independent Patches

Unlike EEG foundation models that mix channels early, TinyMyo uses **per-channel patching**:

* Patch length: **20 samples**
* Patch stride: **20 samples**
* Tokens/channel: **50**
* Total sequence length: **800 tokens** (16 × 50)
* Positional encoding: **RoPE (rotary)**

This preserves electrode-specific structure while allowing attention to learn cross-channel relationships.

## Transformer Encoder

* **8 layers**, **3 heads**
* Embedding dim: **192**
* Pre-LayerNorm
* Dropout & drop-path: **0.1**

## Lightweight Decoder

A **single linear layer** (~3.9k params) reconstructs masked patches. Following SimMIM, this forces the encoder to learn robust latent structure.

## Masking Objective

* **50% random masking** with a learnable `[MASK]` token
* Loss: **Smooth L1** with a small penalty on visible patches

$$
\mathcal{L} = \mathcal{L}_{\text{masked}} + 0.1\,\mathcal{L}_{\text{visible}}
$$

## Training Setup

* Optimizer: **AdamW** (β = (0.9, 0.98), wd = 0.01)
* LR: **1e-4** with cosine decay
* Batch size: **512** (with grad accumulation)
* Epochs: **50**, warm-up: 10
* Hardware: **4× NVIDIA GH200 GPUs**

---
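The per-channel patching and the masked-reconstruction objective above can be condensed into a short NumPy sketch. This is illustrative only, not the training code; `patchify`, `smooth_l1`, and `pretrain_loss` are hypothetical names, and the real pipeline operates on PyTorch tensors:

```python
import numpy as np

def patchify(emg, patch_len=20):
    """Split each channel into non-overlapping patches (channel-independent tokens).
    emg: (channels, samples) -> tokens: (channels * samples // patch_len, patch_len)."""
    c, t = emg.shape
    n = t // patch_len
    return emg[:, :n * patch_len].reshape(c * n, patch_len)

def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 (Huber-style) reconstruction loss."""
    d = np.abs(pred - target)
    return np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta).mean()

def pretrain_loss(pred, target, mask, lam=0.1):
    """L = L_masked + 0.1 * L_visible, computed over patch tokens."""
    return smooth_l1(pred[mask], target[mask]) + lam * smooth_l1(pred[~mask], target[~mask])

rng = np.random.default_rng(0)
window = rng.standard_normal((16, 1000))   # one 0.5 s segment at 2 kHz, 16 channels
tokens = patchify(window)                  # -> (800, 20), i.e. 16 channels x 50 tokens
mask = rng.random(len(tokens)) < 0.5       # 50% of tokens are masked
```

A perfect reconstruction drives this loss to zero; during pretraining, masked tokens are replaced by the learnable `[MASK]` embedding before the encoder sees them.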
# 🧠 3. Architecture Summary

### Model Variant

| Variant | Params   | (Layers, Dim) |
| ------- | -------- | ------------- |
| TinyMyo | **3.6M** | (8, 192)      |

---

# 🎯 4. Downstream Tasks

TinyMyo generalizes across **gesture classification**, **kinematic regression**, and **speech EMG**, with state-of-the-art or competitive results.

---

## 4.1 Hand Gesture Classification

Evaluated on:

* **Ninapro DB5** (52 classes, 10 subjects)
* **EPN-612** (5 classes, 612 subjects)
* **UCI EMG** (6 classes, 36 subjects)
* **Meta Neuromotor Interface** (9 gestures)

### Preprocessing

* EMG filtering: **20–90 Hz bandpass + 50 Hz notch**
* Window sizes:
  * **200 ms** (best for DB5)
  * **1000 ms** (best for EPN, UCI)

### Linear Classification Head

* Input: **C × 192**
* Params: **<40k**

### Performance (Fine-tuned)

| Dataset                  | Metric | Result            |
| ------------------------ | ------ | ----------------- |
| **Ninapro DB5** (200 ms) | Acc    | **89.41 ± 0.16%** |
| **EPN-612** (1000 ms)    | Acc    | **96.74 ± 0.09%** |
| **UCI EMG** (1000 ms)    | Acc    | **97.56 ± 0.32%** |
| **Neuromotor**           | CLER   | **0.153 ± 0.006** |

TinyMyo achieves **new state-of-the-art** results on DB5, EPN-612, and UCI.

---

## 4.2 Hand Kinematic Regression (Ninapro DB8)

* Predict **5 joint angles**
* Windows: **200 ms** or **1000 ms**
* Normalization: z-score only

### Regression Head (~788k params)

* Depthwise + pointwise convs
* Upsampling
* Global average pooling
* Linear projection to 5 outputs

### Performance

* **MAE = 8.77 ± 0.12°** (1000 ms)

Note: prior works reporting ~6.9° MAE are **subject-specific**; TinyMyo trains a **single cross-subject model**, a significantly harder setting.

---

## 4.3 Speech Production & Recognition (Silent Speech)

Dataset: **Gaddy Silent Speech** (8 channels, 1000 Hz, face/neck EMG)

### Speech Production (EMG → MFCC → HiFi-GAN → Audio)

Pipeline:

1. Residual downsampling
2. TinyMyo encoder
3. Linear projection → **26-dim MFCC**
4. HiFi-GAN vocoder

**WER:** **33.54 ± 1.12%**, approximately state-of-the-art with **>90% fewer parameters** in the transduction model.

### Speech Recognition (EMG → Text)

* TinyMyo encoder
* Linear projection → **37 characters**
* **CTC** loss
* 4-gram LM + beam search

**WER:** **33.95 ± 0.97%**

TinyMyo is EMG-only, unlike multimodal systems such as MONA-LISA.

---

# ⚡ 5. Edge Deployment (GAP9 MCU)

TinyMyo runs efficiently on **GAP9 (RISC-V)** via:

* **INT8 quantization**, including attention
* Multi-level streaming (L3 → L2 → L1)
* Integer LayerNorm, GELU, and softmax
* Static memory arena via liveness analysis

### Runtime (DB5 pipeline)

* **Inference time:** **0.785 s**
* **Energy:** **44.91 mJ**
* **Average power:** **57.18 mW**

This is the **first EMG foundation model demonstrated on a microcontroller**.

---

# 📊 6. Results Summary

### Pretraining

* Smooth L1 reconstruction with high fidelity
* Total compute ≈ **4.0 GFLOPs**

### Downstream Highlights

* **DB5:** 89.41%
* **EPN-612:** 96.74%
* **UCI EMG:** 97.56%
* **Neuromotor:** 0.153 CLER
* **DB8 Regression:** MAE 8.77°
* **Silent Speech Production:** 33.54% WER
* **Silent Speech Recognition:** 33.95% WER

TinyMyo matches or exceeds state-of-the-art performance while being smaller and more efficient than all prior EMG foundation models.

---

# 🛠️ Code & Usage

To fine-tune TinyMyo on downstream tasks, follow the examples in the **[BioFoundation repository](https://github.com/pulp-bio/BioFoundation)**.
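Inputs to fine-tuning are expected to follow the defaults in Section 1 (16 channels at 2 kHz, 1000-sample segments, z-score normalization for downstream tasks). The segmentation and normalization steps can be sketched in NumPy; `segment` and `zscore` are illustrative names, and the repository's own loaders also apply the bandpass and notch filtering:

```python
import numpy as np

def segment(emg, win=1000, overlap=0.5):
    """Slice a (channels, samples) recording into fixed-length overlapping windows."""
    step = int(win * (1 - overlap))
    starts = range(0, emg.shape[1] - win + 1, step)
    return np.stack([emg[:, s:s + win] for s in starts])  # (n_windows, channels, win)

def zscore(x, eps=1e-8):
    """Per-channel z-score normalization (the downstream setting)."""
    return (x - x.mean(axis=-1, keepdims=True)) / (x.std(axis=-1, keepdims=True) + eps)

rec = np.random.default_rng(1).standard_normal((16, 3000))  # 1.5 s of 16-channel EMG at 2 kHz
windows = zscore(segment(rec))                              # -> (5, 16, 1000)
```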
```bash
python -u run_train.py +experiment=TinyMyo_finetune \
  pretrained_safetensors_path=/path/to/model.safetensors
```

Environment variables:

* `DATA_PATH` → dataset path
* `CHECKPOINT_DIR` → checkpoint to load

---

## 🔗 Resources

- **Code:** https://github.com/pulp-bio/BioFoundation

---

# 📜 Citation

Please cite TinyMyo using:

```bibtex
@misc{fasulo2025tinymyotinyfoundationmodel,
  title={TinyMyo: a Tiny Foundation Model for Flexible EMG Signal Processing at the Edge},
  author={Matteo Fasulo and Giusy Spacone and Thorir Mar Ingolfsson and Yawei Li and Luca Benini and Andrea Cossettini},
  year={2025},
  eprint={2512.15729},
  archivePrefix={arXiv},
  primaryClass={eess.SP},
  url={https://arxiv.org/abs/2512.15729},
}
```

---

# 🧭 Contact & Support

* Questions or issues? Open an issue on the **BioFoundation GitHub repository**.