---
license: cc-by-nd-4.0
language:
- en
model-index:
- name: TinyMyo
results:
# -------------------------
# Hand Gesture Classification
# -------------------------
- task:
type: gesture-classification
dataset:
type: ninapro_db5
name: Ninapro DB5
metrics:
- name: acc@1
type: acc@1
value: 0.8941
verified: false
- name: f1
type: f1
value: 0.7797
verified: false
- task:
type: gesture-classification
dataset:
type: epn612
name: EPN-612
metrics:
- name: acc@1
type: acc@1
value: 0.9674
verified: false
- name: f1
type: f1
value: 0.9674
verified: false
- task:
type: gesture-classification
dataset:
type: uci_emg
name: UCI-EMG
metrics:
- name: acc@1
type: acc@1
value: 0.9756
verified: false
- name: f1
type: f1
value: 0.9755
verified: false
# -------------------------
# Generic Neuromotor Interface (Meta RL)
# -------------------------
- task:
type: gesture-classification
dataset:
type: gni_meta_rl
name: Generic Neuromotor Interface (Discrete Gesture)
metrics:
- name: CLER
type: classification-error-rate
value: 0.153
verified: false
# -------------------------
# Hand Kinematic Regression
# -------------------------
- task:
type: kinematic-regression
dataset:
type: ninapro_db8
name: Ninapro DB8
metrics:
- name: MAE
type: mean-absolute-error
value: 8.77
verified: false
- name: RMSE
type: root-mean-square-error
value: 13.35
verified: false
- name: R2
type: r2
value: 0.62
verified: false
# -------------------------
# Silent Speech Synthesis
# -------------------------
- task:
type: speech-synthesis
dataset:
type: gaddy_silent_speech
name: Gaddy Silent Speech (MFCC to Audio)
metrics:
- name: WER
type: word-error-rate
value: 0.3354
verified: false
# -------------------------
# Silent Speech Recognition
# -------------------------
- task:
type: speech-recognition
dataset:
type: gaddy_silent_speech
name: Gaddy Silent Speech (EMG to Text)
metrics:
- name: WER
type: word-error-rate
value: 0.3395
verified: false
---
<div align="center">
<img src="https://raw.githubusercontent.com/MatteoFasulo/BioFoundation/refs/heads/TinyMyo/docs/model/logo/TinyMyo_logo.png" alt="TinyMyo Logo" width="400" />
<h1>TinyMyo: a Tiny Foundation Model for Flexible EMG Signal Processing at the Edge</h1>
</div>
<p align="center">
<a href="https://github.com/pulp-bio/BioFoundation">
<img src="https://img.shields.io/github/stars/pulp-bio/BioFoundation?color=ccf" alt="GitHub">
</a>
<a href="https://creativecommons.org/licenses/by-nd/4.0/">
<img src="https://img.shields.io/badge/License-CC_BY--ND_4.0-lightgrey.svg" alt="License">
</a>
<a href="https://arxiv.org/abs/2512.15729">
<img src="https://img.shields.io/badge/arXiv-2512.15729-b31b1b.svg" alt="Paper">
</a>
</p>
**TinyMyo** is a **3.6M-parameter** Transformer-based **foundation model for surface EMG (sEMG)**.
It is pretrained on >480 GB of EMG data and optimized for **ultra-low-power, real-time deployment**, including **microcontrollers (GAP9)**, where it achieves an inference time of **0.785 s**, an energy of **44.91 mJ**, and a power envelope of **57.18 mW**.
TinyMyo is built for **broad generalization** across datasets, sensor configurations, movement tasks, subjects, and domains (gesture, kinematics, speech).
---
# 🔒 License & Usage (Model Weights)
The released TinyMyo weights are licensed under **CC BY-ND 4.0**.
This summary is not legal advice; please read the full license.
### ✅ You may
* **Use** and **redistribute** the **unmodified** TinyMyo weights (including commercially) **with attribution**.
* **Fine-tune/modify internally** for research or production without redistributing modified weights.
* **Publish code, configs, evaluations, and papers** using TinyMyo.
### 🚫 You may not
* **Share or host modified weights** in any form (including LoRA/adapter deltas, pruned/quantized models).
* **Claim endorsement** from the TinyMyo authors without permission.
* **Use the TinyMyo name** for derivative models.
### 🤝 Contributing Improvements
To upstream improvements, submit a **PR** to the
**[BioFoundation repository](https://github.com/pulp-bio/BioFoundation)** with:
1. Full reproducibility artifacts (configs, logs, seeds, environment).
2. Evaluation on standard protocols (e.g., DB5, EPN-612, UCI EMG, DB8, Silent Speech).
3. Comparison to TinyMyo's reported metrics.
Approved PRs will be retrained and released as **official TinyMyo** checkpoints under CC BY-ND.
---
# 🔎 1. Default Input & Preprocessing
Unless specified otherwise, TinyMyo expects:
* **Channels:** 16
* **Sampling rate:** 2000 Hz
* **Segment length:** 1000 samples (0.5 s)
* **Windowing:** 50% overlap (pretraining)
* **Preprocessing:**
* 4th-order **20–450 Hz bandpass**
* **50 Hz notch filter**
* **Min–max normalization** (pretraining)
* **Z-score normalization** (downstream)
Datasets with <16 channels are **zero-padded (pretraining only)**.
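As a rough sketch of this pipeline (filter parameters follow the bullets above; the notch quality factor and the use of zero-phase filtering are assumptions, not taken from the card):

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

FS = 2000   # sampling rate (Hz)
SEG = 1000  # segment length in samples (0.5 s)

def preprocess(emg, fs=FS):
    """emg: (channels, samples) raw sEMG array."""
    # 4th-order 20-450 Hz Butterworth bandpass
    b, a = butter(4, [20, 450], btype="bandpass", fs=fs)
    x = filtfilt(b, a, emg, axis=-1)
    # 50 Hz notch; the quality factor Q is an assumption (not in the card)
    bn, an = iirnotch(50, Q=30, fs=fs)
    x = filtfilt(bn, an, x, axis=-1)
    # per-channel z-score (downstream normalization)
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + 1e-8)

def window(x, length=SEG, overlap=0.5):
    """50%-overlap windowing, as used during pretraining."""
    step = int(length * (1 - overlap))
    starts = range(0, x.shape[-1] - length + 1, step)
    return np.stack([x[:, s:s + length] for s in starts])

emg = np.random.randn(16, 4000)        # 2 s of 16-channel sEMG
wins = window(preprocess(emg))
print(wins.shape)                      # (7, 16, 1000)
```

For datasets recorded at other sampling rates or channel counts, resampling and (for pretraining) zero-padding to 16 channels would precede this step.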
---
# 🔬 2. Pretraining Overview
TinyMyo is pretrained via masked reconstruction on **three large-scale EMG datasets**:
| Dataset | Subjects | fs | Channels | Size |
| ----------- | -------- | ------- | -------- | ------- |
| Ninapro DB6 | 10 | 2000 Hz | 14 | 20.3 GB |
| Ninapro DB7 | 22 | 2000 Hz | 12 | 30.9 GB |
| EMG2Pose | 192 | 2000 Hz | 16 | 431 GB |
## Tokenization: Channel-Independent Patches
Unlike EEG FMs that mix channels early, TinyMyo uses **per-channel patching**:
* Patch length: **20 samples**
* Patch stride: **20 samples**
* Tokens/channel: **50**
* Total seq length: **800 tokens** (16 × 50)
* Positional encoding: **RoPE (rotary)**
This preserves electrode-specific structure while allowing attention to learn cross-channel relationships.
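Because patch length equals stride, the tokenization above amounts to a pair of reshapes; a minimal sketch:

```python
import numpy as np

C, T, P = 16, 1000, 20        # channels, samples per segment, patch length
seg = np.random.randn(C, T)   # one preprocessed 0.5 s segment

# non-overlapping per-channel patches (stride == patch length)
patches = seg.reshape(C, T // P, P)        # (16, 50, 20)
tokens = patches.reshape(C * (T // P), P)  # (800, 20), each later embedded to dim 192
print(tokens.shape)                        # (800, 20)
```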
## Transformer Encoder
* **8 layers**, **3 heads**
* Embedding dim: **192**
* Pre-LayerNorm
* Dropout & drop-path: **0.1**
## Lightweight Decoder
A **single linear layer** (~3.9k params) reconstructs masked patches.
Following SimMIM, this forces the encoder to learn robust latent structure.
## Masking Objective
* **50% random masking** with a learnable `[MASK]` token
* Loss: **Smooth L1** with small penalty on visible patches
$$
\mathcal{L} = \mathcal{L}_{\text{masked}} + 0.1\,\mathcal{L}_{\text{visible}}
$$
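A minimal NumPy sketch of this objective (the Smooth L1 beta and the reduction over patch samples are assumptions):

```python
import numpy as np

def smooth_l1(pred, tgt, beta=1.0):
    """Elementwise Smooth L1 (Huber-style): quadratic below beta, linear above."""
    d = np.abs(pred - tgt)
    return np.where(d < beta, 0.5 * d * d / beta, d - 0.5 * beta)

def pretrain_loss(pred, tgt, mask):
    """pred, tgt: (tokens, patch_len); mask: True where the patch was masked (50%)."""
    per_tok = smooth_l1(pred, tgt).mean(axis=-1)   # one value per token
    l_masked = per_tok[mask].mean()
    l_visible = per_tok[~mask].mean()
    return l_masked + 0.1 * l_visible              # small penalty on visible patches

x = np.random.randn(800, 20)
mask = np.zeros(800, dtype=bool)
mask[:400] = True                                  # 50% random masking (toy layout)
print(pretrain_loss(x, x, mask))                   # 0.0 for a perfect reconstruction
```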
## Training Setup
* Optimizer: **AdamW** (β = (0.9, 0.98), wd = 0.01)
* LR: **1e-4** with cosine decay
* Batch size: **512** (with grad accumulation)
* Epochs: **50**, warm-up: 10
* Hardware: **4× NVIDIA GH200 GPUs**
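The schedule above can be sketched per epoch as follows (linear warm-up is an assumption; the card only states the warm-up length and cosine decay):

```python
import math

def lr_schedule(epoch, base_lr=1e-4, warmup=10, total=50):
    """Linear warm-up for `warmup` epochs, then cosine decay to ~0 at `total`."""
    if epoch < warmup:
        return base_lr * (epoch + 1) / warmup
    t = (epoch - warmup) / (total - warmup)        # progress through decay phase
    return 0.5 * base_lr * (1 + math.cos(math.pi * t))

# ramp-up start, peak at end of warm-up, near-zero at the final epoch
print(lr_schedule(0), lr_schedule(10), lr_schedule(49))
```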
---
# 🧠 3. Architecture Summary
### Model Variant
| Variant | Params | (Layers, Dim) |
| ------- | -------- | ------------- |
| TinyMyo | **3.6M** | (8, 192) |
---
# 🎯 4. Downstream Tasks
TinyMyo generalizes across **gesture classification**, **kinematic regression**, and **speech EMG**, with state-of-the-art or competitive results.
---
## 4.1 Hand Gesture Classification
Evaluated on:
* **Ninapro DB5** (52 classes, 10 subjects)
* **EPN-612** (5 classes, 612 subjects)
* **UCI EMG** (6 classes, 36 subjects)
* **Meta Neuromotor Interface** (9 gestures)
### Preprocessing
* EMG filtering: **20–90 Hz bandpass + 50 Hz notch**
* Window sizes:
* **200 ms** (best for DB5)
* **1000 ms** (best for EPN, UCI)
### Linear Classification Head
* Input: **C × 192**
* Params: **<40k**
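A sketch of such a head under one plausible pooling choice (mean over channels and tokens is an assumption; the card only gives the input shape and parameter budget):

```python
import numpy as np

EMB, N_CLASSES = 192, 52   # embedding dim; DB5 has 52 gesture classes

rng = np.random.default_rng(0)
W = rng.standard_normal((EMB, N_CLASSES)) * 0.01   # single linear layer
b = np.zeros(N_CLASSES)

def classify(tokens):
    """tokens: (C, tokens_per_channel, EMB) encoder output.
    Pool over channels and tokens, then project to class logits."""
    feat = tokens.mean(axis=(0, 1))   # (EMB,)
    return feat @ W + b               # (N_CLASSES,) logits

tokens = rng.standard_normal((16, 10, EMB))  # toy encoder output
logits = classify(tokens)

n_params = W.size + b.size
print(n_params)   # 10036, comfortably under the <40k budget
```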
### Performance (Fine-tuned)
| Dataset | Metric | Result |
| ------------------------ | ------ | ----------------- |
| **Ninapro DB5** (200 ms) | Acc | **89.41 ± 0.16%** |
| **EPN-612** (1000 ms) | Acc | **96.74 ± 0.09%** |
| **UCI EMG** (1000 ms) | Acc | **97.56 ± 0.32%** |
| **Neuromotor** | CLER | **0.153 ± 0.006** |
TinyMyo achieves **new state-of-the-art** on DB5, EPN-612, and UCI.
---
## 4.2 Hand Kinematic Regression (Ninapro DB8)
* Predict **5 joint angles**
* Windows: **200 ms** or **1000 ms**
* Normalization: z-score only
### Regression Head (~788k params)
* Depthwise + pointwise convs
* Upsampling
* Global average pooling
* Linear projection to 5 outputs
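The card does not give the head's channel widths, but the depthwise + pointwise factorization it names is what keeps the head compact; a parameter-count comparison for one hypothetical 1-D conv layer (192 channels, kernel size 3, both values assumed for illustration):

```python
# parameters of one depthwise-separable 1-D conv vs a dense conv
C_in, C_out, K = 192, 192, 3       # hypothetical widths, not from the card
depthwise = C_in * K               # one K-tap filter per input channel
pointwise = C_in * C_out           # 1x1 channel mixing
separable = depthwise + pointwise  # 37440
dense = C_in * C_out * K           # 110592, ~3x more parameters
print(separable, dense)
```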
### Performance
* **MAE = 8.77 ± 0.12°** (1000 ms)
Note: Prior works reporting ~6.9° MAE are **subject-specific**; TinyMyo trains a **single cross-subject model**, a significantly harder setting.
---
## 4.3 Speech Production & Recognition (Silent Speech)
Dataset: **Gaddy Silent Speech**
(8 channels, 1000 Hz, face/neck EMG)
### Speech Production (EMG → MFCC → HiFi-GAN → Audio)
Pipeline:
1. Residual downsampling
2. TinyMyo encoder
3. Linear projection → **26-dim MFCC**
4. HiFi-GAN vocoder
**WER:** **33.54 ± 1.12%**
≈ state-of-the-art, with **>90% fewer params** in the transduction model.
### Speech Recognition (EMG → Text)
* TinyMyo encoder
* Linear projection → **37 characters**
* **CTC** loss
* 4-gram LM + beam search
**WER:** **33.95 ± 0.97%**
TinyMyo is EMG-only, unlike multimodal systems such as MONA-LISA.
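For reference, the simplest decoding step of a CTC pipeline collapses repeated predictions and drops blanks (in the actual setup, beam search with the 4-gram LM replaces this greedy pass):

```python
import numpy as np

def ctc_greedy_decode(logits, blank=0):
    """logits: (time, vocab). Collapse repeats, then remove blank tokens."""
    ids = logits.argmax(axis=-1)
    out, prev = [], blank
    for i in ids:
        if i != blank and i != prev:
            out.append(int(i))
        prev = i
    return out

# toy example: frame-level argmax sequence [1, 1, 0, 1, 2, 2]
logits = np.eye(3)[[1, 1, 0, 1, 2, 2]]
print(ctc_greedy_decode(logits))  # [1, 1, 2]
```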
---
# ⚡ 5. Edge Deployment (GAP9 MCU)
TinyMyo runs efficiently on **GAP9 (RISC-V)** via:
* **INT8 quantization**, including attention
* Multi-level streaming (L3 to L2 to L1)
* Integer LayerNorm, GELU, softmax
* Static memory arena via liveness analysis
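A generic symmetric INT8 quantizer illustrates the first bullet (a sketch of the general technique, not the GAP9 deployment toolchain):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization: x ~= q * scale."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

x = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(x)
x_hat = q.astype(np.float32) * s   # dequantized values
# round-to-nearest bounds the error by half a quantization step
print(float(np.abs(x - x_hat).max()), 0.5 * s)
```

In the deployed model, activations and weights throughout the network, including the attention blocks, are held in INT8, with LayerNorm, GELU, and softmax computed in integer arithmetic.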
### Runtime (DB5 pipeline)
* **Inference time**: **0.785 s**
* **Energy**: **44.91 mJ**
* **Average power**: **57.18 mW**
This is the **first EMG foundation model demonstrated on a microcontroller**.
---
# 📊 6. Results Summary
### Pretraining
* Smooth L1 reconstruction with high fidelity
* Total compute ≈ **4.0 GFLOPs**
### Downstream Highlights
* **DB5:** 89.41%
* **EPN-612:** 96.74%
* **UCI EMG:** 97.56%
* **Neuromotor:** 0.153 CLER
* **DB8 Regression:** MAE 8.77°
* **Silent Speech Production:** 33.54% WER
* **Silent Speech Recognition:** 33.95% WER
TinyMyo matches or exceeds state-of-the-art performance, while being smaller and more efficient than all prior EMG foundation models.
---
# πŸ› οΈ Code & Usage
To fine-tune TinyMyo on downstream tasks, follow the examples in the
**[BioFoundation repository](https://github.com/pulp-bio/BioFoundation)**.
```bash
python -u run_train.py +experiment=TinyMyo_finetune \
pretrained_safetensors_path=/path/to/model.safetensors
```
Environment variables:
* `DATA_PATH` → dataset path
* `CHECKPOINT_DIR` → checkpoint to load
---
## 🔗 Resources
- **Code:** https://github.com/pulp-bio/BioFoundation
---
# 📜 Citation
Please cite TinyMyo using:
```bibtex
@misc{fasulo2025tinymyotinyfoundationmodel,
title={TinyMyo: a Tiny Foundation Model for Flexible EMG Signal Processing at the Edge},
author={Matteo Fasulo and Giusy Spacone and Thorir Mar Ingolfsson and Yawei Li and Luca Benini and Andrea Cossettini},
year={2025},
eprint={2512.15729},
archivePrefix={arXiv},
primaryClass={eess.SP},
url={https://arxiv.org/abs/2512.15729},
}
```
---
# 🧭 Contact & Support
* Questions or issues?
Open an issue on the **BioFoundation GitHub repository**.