---
license: cc-by-nc-sa-4.0
library_name: braindecode
tags:
- braindecode
- EMG2QwertyNet
- emg
- ctc
- keystroke-decoding
- emg2qwerty
---
# emg2qwerty: generic baseline (TDS-Conv-CTC, 108-subject pretrained)
Pre-remapped braindecode-compatible copy of the upstream
[`generic.ckpt`](https://github.com/facebookresearch/emg2qwerty/blob/main/models/generic.ckpt)
from the emg2qwerty release (Sivakumar et al., NeurIPS 2024 D&B Track).
```python
from braindecode.models import EMG2QwertyNet
model = EMG2QwertyNet.from_pretrained("braindecode/emg2qwerty-generic")
```
## Source
Upstream repository: <https://github.com/facebookresearch/emg2qwerty>
Paper: Sivakumar V, Seely J, Du A, Bittner S, Berenzweig A,
Bolarinwa A, Gramfort A, Mandel M. *emg2qwerty: A Large Dataset
with Baselines for Touch Typing using Surface Electromyography*.
Advances in Neural Information Processing Systems (NeurIPS),
Datasets and Benchmarks Track, 2024.
## Modification from upstream
Upstream `models/generic.ckpt` is a PyTorch-Lightning checkpoint
of `emg2qwerty.lightning.TDSConvCTCModule`, whose inner
`nn.Sequential` exposes the classifier head as item 4
(`model.4.{weight,bias}`). braindecode's `EMG2QwertyNet`
exposes the same head as a named attribute
(`final_layer.{weight,bias}`).
The remap is a **two-key rename**, applied once and saved here:
| Upstream key | braindecode key |
|--------------------|-----------------------|
| `model.4.weight` | `final_layer.weight` |
| `model.4.bias` | `final_layer.bias` |
All 49 other keys (BatchNorm, MLP, TDS conv blocks) match
verbatim: both modules expose the backbone as
`self.model = nn.Sequential(...)`, so the keys already share the
`model.<index>.` prefix and need no rename. Weights are
otherwise unchanged from upstream.
Conversion is reproducible from
`neuralbench-repo/scripts/convert_emg2qwerty_checkpoint.py`.
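The two-key rename in the table above can be sketched in a few lines. This is only an illustration, not the conversion script itself: `RENAMES` and `remap_state_dict` are hypothetical names, and a toy dictionary stands in for the real state dict so the snippet runs without downloading the checkpoint.

```python
# Hypothetical helper illustrating the rename; not the upstream conversion script.
RENAMES = {
    "model.4.weight": "final_layer.weight",
    "model.4.bias": "final_layer.bias",
}


def remap_state_dict(state_dict):
    """Return a new dict with the two classifier-head keys renamed."""
    missing = [key for key in RENAMES if key not in state_dict]
    if missing:
        raise KeyError(f"expected upstream keys not found: {missing}")
    # Keys absent from RENAMES (the backbone) are copied through unchanged.
    return {RENAMES.get(key, key): value for key, value in state_dict.items()}


# Toy stand-in for the real state dict (values would be tensors in practice).
toy = {
    "model.0.batch_norm.running_mean": [0.5],
    "model.4.weight": [[1.0]],
    "model.4.bias": [0.0],
}
remapped = remap_state_dict(toy)
print(sorted(remapped))
```

Raising on a missing key, rather than silently skipping it, is a design choice of this sketch: it makes the conversion fail loudly if the upstream layout ever changes.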
### Verification
BatchNorm running statistics on the first layer match upstream:
| Stat | This checkpoint | Expected |
|-----------------------------------------|-----------------|----------|
| `model.0.batch_norm.running_mean.mean`  | 0.511           | ≈ 0.51   |
| `model.0.batch_norm.running_var.mean`   | 1.146           | ≈ 1.15   |
A forward pass on a random 1×32×8000 input (a 4 s window at
2 kHz) returns shape `(1, 373, 99)` after the TDS encoder and
CTC head.
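A check along these lines can be written as a small tolerance comparison. The sketch below is an assumption about how such a check might look, not the procedure used for this checkpoint: `stats_match` and the `abs_tol=0.01` threshold are hypothetical, and the numbers are copied from the table above rather than read from the weights.

```python
import math


def stats_match(observed, expected, abs_tol=0.01):
    """Compare each observed statistic to its expected value within abs_tol."""
    return {
        name: math.isclose(observed[name], expected[name], abs_tol=abs_tol)
        for name in expected
    }


# Values copied from the verification table; the tolerance is an assumption.
observed = {"running_mean.mean": 0.511, "running_var.mean": 1.146}
expected = {"running_mean.mean": 0.51, "running_var.mean": 1.15}
result = stats_match(observed, expected)

# Input duration implied by the stated window: 8000 samples at 2 kHz.
duration_s = 8000 / 2000
print(result, duration_s)
```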
## License
**CC BY-NC-SA 4.0** ([Attribution-NonCommercial-ShareAlike 4.0
International](https://creativecommons.org/licenses/by-nc-sa/4.0/)),
inherited from the upstream emg2qwerty release.
* **Attribution (BY)**: cite Sivakumar et al. (2024) and link
this repository when the weights are used or redistributed.
* **NonCommercial (NC)**: not licensed for commercial use.
* **ShareAlike (SA)**: derivatives must be released under the
same CC BY-NC-SA 4.0 license.
braindecode itself is BSD-3-Clause; that license applies to the
*code*, not to these weights. The weights are governed solely
by CC BY-NC-SA 4.0.
## Intended use
Drop-in pretrained backbone for the
[`emg/qwerty`](https://github.com/facebookresearch/neuroai/tree/main/neuralbench-repo/neuralbench/tasks/emg/qwerty)
CTC keystroke-decoding task in NeuralBench, or any other research
workflow consuming `braindecode.models.EMG2QwertyNet`.
Per the source paper (Table 4): zero-shot val/CER ≈ 16 % on a
held-out subject; further fine-tuning typically reduces CER to
≈ 10 % on a personalized split.
## Citation
```bibtex
@inproceedings{sivakumar2024emg2qwerty,
title = {emg2qwerty: A Large Dataset with Baselines for
Touch Typing using Surface Electromyography},
author = {Sivakumar, Viswanath and Seely, Jeffrey and Du,
Alan and Bittner, Sean and Berenzweig, Adam and
Bolarinwa, Anuoluwapo and Gramfort, Alexandre and
Mandel, Michael},
booktitle = {Advances in Neural Information Processing Systems
(NeurIPS), Datasets and Benchmarks Track},
year = {2024},
url = {https://github.com/facebookresearch/emg2qwerty},
}
```