---
license: bsd-3-clause
tags:
- meg
- brain-signals
- speech-detection
- conformer
- libribrain
datasets:
- pnpl/LibriBrain
metrics:
- f1
library_name: pytorch
model-index:
- name: megconformer-speech-detection
  results:
  - task:
      type: audio-classification
      name: Speech classification
    dataset:
      name: LibriBrain 2025 PNPL (Standard track, speech task)
      type: pnpl/LibriBrain
      split: holdout
    metrics:
    - name: F1-macro
      type: f1
      value: 0.8890 # 88.90 %
      args:
        average: macro
---
# MEGConformer for Speech Detection
Conformer-based MEG decoder for binary speech detection, trained with 10 different random seeds for reproducibility.
## Model Performance
| Seed | Val F1-Macro | Checkpoint |
|------|--------------|------------|
| 0 (best) | **87.06%** | `seed-0/pytorch_model.ckpt` |
| 6 | 86.80% | `seed-6/pytorch_model.ckpt` |
| 4 | 86.62% | `seed-4/pytorch_model.ckpt` |
| 1 | 86.54% | `seed-1/pytorch_model.ckpt` |
| 2 | 86.37% | `seed-2/pytorch_model.ckpt` |
| 5 | 86.29% | `seed-5/pytorch_model.ckpt` |
| 7 | 86.18% | `seed-7/pytorch_model.ckpt` |
| 3 | 86.13% | `seed-3/pytorch_model.ckpt` |
| 8 | 85.92% | `seed-8/pytorch_model.ckpt` |
| 9 | 85.18% | `seed-9/pytorch_model.ckpt` |
- **Holdout F1-macro of seed 0:** 88.90%
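As a quick summary of seed variance, the validation scores in the table can be aggregated with the standard library (values copied from the table above):

```python
from statistics import mean, stdev

# Validation F1-macro per seed, copied from the table above
val_f1 = {
    0: 87.06, 1: 86.54, 2: 86.37, 3: 86.13, 4: 86.62,
    5: 86.29, 6: 86.80, 7: 86.18, 8: 85.92, 9: 85.18,
}

avg = mean(val_f1.values())
spread = stdev(val_f1.values())
print(f"Val F1-macro across seeds: {avg:.2f} ± {spread:.2f}")
# → Val F1-macro across seeds: 86.31 ± 0.52
```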
## Quick Start
### Load Best Model
```python
import torch
from huggingface_hub import hf_hub_download
from libribrain_experiments.models.configurable_modules.classification_module import (
    ClassificationModule,
)

# Download a checkpoint (seed-0)
checkpoint_path = hf_hub_download(
    repo_id="zuazo/megconformer-speech-detection",
    filename="seed-0/pytorch_model.ckpt",
)

# Choose device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load model and move to device
model = ClassificationModule.load_from_checkpoint(checkpoint_path, map_location=device)
model.eval()

# Inference on a dummy window: 306 channels x 625 samples (2.5 s at 250 Hz)
meg_signal = torch.randn(1, 306, 625, device=device)  # Create directly on device
with torch.no_grad():
    logits = model(meg_signal)
prediction = torch.argmax(logits, dim=1)  # 0 = silence, 1 = speech
print(f"Prediction: {'Speech' if prediction.item() == 1 else 'Silence'}")
```
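For a continuous recording rather than a single window, the signal can first be split into model-sized windows and batched; a minimal sketch using `Tensor.unfold` (the 60 s dummy recording and the non-overlapping hop are assumptions for illustration):

```python
import torch

# Hypothetical continuous recording: 306 channels, 60 s at 250 Hz
recording = torch.randn(306, 60 * 250)

window, hop = 625, 625  # 2.5 s windows; non-overlapping hop is an assumption
windows = recording.unfold(dimension=1, size=window, step=hop)  # (306, n_windows, 625)
batch = windows.permute(1, 0, 2).contiguous()  # (n_windows, 306, 625), model-ready
print(batch.shape)  # torch.Size([24, 306, 625])
```

Each row of `batch` can then be passed to the model as in the snippet above.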
## Model Details
- **Architecture**: Conformer Small
  - Hidden size: 144
  - FFN dim: 576
  - Layers: 16
  - Attention heads: 4
  - Depthwise conv kernel: 31
- **Input**: 306-channel MEG signals
- **Window size**: 2.5 seconds (625 samples at 250 Hz)
- **Output**: Binary classification (silence/speech)
- **Training**: [LibriBrain](https://huggingface.co/datasets/pnpl/LibriBrain) 2025 Standard track
## Reproducibility
Checkpoints for all 10 random seeds are provided so that the per-seed results above can be reproduced and seed variance assessed.
## Citation
```bibtex
@misc{dezuazo2025megconformerconformerbasedmegdecoder,
title={MEGConformer: Conformer-Based MEG Decoder for Robust Speech and Phoneme Classification},
author={Xabier de Zuazo and Ibon Saratxaga and Eva Navas},
year={2025},
eprint={2512.01443},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2512.01443},
}
```
## License
This model is released under the 3-Clause BSD License.
## Links
- **Paper**: [arXiv:2512.01443](https://arxiv.org/abs/2512.01443)
- **Code**: [GitHub](https://github.com/neural2speech/libribrain-experiments)
- **Competition**: [LibriBrain 2025](https://neural-processing-lab.github.io/2025-libribrain-competition/) |