---
license: bsd-3-clause
tags:
- meg
- brain-signals
- speech-detection
- conformer
- libribrain
datasets:
- pnpl/LibriBrain
metrics:
- f1
library_name: pytorch

model-index:
- name: megconformer-speech-detection
  results:
  - task:
      type: audio-classification
      name: Speech classification
    dataset:
      name: LibriBrain 2025 PNPL (Standard track, speech task)
      type: pnpl/LibriBrain
      split: holdout
    metrics:
    - name: F1-macro
      type: f1
      value: 0.8890
      args:
        average: macro
---

# MEGConformer for Speech Detection

Conformer-based MEG decoder for binary speech detection, trained with 10 different random seeds for reproducibility.

## Model Performance

| Seed | Val F1-Macro | Checkpoint |
|------|--------------|------------|
| 0 (best) | **87.06%** | `seed-0/pytorch_model.ckpt` |
| 6 | 86.80% | `seed-6/pytorch_model.ckpt` |
| 4 | 86.62% | `seed-4/pytorch_model.ckpt` |
| 1 | 86.54% | `seed-1/pytorch_model.ckpt` |
| 2 | 86.37% | `seed-2/pytorch_model.ckpt` |
| 5 | 86.29% | `seed-5/pytorch_model.ckpt` |
| 7 | 86.18% | `seed-7/pytorch_model.ckpt` |
| 3 | 86.13% | `seed-3/pytorch_model.ckpt` |
| 8 | 85.92% | `seed-8/pytorch_model.ckpt` |
| 9 | 85.18% | `seed-9/pytorch_model.ckpt` |

- **Holdout F1-macro of seed 0:** 88.90%
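
To put the seed-to-seed spread into a few numbers, the validation scores from the table above can be summarized with the standard library. This is only a sketch over the ten values listed here; the variable names are illustrative.

```python
from statistics import mean, stdev

# Validation F1-macro (%) per seed, copied from the table above.
val_f1_macro = {
    0: 87.06, 1: 86.54, 2: 86.37, 3: 86.13, 4: 86.62,
    5: 86.29, 6: 86.80, 7: 86.18, 8: 85.92, 9: 85.18,
}

scores = list(val_f1_macro.values())
best_seed = max(val_f1_macro, key=val_f1_macro.get)

print(f"mean = {mean(scores):.2f}%")
print(f"sample std = {stdev(scores):.2f}%")
print(f"range = {min(scores):.2f}% to {max(scores):.2f}%")
print(f"best seed = {best_seed}")
```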

## Quick Start

### Load Best Model

```python
import torch
from huggingface_hub import hf_hub_download

from libribrain_experiments.models.configurable_modules.classification_module import (
    ClassificationModule,
)

# Download a checkpoint (seed-0)
checkpoint_path = hf_hub_download(
    repo_id="zuazo/megconformer-speech-detection", filename="seed-0/pytorch_model.ckpt"
)

# Choose device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load model and move to device
model = ClassificationModule.load_from_checkpoint(checkpoint_path, map_location=device)
model.eval()

# Inference on one random window: (batch, channels, samples).
# 625 samples = 2.5 s at 250 Hz, matching the window size under Model Details.
meg_signal = torch.randn(1, 306, 625, device=device)  # Create directly on device

with torch.no_grad():
    logits = model(meg_signal)
    prediction = torch.argmax(logits, dim=1)  # 0=silence, 1=speech

print(f"Prediction: {'Speech' if prediction.item() == 1 else 'Silence'}")
```
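
If a confidence score is more useful than a hard label, the two logits can be turned into probabilities with a softmax. The sketch below assumes the model returns one pair of logits per window in the order (silence, speech), as the `argmax` comment above suggests. The helper name `speech_probability` and the example values are made up for illustration, and the code uses plain Python floats so it runs without a checkpoint.

```python
import math


def speech_probability(logits):
    """Return P(speech) from a (silence, speech) logit pair via a stable softmax."""
    silence_logit, speech_logit = logits
    peak = max(silence_logit, speech_logit)  # subtract the max to avoid overflow
    exp_silence = math.exp(silence_logit - peak)
    exp_speech = math.exp(speech_logit - peak)
    return exp_speech / (exp_silence + exp_speech)


# Hypothetical logits for one window, e.g. obtained from logits[0].tolist()
example_logits = (-0.4, 1.3)
p_speech = speech_probability(example_logits)
label = "Speech" if p_speech >= 0.5 else "Silence"
print(f"P(speech) = {p_speech:.3f} -> {label}")
```

With PyTorch tensors, `torch.softmax(logits, dim=1)[:, 1]` computes the same quantity for a whole batch.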

## Model Details

- **Architecture**: Conformer Small
  - Hidden size: 144
  - FFN dim: 576
  - Layers: 16
  - Attention heads: 4
  - Depthwise conv kernel: 31
- **Input**: 306-channel MEG signals
- **Window size**: 2.5 seconds (625 samples at 250 Hz)
- **Output**: Binary classification (silence/speech)
- **Training**: [LibriBrain](https://huggingface.co/datasets/pnpl/LibriBrain) 2025 Standard track
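
The window length follows from the sampling rate: 2.5 s × 250 Hz = 625 samples, so a single input has shape `(batch, 306, 625)`. For a longer recording, one option is to cut it into fixed-length windows. The sketch below only computes the window boundaries; the hop size of 125 samples is an arbitrary choice for illustration, not a setting taken from the training code, and `window_bounds` is a hypothetical helper.

```python
SAMPLING_RATE_HZ = 250
WINDOW_SECONDS = 2.5
N_CHANNELS = 306

WINDOW_SAMPLES = int(WINDOW_SECONDS * SAMPLING_RATE_HZ)  # 625


def window_bounds(n_samples, window=WINDOW_SAMPLES, hop=125):
    """Return (start, stop) sample indices of every full window in a recording."""
    if n_samples < window:
        return []
    return [(start, start + window) for start in range(0, n_samples - window + 1, hop)]


# A 10-second recording at 250 Hz has 2500 samples.
bounds = window_bounds(10 * SAMPLING_RATE_HZ)
print(len(bounds), bounds[0], bounds[-1])
```

Each pair can slice a `(306, n_samples)` tensor as `recording[:, start:stop]` before the windows are stacked into a batch.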

## Reproducibility

All 10 random seeds are provided to ensure reproducibility.
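
Since the checkpoints follow the `seed-N/pytorch_model.ckpt` layout listed in the performance table, all ten can be fetched in a loop. A minimal sketch, reusing the repository id from the Quick Start; the helper names are illustrative, and `download_all_seeds` needs `huggingface_hub` installed and network access.

```python
REPO_ID = "zuazo/megconformer-speech-detection"
N_SEEDS = 10


def checkpoint_filename(seed):
    """Path of one seed's checkpoint inside the repository."""
    return f"seed-{seed}/pytorch_model.ckpt"


def download_all_seeds():
    """Download every seed checkpoint and return {seed: local_path}."""
    from huggingface_hub import hf_hub_download  # imported lazily: only needed here

    return {
        seed: hf_hub_download(repo_id=REPO_ID, filename=checkpoint_filename(seed))
        for seed in range(N_SEEDS)
    }


filenames = [checkpoint_filename(seed) for seed in range(N_SEEDS)]
print(filenames[0], "...", filenames[-1])
```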

## Citation

```bibtex
@misc{dezuazo2025megconformerconformerbasedmegdecoder,
      title={MEGConformer: Conformer-Based MEG Decoder for Robust Speech and Phoneme Classification},
      author={Xabier de Zuazo and Ibon Saratxaga and Eva Navas},
      year={2025},
      eprint={2512.01443},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.01443},
}
```

## License

The 3-Clause BSD License

## Links

- **Paper**: [arXiv:2512.01443](https://arxiv.org/abs/2512.01443)
- **Code**: [GitHub](https://github.com/neural2speech/libribrain-experiments)
- **Competition**: [LibriBrain 2025](https://neural-processing-lab.github.io/2025-libribrain-competition/)