---
license: bsd-3-clause
tags:
- meg
- brain-signals
- speech-detection
- conformer
- libribrain
datasets:
- pnpl/LibriBrain
metrics:
- f1
library_name: pytorch
model-index:
- name: megconformer-speech-detection
  results:
  - task:
      type: audio-classification
      name: Speech classification
    dataset:
      name: LibriBrain 2025 PNPL (Standard track, speech task)
      type: pnpl/LibriBrain
      split: holdout
    metrics:
    - name: F1-macro
      type: f1
      value: 0.8890 # 88.90 %
      args:
        average: macro
---
# MEGConformer for Speech Detection
Conformer-based MEG decoder for binary speech detection, trained with 10 different random seeds for reproducibility.
## Model Performance
| Seed | Val F1-Macro | Checkpoint |
|------|--------------|------------|
| 0 (best) | **87.06%** | `seed-0/pytorch_model.ckpt` |
| 6 | 86.80% | `seed-6/pytorch_model.ckpt` |
| 4 | 86.62% | `seed-4/pytorch_model.ckpt` |
| 1 | 86.54% | `seed-1/pytorch_model.ckpt` |
| 2 | 86.37% | `seed-2/pytorch_model.ckpt` |
| 5 | 86.29% | `seed-5/pytorch_model.ckpt` |
| 7 | 86.18% | `seed-7/pytorch_model.ckpt` |
| 3 | 86.13% | `seed-3/pytorch_model.ckpt` |
| 8 | 85.92% | `seed-8/pytorch_model.ckpt` |
| 9 | 85.18% | `seed-9/pytorch_model.ckpt` |
- **Holdout F1-macro of seed 0:** 88.90%
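As a quick summary of seed variance, the validation scores in the table can be aggregated with the standard library (values copied from the table above):

```python
from statistics import mean, stdev

# Validation F1-macro per seed, copied from the table above
val_f1 = {
    0: 87.06, 1: 86.54, 2: 86.37, 3: 86.13, 4: 86.62,
    5: 86.29, 6: 86.80, 7: 86.18, 8: 85.92, 9: 85.18,
}

avg = mean(val_f1.values())
spread = stdev(val_f1.values())
print(f"Val F1-macro across seeds: {avg:.2f} ± {spread:.2f}")
# → Val F1-macro across seeds: 86.31 ± 0.52
```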
## Quick Start
### Load Best Model
```python
import torch
from huggingface_hub import hf_hub_download
from libribrain_experiments.models.configurable_modules.classification_module import (
    ClassificationModule,
)

# Download a checkpoint (seed-0)
checkpoint_path = hf_hub_download(
    repo_id="zuazo/megconformer-speech-detection",
    filename="seed-0/pytorch_model.ckpt",
)

# Choose device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load model and move to device
model = ClassificationModule.load_from_checkpoint(checkpoint_path, map_location=device)
model.eval()

# Inference on a dummy window: 306 channels x 625 samples (2.5 s at 250 Hz)
meg_signal = torch.randn(1, 306, 625, device=device)  # Create directly on device
with torch.no_grad():
    logits = model(meg_signal)
prediction = torch.argmax(logits, dim=1)  # 0 = silence, 1 = speech
print(f"Prediction: {'Speech' if prediction.item() == 1 else 'Silence'}")
```
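For a continuous recording rather than a single window, the signal can first be split into model-sized windows and batched; a minimal sketch using `Tensor.unfold` (the 60 s dummy recording and the non-overlapping hop are assumptions for illustration):

```python
import torch

# Hypothetical continuous recording: 306 channels, 60 s at 250 Hz
recording = torch.randn(306, 60 * 250)

window, hop = 625, 625  # 2.5 s windows; non-overlapping hop is an assumption
windows = recording.unfold(dimension=1, size=window, step=hop)  # (306, n_windows, 625)
batch = windows.permute(1, 0, 2).contiguous()  # (n_windows, 306, 625), model-ready
print(batch.shape)  # torch.Size([24, 306, 625])
```

Each row of `batch` can then be passed to the model as in the snippet above.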
## Model Details
- **Architecture**: Conformer Small
  - Hidden size: 144
  - FFN dim: 576
  - Layers: 16
  - Attention heads: 4
  - Depthwise conv kernel: 31
- **Input**: 306-channel MEG signals
- **Window size**: 2.5 seconds (625 samples at 250 Hz)
- **Output**: Binary classification (silence/speech)
- **Training**: [LibriBrain](https://huggingface.co/datasets/pnpl/LibriBrain) 2025 Standard track
## Reproducibility
Checkpoints for all 10 random seeds are provided so that the per-seed results above can be reproduced and seed variance assessed.
## Citation
```bibtex
@misc{dezuazo2025megconformerconformerbasedmegdecoder,
title={MEGConformer: Conformer-Based MEG Decoder for Robust Speech and Phoneme Classification},
author={Xabier de Zuazo and Ibon Saratxaga and Eva Navas},
year={2025},
eprint={2512.01443},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2512.01443},
}
```
## License
This model is released under the 3-Clause BSD License.
## Links
- **Paper**: [arXiv:2512.01443](https://arxiv.org/abs/2512.01443)
- **Code**: [GitHub](https://github.com/neural2speech/libribrain-experiments)
- **Competition**: [LibriBrain 2025](https://neural-processing-lab.github.io/2025-libribrain-competition/) |