Add pipeline tag and improve model card

#1
by nielsr (HF Staff) - opened

Files changed (1): README.md (+34 -8)
@@ -1,5 +1,6 @@
 ---
 license: mit
+pipeline_tag: other
 tags:
 - neuroscience
 - brain-to-text
@@ -9,24 +10,49 @@ tags:
 - brain foundation models
 ---
 
-# MEG-XL
+# MEG-XL: Data-Efficient Brain-to-Text via Long-Context Pre-Training
 
-Model weights for the paper "MEG-XL: Data-Efficient Brain-to-Text via Long-Context Pre-Training".
+MEG-XL is a brain-to-text foundation model pre-trained with 2.5 minutes of MEG context per sample (equivalent to 191k tokens). It is designed to capture extended neural context, enabling high data efficiency for decoding words from brain activity.
 
-Use the checkpoint file alongside the code. Instructions for usage are in the GitHub repository.
+- **Paper:** [MEG-XL: Data-Efficient Brain-to-Text via Long-Context Pre-Training](https://huggingface.co/papers/2602.02494)
+- **Repository:** [GitHub - neural-processing-lab/MEG-XL](https://github.com/neural-processing-lab/MEG-XL)
+- **Weights/Checkpoint:** [meg-xl-med.ckpt](https://huggingface.co/pnpl/MEG-XL/blob/main/meg-xl-med.ckpt)
 
-Weights/Checkpoint: [[HuggingFace]](https://huggingface.co/pnpl/MEG-XL/blob/main/meg-xl-med.ckpt)
+## Usage
 
-Code: [[GitHub Repository]](https://github.com/neural-processing-lab/MEG-XL)
+Instructions for environment setup, tokenizer (BioCodec) requirements, and data preparation are available in the [official GitHub repository](https://github.com/neural-processing-lab/MEG-XL).
 
-Paper: [[arXiv]](https://arxiv.org/abs/2602.02494)
+### Fine-tuning MEG-XL for Brain-to-Text
+You can fine-tune or evaluate the model on word decoding tasks using the following command structure:
 
-If you find this work helpful in your research, please cite:
+```bash
+python -m brainstorm.evaluate_criss_cross_word_classification \
+  --config-name=eval_criss_cross_word_classification_{armeni, gwilliams, libribrain} \
+  model.criss_cross_checkpoint=/path/to/your/checkpoint.ckpt
 ```
+
+### Linear Probing
+To perform linear probing, use:
+
+```bash
+python -m brainstorm.evaluate_criss_cross_word_classification \
+  --config-name=eval_criss_cross_word_classification_linear_probe_{armeni, gwilliams, libribrain} \
+  model.criss_cross_checkpoint=/path/to/your/checkpoint.ckpt
+```
+
+## Requirements
+- Python >= 3.12
+- High-VRAM GPU (>= 40-80GiB depending on the task).
+- Access to the [BioCodec](https://arxiv.org/abs/2510.09095) tokenizer code and weights.
+
+## Citation
+
+If you find this work helpful in your research, please cite:
+```bibtex
 @article{jayalath2026megxl,
   title={{MEG-XL}: Data-Efficient Brain-to-Text via Long-Context Pre-Training},
   author={Jayalath, Dulhan and Jones, Oiwi Parker},
   journal={arXiv preprint arXiv:2602.02494},
   year={2026}
 }
-```
+```
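As an aside on the checkpoint link above (not part of this PR's diff): the link points at the Hub's `blob` web view, while programmatic downloads use the Hub's generic `resolve` URL pattern. A minimal stdlib sketch, assuming only that standard URL convention:

```python
import urllib.request

# Repo and filename taken from the checkpoint link in the updated card.
REPO_ID = "pnpl/MEG-XL"
FILENAME = "meg-xl-med.ckpt"

# Hub convention: blob/<rev> is the web view, resolve/<rev> serves the raw file.
url = f"https://huggingface.co/{REPO_ID}/resolve/main/{FILENAME}"
print(url)  # https://huggingface.co/pnpl/MEG-XL/resolve/main/meg-xl-med.ckpt

# To actually fetch the file (requires network; the checkpoint is large):
# urllib.request.urlretrieve(url, FILENAME)
```

In practice, `huggingface_hub.hf_hub_download(repo_id=REPO_ID, filename=FILENAME)` does the same with local caching.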