Paper: On Pruning State-Space LLMs (arXiv:2502.18886)
Version: 1.0
Architecture: MOHAWK LMHead
The model has been benchmarked on several tasks:
| Task | Metric | Value | Stderr |
|---|---|---|---|
| ARC Challenge | acc | 0.4164 | ±0.0144 |
| ARC Easy | acc | 0.7492 | ±0.0089 |
| Hellaswag | acc | 0.4988 | ±0.0050 |
| Lambada (OpenAI) | acc | 0.5707 | ±0.0069 |
| Lambada (OpenAI) | perplexity | 7.0794 | ±0.1761 |
| PIQA | acc | 0.7661 | ±0.0099 |
| Winogrande | acc | 0.6283 | ±0.0136 |
Note:
- For accuracy metrics, higher values are better.
- For perplexity, lower values are better.
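To make the Stderr column easier to read, the sketch below turns each reported value into an approximate 95% interval. It assumes a normal approximation (a multiplier of 1.96), which the card does not state; the names `RESULTS`, `Z_95`, and `confidence_interval` are hypothetical helpers introduced only for this illustration, and the numbers are copied from the table above.

```python
# Values copied from the benchmark table: (task, metric, value, stderr).
RESULTS = [
    ("ARC Challenge", "acc", 0.4164, 0.0144),
    ("ARC Easy", "acc", 0.7492, 0.0089),
    ("Hellaswag", "acc", 0.4988, 0.0050),
    ("Lambada (OpenAI)", "acc", 0.5707, 0.0069),
    ("Lambada (OpenAI)", "perplexity", 7.0794, 0.1761),
    ("PIQA", "acc", 0.7661, 0.0099),
    ("Winogrande", "acc", 0.6283, 0.0136),
]

# Assumption: a two-sided 95% interval under a normal approximation.
Z_95 = 1.96


def confidence_interval(value, stderr, z=Z_95):
    """Return (low, high) for value ± z * stderr."""
    margin = z * stderr
    return value - margin, value + margin


if __name__ == "__main__":
    for task, metric, value, stderr in RESULTS:
        low, high = confidence_interval(value, stderr)
        print(f"{task:18s} {metric:10s} {value:.4f}  [{low:.4f}, {high:.4f}]")
```

When comparing against another model, overlapping intervals on a task suggest the difference may not be meaningful at this sample size, although this rough check is not a substitute for a proper significance test.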
If you use this model, please cite:
@misc{ghattas2025pruningstatespacellms,
title={On Pruning State-Space LLMs},
author={Tamer Ghattas and Michael Hassid and Roy Schwartz},
year={2025},
eprint={2502.18886},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.18886},
}
Model Card Last Updated: February 16, 2025
Base model: HuggingFaceTB/SmolLM2-1.7B