---
language:
- en
- zh
license: apache-2.0
pipeline_tag: text-generation
library_name: transformers
---
# BlockFFN-Small
This is the original 0.1B BlockFFN checkpoint used for the acceleration tests in the paper *BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity*.

You can load and use this model with the standard `AutoTokenizer` and `AutoModelForCausalLM` classes.

Links: [[Paper](https://arxiv.org/pdf/2507.08771)] [[Code](https://github.com/thunlp/BlockFFN)]
### Citation
If you find our work useful for your research, please cite our paper as follows:
```
@article{song2025blockffn,
  title={{BlockFFN}: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity},
  author={Chenyang Song and Weilin Zhao and Xu Han and Chaojun Xiao and Yingfa Chen and Yuxuan Li and Zhiyuan Liu and Maosong Sun},
  journal={arXiv preprint arXiv:2507.08771},
  year={2025},
  url={https://arxiv.org/pdf/2507.08771},
}
```