---
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb-edu
language:
- en
base_model:
- Qwen/Qwen3-1.7B
tags:
- linear-attention
- hybrid
- rnn
- distillation
---
Links:
- GitHub repo: https://github.com/thunlp/hybrid-linear-attention
- Paper: https://arxiv.org/abs/2601.22156
This is the final HypeNet-2B checkpoint from the paper *Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts*. It was distilled from Qwen3-1.7B using the HALO pipeline proposed in the paper. For more information, please refer to our GitHub repo.
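
As a quick-start sketch (not taken from the paper or repo), the snippet below shows how a checkpoint like this can typically be loaded with Hugging Face Transformers. The repo id `thunlp/HypeNet-2B` is a hypothetical placeholder for this repository's actual id, and `trust_remote_code=True` is assumed to be required for the custom hybrid-attention architecture.

```python
# Minimal loading sketch, assuming this checkpoint exposes the standard
# Hugging Face causal-LM interface. The repo id below is a hypothetical
# placeholder; replace it with this repository's actual id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "thunlp/HypeNet-2B"  # hypothetical placeholder

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,   # assumed dtype; adjust to your hardware
    trust_remote_code=True,       # assumed necessary for the hybrid layers
)

prompt = "Linear attention scales to long contexts because"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```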