Sliding Window Attention Adaptation
- yuyijiong/Qwen3-SWA-adaptation • Text Generation • Updated Dec 17, 2025 • 5
- yuyijiong/fusang-v1-filtered • Viewer • Updated Jan 7 • 15.9k • 69
- Sliding Window Attention Adaptation • Paper • 2512.10411 • Published Dec 11, 2025 • 21
LLM Eval Dataset en
- cais/mmlu • Viewer • Updated Mar 8, 2024 • 231k • 498k • 723
- TIGER-Lab/MMLU-Pro • Benchmark • Updated 8 days ago • 12.1k • 151k • 470
- openai/gsm8k • Benchmark • Updated Mar 23 • 17.6k • 915k • 1.3k
- lukaemon/bbh • Viewer • Updated Jul 11, 2025 • 6.51k • 39.5k • 78
Chinese pretrain datasets
- opencsg/chinese-fineweb-edu • Viewer • Updated Dec 12, 2025 • 84.6M • 6.47k • 110
- opencsg/chinese-fineweb-edu-v2 • Viewer • Updated Dec 12, 2025 • 188M • 2.29k • 73
- opencsg/chinese-cosmopedia • Preview • Updated Jan 15, 2025 • 879 • 77
train_with_paraphrasing
[long-context models trained with "original text paraphrasing" dataset](https://github.com/yuyijiong/train_with_paraphrasing)
- yuyijiong/Qwen-14b-chat-yarn-32k • Text Generation • 14B • Updated Jun 7, 2024 • 31 • 21
- yuyijiong/Llama3-8B-Chinese-Chat-32k • Text Generation • 8B • Updated Jun 19, 2024 • 6 • 3
- yuyijiong/Qwen1.5-4b-chat-paraph • Text Generation • 4B • Updated Jun 7, 2024 • 5
- yuyijiong/Qwen2-7b-Instruct-paraph • Text Generation • 8B • Updated Jun 28, 2024 • 7
LLM eval dataset zh
- lmlmcat/cmmlu • Updated Jul 13, 2023 • 22.5k • 75
- ceval/ceval-exam • Viewer • Updated Jul 27, 2025 • 13.9k • 56.6k • 297
- meta-math/GSM8K_zh • Viewer • Updated Dec 4, 2023 • 8.79k • 876 • 30
- zai-org/humaneval-x • Updated Oct 25, 2022 • 3.29k • 95