How Do Decoder-Only LLMs Perceive Users? Rethinking Attention Masking for User Representation Learning Paper • 2602.10622 • Published 5 days ago • 26
yolay/Youtu-Agent-RL-Maths-Qwen2.5-7B Text Generation • 8B • Updated about 1 month ago • 9 • 3
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 Dec 1, 2025 • 298
HuggingFaceFW/ablation-model-fineweb-edu Text Generation • 2B • Updated Jun 11, 2024 • 546 • 19
aaron-di/YamshadowExperiment28-7B-TaskArithmetic Text Generation • 7B • Updated May 10, 2024 • 2
aaron-di/Yamshadowexperiment28M70.8-0.84-0.86-0.36-0.16-0.95-7B Text Generation • 7B • Updated Apr 8, 2024 • 1