Flux Attention - a QQTang1223 Collection

updated Apr 14

🚀 Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference

Paper • 2604.07394 • Published Apr 8 • 16

QQTang1223/full_streaming_Llama-3.1-8B-Instruct
Text Generation • 8B • Updated Apr 11 • 4

QQTang1223/full_xattn_Qwen3-8B
Text Generation • 8B • Updated Apr 11 • 13 • 1

QQTang1223/full_xattn_Qwen3-4B
Text Generation • 4B • Updated Apr 11 • 4

QQTang1223/full_xattn_Llama-3.1-8B-Instruct
Text Generation • 8B • Updated Apr 11 • 4

QQTang1223/full_triangle_Qwen3-8B
Text Generation • 8B • Updated Apr 11 • 4

QQTang1223/full_triangle_Qwen3-4B
Text Generation • 4B • Updated Apr 11 • 3

QQTang1223/full_triangle_Llama-3.1-8B-Instruct
Text Generation • 8B • Updated Apr 11 • 5

QQTang1223/full_streaming_Qwen3-8B
Text Generation • 8B • Updated Apr 11 • 6

QQTang1223/full_streaming_Qwen3-4B
Text Generation • 4B • Updated Apr 11 • 4

QQTang1223/qwen_mix_sft_64K6
Viewer • Updated Apr 11 • 49.3k • 25

QQTang1223/llama_mix_sft_64K6
Viewer • Updated Apr 11 • 49.3k • 25

QQTang1223/xattn_streaming_Qwen3-4B
Text Generation • 4B • Updated Apr 14 • 7