Maxtimer97/GLM2NSA

Text Generation

GLM2NSA / topk_sparse_attention.py

Commit History

Changed to autotune Triton for 48 GB GPU deployment

4ee9d9e

Maxtimer97 committed on Oct 3

Flattened repo

a2f57c7

Maxtimer97 committed on Sep 29