Commit History

Changed to correct architecture name for vllm
d97e95b
verified

Maxtimer97 committed on

Added num stages tuning
42f4907

Maxtimer97 committed on

Changed to autotune triton for 48G GPU deployment
4ee9d9e

Maxtimer97 committed on

Removed assertion
22ba83b

Maxtimer97 committed on

Removed wrong edge case assertion
bb26ab9

Maxtimer97 committed on

Corrected device assignment error
c8b397a

Maxtimer97 committed on

Corrected device assignment error
251c74f

Maxtimer97 committed on

Corrected device assignment error
3d4c21e

Maxtimer97 committed on

Safe cache check
bdf9fdc

Maxtimer97 committed on

Removed GenerationMixin
bf53a99
verified

Maxtimer97 committed on

Hardwired nsa attention
bb18c9e
verified

Maxtimer97 committed on

Added attn_imp in config
ce81c56

Maxtimer97 committed on

Merge branch 'main' of https://huggingface.co/Maxtimer97/GLM2NSA
28885e1

Maxtimer97 committed on

Relative import first
a27b55c

Maxtimer97 committed on

Put relative import first for Hugging Face
0bda180
verified

Maxtimer97 committed on

Added modeling files
8a2bc5d

Maxtimer97 committed on

Upload tokenizer
2495dfe
verified

Maxtimer97 committed on

Upload ChatGLMForConditionalGeneration
5046ee8
verified

Maxtimer97 committed on

initial commit
e842585
verified

Maxtimer97 committed on