Minsu Ha
gkalstn0
docs: add RTX 4090 benchmark + GPU arch list for SageAttention build
#22 opened 2 months ago
by
gkalstn0
docs: split Memory-efficient Inference and GGUF+SageAttention into sub READMEs
#19 opened 2 months ago
by
gkalstn0
docs: split Memory-efficient Inference and GGUF+SageAttention into sub READMEs
#20 opened 2 months ago
by
gkalstn0
docs: ComfyUI release + section update
#18 opened 2 months ago
by
gkalstn0
Is the QK^T result the VRAM bottleneck for video models?
1
#10 opened 2 months ago
by
yunming181920
Can MoE be used to get this to 4B and still fit into 5090 VRAM?
1
#11 opened 2 months ago
by
usernameSRSalreadyexists
GGUF versions
👀👍 5
2
#4 opened 2 months ago
by
maroo87
Struggling to get this to run on a 24gb gpu
17
#3 opened 3 months ago
by
CodeExplode
feat: GGUF + SageAttention guide — DPMSolver++ default, benchmark, README update
#17 opened 2 months ago
by
gkalstn0
feat: GGUF + SageAttention guide — DPMSolver++ default, benchmark, README update
#14 opened 2 months ago
by
gkalstn0
feat: GGUF + SageAttention guide — DPMSolver++ default, benchmark, README update
#16 opened 2 months ago
by
gkalstn0
feat: GGUF + SageAttention guide — DPMSolver++ default, benchmark, README update
#15 opened 2 months ago
by
gkalstn0
feat: GGUF + SageAttention guide — DPMSolver++ default, benchmark, README update
#13 opened 2 months ago
by
gkalstn0
Add quality comparison videos + 50-step benchmark
#5 opened 2 months ago
by
gkalstn0
Test PR with branch ref
1
#4 opened 2 months ago
by
gkalstn0
Add quality comparison videos + 50-step benchmark
1
#3 opened 2 months ago
by
gkalstn0
Add quality comparison videos + 50-step benchmark
1
#2 opened 2 months ago
by
gkalstn0
Fix README: reference Q4_K_M / Q5_K_M filenames
#1 opened 2 months ago
by
gkalstn0
Add FP8 weight quantization guide to README
#9 opened 2 months ago
by
gkalstn0
Enable Flash Attention by trimming prompt embedding padding
#8 opened 2 months ago
by
gkalstn0