keep-warm pulse: hold GPU boost clock while typing so casual chat decodes at boosted rate 70592b3 verified Humuhumu33 committed about 4 hours ago
warmup uses decode() to precompile batched pipelines (cut first-msg TTFT) e4b6a8f verified Humuhumu33 committed about 4 hours ago
fast first turn: warmup boosts clock + primes system-prompt KV (12s TTFT to ~0.5s) + static grounded greeting 7efaf4b verified Humuhumu33 committed about 4 hours ago
discrete-GPU validation ?bench=discrete: bandwidth + live + spec-flip in one page dabcb0d verified Humuhumu33 committed about 5 hours ago
decode(): report live-path GPU ms/tok for clean CPU/GPU split d2afec3 verified Humuhumu33 committed about 5 hours ago
decode(): report live-path GPU ms/tok for clean CPU/GPU split 8f34f10 verified Humuhumu33 committed about 5 hours ago
per-pass GPU trace ?bench=trace: name the non-weight overhead 770541b verified Humuhumu33 committed about 5 hours ago
per-pass GPU trace ?bench=trace: name the non-weight overhead 31a05bf verified Humuhumu33 committed about 5 hours ago
spec bench: warmup + short prompt + 192-tok decode-dominated (fix prefill confound) 439999f verified Humuhumu33 committed about 5 hours ago
live decode profile ?bench=perf: boosted-clock steady tok/s vs roofline e74320a verified Humuhumu33 committed about 5 hours ago
register-blocked GEMV lab: B=1/2/4/8/16 rows per group 3653faa verified Humuhumu33 committed about 7 hours ago
profiler: fix read-only binding + dot8 batched-verify proxy + refined verdict bcdc5b6 verified Humuhumu33 committed about 7 hours ago
kernel limiter profiler: read-only vs ALU-intensity vs occupancy sweep a24d0c6 verified Humuhumu33 committed about 7 hours ago
spec-decode for BitNet: subNorm+f32-KV batched verify, ?spec + ?bench=spec 7406205 verified Humuhumu33 committed about 8 hours ago
spec-decode for BitNet: subNorm+f32-KV batched verify, ?spec + ?bench=spec d4f3433 verified Humuhumu33 committed about 8 hours ago
spec-decode for BitNet: subNorm+f32-KV batched verify, ?spec + ?bench=spec 459725d verified Humuhumu33 committed about 8 hours ago
spec-decode for BitNet: subNorm+f32-KV batched verify, ?spec + ?bench=spec 57800b6 verified Humuhumu33 committed about 8 hours ago
roofline: batched passes + onSubmittedWorkDone for real bw fa278c0 verified Humuhumu33 committed about 8 hours ago
roofline: fill buffer non-zero + exceed cache for real VRAM bw ac2b557 verified Humuhumu33 committed about 8 hours ago
Upload holo-load2bit.mjs with huggingface_hub f789de8 verified Humuhumu33 committed about 9 hours ago
Upload holo-load2bit.mjs with huggingface_hub b5ae11e verified Humuhumu33 committed about 9 hours ago
Upload holo-gpu-device.mjs with huggingface_hub df29783 verified Humuhumu33 committed about 9 hours ago
Upload holo-load2bit.mjs with huggingface_hub 139e306 verified Humuhumu33 committed about 9 hours ago
Upload holo-load2bit.mjs with huggingface_hub 81c5d37 verified Humuhumu33 committed about 9 hours ago
Upload holo-load2bit.mjs with huggingface_hub 5b034d2 verified Humuhumu33 committed about 14 hours ago
Upload gpu-blake3.mjs with huggingface_hub e020433 verified Humuhumu33 committed about 14 hours ago
Upload blake3-gpu-parallel.html with huggingface_hub a3c9d6f verified Humuhumu33 committed about 14 hours ago
Upload holo-load2bit.mjs with huggingface_hub 0a2d0d8 verified Humuhumu33 committed about 15 hours ago
Upload holo-blake3.mjs with huggingface_hub 7b5c4b3 verified Humuhumu33 committed about 15 hours ago
Upload blake3-gpu-test.html with huggingface_hub a349783 verified Humuhumu33 committed about 15 hours ago
Upload core/loader.js with huggingface_hub 484edea verified Humuhumu33 committed about 16 hours ago
Upload holo-load2bit.mjs with huggingface_hub dfdc138 verified Humuhumu33 committed about 17 hours ago