wzh

hg2wzh

AI & ML interests

None yet

Recent Activity

liked a Space 14 days ago

HuggingFaceM4/encoder-free-vlm

liked a model about 1 month ago

reacted to erikkaum's post with ❤️ about 2 months ago

Releasing my first kernel 🔥 MaxSim

Late-interaction retrieval (ColBERT / PyLate) bottlenecks on materializing the full similarity matrix. This kernel avoids that by using tiled scoring with simdgroup_matrix (Metal) and WMMA. The result is a 3–5× speedup compared to a naive PyTorch baseline 🔥

Benchmarks:
- SmallRerank (B=32, C=10): up to 3.2× (M3 Pro) / 2.8× (A100)
- HeavyRerank (B=32, C=100): up to 3.8× (M3 Pro) / 5.3× (A100)
- LongDocStress (Ld=1024): up to 6.2× (L4)

Try it out 👇 https://huggingface.co/kernels/erikkaum/maxsim
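To make the idea in the post concrete, here is a minimal pure-Python sketch of MaxSim scoring, assuming the usual late-interaction definition: for each query token, take the maximum dot product over all document tokens, then sum those maxima. It is not the kernel's code. The function names `maxsim_naive` and `maxsim_tiled` and the `tile` parameter are hypothetical, and the real kernel does this on the GPU with simdgroup_matrix / WMMA rather than Python loops. The sketch only contrasts building the full similarity matrix with keeping a running maximum per query token while walking the document in tiles.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_naive(query, doc):
    """Naive MaxSim: build the full |query| x |doc| similarity matrix first.

    Assumes both `query` and `doc` are non-empty lists of token vectors.
    """
    sim = [[dot(q, d) for d in doc] for q in query]  # full matrix in memory
    return sum(max(row) for row in sim)


def maxsim_tiled(query, doc, tile=2):
    """Tiled MaxSim: same result, but only one running max per query token.

    The document is processed `tile` tokens at a time, so the full
    similarity matrix is never stored. `tile` is an illustrative parameter.
    """
    best = [float("-inf")] * len(query)
    for start in range(0, len(doc), tile):
        block = doc[start:start + tile]
        for i, q in enumerate(query):
            for d in block:
                score = dot(q, d)
                if score > best[i]:
                    best[i] = score
    return sum(best)


# Toy example: 2 query tokens, 3 document tokens, dimension 2.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
score_naive = maxsim_naive(query, doc)
score_tiled = maxsim_tiled(query, doc, tile=2)
```

Both functions return the same score. The tiled version needs memory proportional to the number of query tokens instead of query tokens times document tokens, which is the property the post credits for the speedup.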

Organizations

None yet

hg2wzh's collections: 8