Dongwon Jo

dongwonjo

https://dongwonjo.github.io

AI & ML interests

Efficient AI, Model Compression, Sparse Attention, Quantization, Pruning, Generative Model, Large Language Model, Diffusion

Recent Activity

authored a paper about 18 hours ago

Rotation-Aligned Key Channel Pruning for Efficient Vision-Language Model Inference

upvoted a paper about 1 month ago

CompactAttention: Accelerating Chunked Prefill with Block-Union KV Selection

authored a paper about 1 month ago

CompactAttention: Accelerating Chunked Prefill with Block-Union KV Selection

Organizations

dongwonjo's datasets

None public yet