Dongwon Jo
dongwonjo
AI & ML interests
Efficient AI, Model Compression, Quantization, Pruning, Generative Model, Large Language Model, Diffusion
Recent Activity
authored a paper about 3 hours ago: Retrospective Sparse Attention for Efficient Long-Context Generation
authored a paper about 4 hours ago: Token Sparse Attention: Efficient Long-Context Inference with Interleaved Token Selection
authored a paper about 4 hours ago: Squeezing Large-Scale Diffusion Models for Mobile
Organizations
None yet