Seoul National University VLSI Lab

university

https://vlsi.snu.ac.kr/

AI & ML interests

Efficient AI

Recent Activity

jiwonsong authored a paper about 1 month ago

CompactAttention: Accelerating Chunked Prefill with Block-Union KV Selection

dongwonjo authored a paper about 1 month ago

CompactAttention: Accelerating Chunked Prefill with Block-Union KV Selection

jiwonsong submitted a paper about 1 month ago

CompactAttention: Accelerating Chunked Prefill with Block-Union KV Selection

Papers

SNU-VLSI's papers (7)

Submitted by

Jiwon Song

CompactAttention: Accelerating Chunked Prefill with Block-Union KV Selection

Submitted by

Jiwon Song

RelayGen: Intra-Generation Model Switching for Efficient Reasoning

Submitted by

Jiwon Song

LiteStage: Latency-aware Layer Skipping for Multi-stage Reasoning

Retrospective Sparse Attention for Efficient Long-Context Generation

Submitted by

Jiwon Song

Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning

SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks

Squeezing Large-Scale Diffusion Models for Mobile

Team members (6)
