SDAR Collection: The models without suffixes use the default block size of 4. • 21 items • Updated Sep 9
WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference • Paper 2505.19427 • Published May 26