NeuralOS: Towards Simulating Operating Systems via Neural Generative Models Paper • 2507.08800 • Published Jul 11, 2025 • 80
Quartet: Native FP4 Training Can Be Optimal for Large Language Models Paper • 2505.14669 • Published May 20, 2025 • 78
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Paper • 2504.06261 • Published Apr 8, 2025 • 110
HALO: Hadamard-Assisted Lossless Optimization for Efficient Low-Precision LLM Training and Fine-Tuning Paper • 2501.02625 • Published Jan 5, 2025 • 15
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations Paper • 2502.05003 • Published Feb 7, 2025 • 42
AQLM+PV Collection Official AQLM quantizations for "PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression": https://arxiv.org/abs/2405.14852 • 26 items • Updated Feb 28, 2025 • 22