TernaryLM: Memory-Efficient Language Modeling via Native 1-Bit Quantization with Adaptive Layer-wise Scaling
Paper: [arXiv:2602.07374](https://arxiv.org/abs/2602.07374)
TernaryLM-132M is a 132M-parameter Transformer language model trained natively with ternary weights {-1, 0, +1}.
Unlike post-training quantization methods, which quantize a full-precision model after the fact, TernaryLM learns its quantized representations during training.
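The card does not spell out the training mechanics, so the sketch below only illustrates the general quantization-aware-training pattern this description implies: full-precision shadow weights, a per-layer scale applied at ternarization time, and a straight-through estimator so gradients still flow. The `TernaryLinear` module name and the absolute-mean scale are assumptions for illustration, not the authors' "adaptive layer-wise scaling" scheme.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TernaryLinear(nn.Module):
    """Linear layer whose weights are quantized to {-1, 0, +1} in the forward pass.

    A minimal sketch of native ternary training; the paper's actual
    adaptive layer-wise scaling may differ.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Full-precision "shadow" weights are kept for gradient updates.
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        # Layer-wise scale: mean absolute weight of this layer
        # (an assumed choice, one common in ternary/1.58-bit training).
        scale = w.abs().mean().clamp(min=1e-8)
        # Ternarize: round(w / scale), clipped to {-1, 0, +1}.
        w_ternary = (w / scale).round().clamp(-1, 1)
        # Straight-through estimator: the forward pass uses the scaled
        # ternary weights; the backward pass flows gradients to `w`.
        w_q = w + (scale * w_ternary - w).detach()
        return F.linear(x, w_q)

# Usage: drop-in replacement for nn.Linear inside a Transformer block.
layer = TernaryLinear(512, 512)
y = layer(torch.randn(2, 512))
```

Because the ternary weights are recomputed from the shadow weights every step, the model adapts to quantization noise during training rather than absorbing it afterward, which is the stated contrast with post-training quantization.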
If you use this model, please cite:
```bibtex
@misc{nargund2026ternarylmmemoryefficientlanguagemodeling,
      title={TernaryLM: Memory-Efficient Language Modeling via Native 1-Bit Quantization with Adaptive Layer-wise Scaling},
      author={Nisharg Nargund and Priyesh Shukla},
      year={2026},
      eprint={2602.07374},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2602.07374},
}
```