---
language: en
license: apache-2.0
tags:
- efficient-llm
- quantization
- ternary
- bitnet
- pytorch
- tinystories
- language-modeling
datasets:
- roneneldan/TinyStories
arxiv: 2602.07374
---
# TernaryLM-132M
TernaryLM-132M is a 132M-parameter Transformer language model trained natively with ternary weights {-1, 0, +1}.
Unlike post-training quantization methods, the model learns its quantized representations during training, so no separate quantization step is applied afterwards.
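As a rough illustration of what native ternary training means, here is a minimal PyTorch sketch of a linear layer that ternarizes its latent full-precision weights on every forward pass and trains them with a straight-through estimator. The per-layer mean-absolute-value scale and the `TernaryLinear` name are assumptions for this sketch; the paper's adaptive layer-wise scaling may compute the scale differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TernaryLinear(nn.Module):
    """Linear layer with latent full-precision weights that are quantized
    to {-1, 0, +1} (times a per-layer scale) in every forward pass."""

    def __init__(self, in_features: int, out_features: int, bias: bool = True):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features)) if bias else None
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        # Per-layer scale from the mean absolute weight (an assumption;
        # the paper's adaptive layer-wise scaling may differ).
        scale = w.abs().mean().clamp(min=1e-8)
        # Ternarize: round(w / scale) clipped to {-1, 0, +1}, then rescale.
        w_q = torch.round(w / scale).clamp(-1, 1) * scale
        # Straight-through estimator: forward uses the ternary weights,
        # backward passes gradients to the latent full-precision weights.
        w_ste = w + (w_q - w).detach()
        return F.linear(x, w_ste, self.bias)

# Quick smoke test: gradients reach the latent weights.
layer = TernaryLinear(768, 768)
layer(torch.randn(4, 768)).sum().backward()
assert layer.weight.grad is not None
```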
## Architecture
- Parameters: 132M
- Layers: 12
- Hidden Size: 768
- Attention Heads: 12
- Context Length: 512 tokens
- Quantization: Native Ternary Training
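The hyperparameters above match a GPT-2-small-scale decoder. A back-of-the-envelope parameter count follows; the field names and the vocabulary size are illustrative assumptions, not the repo's actual config schema.

```python
from dataclasses import dataclass

@dataclass
class TernaryLMConfig:
    # Values taken from the list above; vocab_size is an assumption.
    n_layers: int = 12
    d_model: int = 768
    n_heads: int = 12
    max_seq_len: int = 512
    vocab_size: int = 50257  # assumption: GPT-2 tokenizer

def approx_params(cfg: TernaryLMConfig) -> int:
    """Rough count: embeddings plus attention and 4x-width MLP weights
    per block (biases and layer norms omitted)."""
    embeddings = (cfg.vocab_size + cfg.max_seq_len) * cfg.d_model
    attn = 4 * cfg.d_model ** 2      # Q, K, V, output projections
    mlp = 2 * 4 * cfg.d_model ** 2   # up and down projections at 4x width
    return embeddings + cfg.n_layers * (attn + mlp)

print(f"~{approx_params(TernaryLMConfig()) / 1e6:.0f}M")  # ~124M
```

With a tied GPT-2-size embedding this lands near 124M; the gap to the stated 132M presumably comes from details not listed here, such as an untied output head or a different vocabulary.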
## Training
- Dataset: TinyStories (~60k stories)
- Optimizer: AdamW (betas=(0.9, 0.98))
- LR: 3e-4
- Scheduler: OneCycleLR
- Epochs: 15
- Hardware: Multi-GPU T4 setup (Kaggle)
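A minimal sketch of the optimizer and schedule listed above. The model, data, and batch size are stand-ins; the learning rate, betas, scheduler, and epoch count are taken from this card.

```python
import torch
import torch.nn as nn

# Stand-ins for the real model and the TinyStories data loader.
model = nn.Linear(768, 768)
batches = [torch.randn(8, 768) for _ in range(100)]
epochs = 15

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, betas=(0.9, 0.98))
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=3e-4, epochs=epochs, steps_per_epoch=len(batches)
)

for _ in range(epochs):
    for x in batches:
        loss = model(x).pow(2).mean()  # dummy loss in place of LM cross-entropy
        loss.backward()
        optimizer.step()
        scheduler.step()               # OneCycleLR advances once per batch
        optimizer.zero_grad()
```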
## Intended Use
Research on:
- Efficient Transformers
- Quantization-aware training
- Edge deployment
## Limitations
- Not instruction-tuned
- Limited dataset scale
- Research prototype
## Citation
```bibtex
@misc{nargund2026ternarylmmemoryefficientlanguagemodeling,
  title={TernaryLM: Memory-Efficient Language Modeling via Native 1-Bit Quantization with Adaptive Layer-wise Scaling},
  author={Nisharg Nargund and Priyesh Shukla},
  year={2026},
  eprint={2602.07374},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2602.07374},
}
```