---
language: en
license: apache-2.0
tags:
  - efficient-llm
  - quantization
  - ternary
  - bitnet
  - pytorch
  - tinystories
  - language-modeling
datasets:
  - roneneldan/TinyStories
arxiv: 2602.07374
---

# TernaryLM-132M

TernaryLM-132M is a 132M-parameter Transformer language model trained natively with ternary weights {-1, 0, +1}.

Unlike post-training quantization methods, this model learns quantized representations during training.
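Native ternary training typically keeps full-precision latent weights and quantizes them on each forward pass, with a straight-through estimator carrying gradients back to the latent weights. A minimal sketch of the quantization step, assuming a BitNet-b1.58-style absmean scale (the `ternarize` function below is illustrative, not necessarily this model's exact adaptive layer-wise scheme):

```python
def ternarize(weights):
    """Quantize a list of floats to {-1, 0, +1} with an absmean scale.

    Illustrative BitNet-b1.58-style scheme, shown here in pure Python;
    TernaryLM's adaptive layer-wise scaling may differ in detail.
    """
    # Per-tensor scale: mean absolute value of the weights.
    scale = sum(abs(w) for w in weights) / len(weights)
    # Round w / scale to the nearest integer, clipped into {-1, 0, +1}.
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale

q, s = ternarize([0.9, -0.05, 0.4, -1.2])
```

At inference the layer computes with `scale * q`, so only the ternary codes plus one scale per tensor need to be stored.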

## Architecture

- Parameters: 132M
- Layers: 12
- Hidden size: 768
- Attention heads: 12
- Context length: 512
- Quantization: native ternary training
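Ternary weights need only log2(3) ≈ 1.58 bits each, versus 16 bits in fp16. A back-of-the-envelope estimate of the weight-storage saving for 132M parameters (illustrative only; a real checkpoint also stores per-layer scales, and embeddings are often kept in higher precision):

```python
import math

PARAMS = 132e6  # parameter count from the model card

# fp16 baseline: 2 bytes per parameter.
fp16_mib = PARAMS * 2 / 2**20

# Ideal ternary packing: log2(3) bits per parameter,
# ignoring scale factors and alignment overhead.
ternary_mib = PARAMS * math.log2(3) / 8 / 2**20

ratio = fp16_mib / ternary_mib  # roughly a 10x reduction
```

Under these idealized assumptions, weight storage drops from roughly 252 MiB to roughly 25 MiB.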

## Training

- Dataset: TinyStories (~60k stories)
- Optimizer: AdamW (betas=(0.9, 0.98))
- Learning rate: 3e-4
- Scheduler: OneCycleLR
- Epochs: 15
- Hardware: multi-GPU T4 setup (Kaggle)

## Intended Use

Research on:

- Efficient Transformers
- Quantization-aware training
- Edge deployment

## Limitations

- Not instruction-tuned
- Limited dataset scale
- Research prototype

## Citation

```bibtex
@misc{nargund2026ternarylmmemoryefficientlanguagemodeling,
  title={TernaryLM: Memory-Efficient Language Modeling via Native 1-Bit Quantization with Adaptive Layer-wise Scaling},
  author={Nisharg Nargund and Priyesh Shukla},
  year={2026},
  eprint={2602.07374},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2602.07374},
}
```