---
language: en
license: apache-2.0
tags:
- efficient-llm
- quantization
- ternary
- bitnet
- pytorch
- tinystories
- language-modeling
datasets:
- roneneldan/TinyStories
arxiv: 2602.07374
---

# TernaryLM-132M

TernaryLM-132M is a 132M-parameter Transformer language model trained natively with ternary weights {-1, 0, +1}.

Unlike post-training quantization methods, which quantize a full-precision model after it has been trained, TernaryLM learns its ternary representations during training itself (quantization-aware training).
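
The core idea can be sketched in a few lines of PyTorch. The `TernaryLinear` module below is an illustrative assumption, not the implementation from the paper: it keeps full-precision shadow weights for the optimizer, snaps them to {-1, 0, +1} with a magnitude threshold on the forward pass, and routes gradients through a straight-through estimator so the shadow weights keep learning.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TernaryLinear(nn.Module):
    """Linear layer whose weights are ternarized in the forward pass.

    Illustrative sketch only: the threshold rule and the per-layer scale
    below are assumptions, not necessarily the scheme used by TernaryLM.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        # Per-layer scale so the ternary values track the weight magnitudes;
        # loosely analogous to the paper's adaptive layer-wise scaling.
        scale = w.abs().mean()
        # Threshold rule (an assumption): small weights snap to 0, the rest to ±1.
        w_q = torch.where(w.abs() < 0.5 * scale, torch.zeros_like(w), torch.sign(w))
        # Straight-through estimator: forward uses the quantized weights,
        # backward treats the quantizer as the identity.
        w_ste = w + (w_q * scale - w).detach()
        return F.linear(x, w_ste)

# Usage: behaves like nn.Linear, but the effective weights are ternary.
layer = TernaryLinear(768, 768)
y = layer(torch.randn(4, 768))
```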

## Architecture

- Parameters: 132M
- Layers: 12
- Hidden Size: 768
- Attention Heads: 12
- Context Length: 512
- Quantization: Native Ternary Training
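
For orientation, these hyperparameters correspond to a configuration object along these lines; the class and field names are hypothetical, not taken from the released code:

```python
from dataclasses import dataclass

@dataclass
class TernaryLMConfig:
    # Hypothetical field names; values are from the architecture list above.
    n_layers: int = 12       # Transformer blocks
    d_model: int = 768       # hidden size
    n_heads: int = 12        # attention heads
    max_seq_len: int = 512   # context length
    ternary: bool = True     # native ternary (quantization-aware) training
```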

## Training

- Dataset: TinyStories (~60k stories)
- Optimizer: AdamW (betas=(0.9, 0.98))
- LR: 3e-4
- Scheduler: OneCycleLR
- Epochs: 15
- Hardware: Multi-GPU T4 setup (Kaggle)
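
As a rough sketch, the recipe above maps onto standard PyTorch components as follows. The stand-in model, data, and loss are placeholders for TernaryLM-132M and the TinyStories loader, and treating 3e-4 as the peak LR of the one-cycle schedule is an assumption:

```python
import torch
import torch.nn as nn

# Stand-in model and data so the sketch runs end to end; in practice these
# would be TernaryLM-132M and a TinyStories DataLoader.
model = nn.Linear(768, 768)
train_loader = [torch.randn(8, 768) for _ in range(10)]

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, betas=(0.9, 0.98))
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=3e-4,  # assumption: 3e-4 is the peak LR of the one-cycle schedule
    epochs=15,
    steps_per_epoch=len(train_loader),
)

for epoch in range(15):
    for x in train_loader:
        loss = model(x).pow(2).mean()  # placeholder loss, not the LM objective
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.step()  # OneCycleLR steps once per optimizer step
```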

## Intended Use

Research on:
- Efficient Transformers
- Quantization-aware training
- Edge deployment

## Limitations

- Not instruction-tuned
- Limited dataset scale
- Research prototype

## Citation

```bibtex
@misc{nargund2026ternarylmmemoryefficientlanguagemodeling,
      title={TernaryLM: Memory-Efficient Language Modeling via Native 1-Bit Quantization with Adaptive Layer-wise Scaling}, 
      author={Nisharg Nargund and Priyesh Shukla},
      year={2026},
      eprint={2602.07374},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2602.07374}, 
}
```