Bonsai 1-bit Qwen3-1.7B (Wave17/18 series)
This repo holds the key checkpoints from the Bonsai 1-bit reproduction project: a quantization-aware-trained (QAT) version of Qwen/Qwen3-1.7B in which every linear layer in the transformer body is binarized to a 1-bit sign plus a per-128-group fp16 scale (≈1.125 bits per weight on the quantized portion). Embedding, lm_head, layer norms, and biases stay in FP16.
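For intuition, here is a minimal sketch of that binarization, assuming a mean-absolute-value group scale (the checkpoints themselves carry learned scales, per the scale_source="learned" eval setting below); function and variable names are illustrative:

```python
import torch

def binarize_weight(w: torch.Tensor, group_size: int = 128):
    """Sketch: 1-bit sign + one fp16 scale per group of 128 input weights.

    Storage: 1 bit per sign + 16 bits per 128-weight group = 1.125 bits/weight.
    w is a linear-layer weight of shape (out_features, in_features).
    """
    out_f, in_f = w.shape
    assert in_f % group_size == 0
    groups = w.view(out_f, in_f // group_size, group_size)

    # Per-group fp16 scale. Mean-abs is shown for illustration only; the actual
    # checkpoints use learned scales (block_scales).
    scales = groups.abs().mean(dim=-1, keepdim=True).to(torch.float16)

    signs = torch.where(groups >= 0, 1.0, -1.0)           # {-1, +1}, 1 bit each
    w_hat = (signs * scales.float()).view(out_f, in_f)    # dequantized approximation
    return signs.view(out_f, in_f), scales.squeeze(-1), w_hat
```

During QAT the forward pass typically runs on the dequantized approximation while a straight-through estimator routes gradients to the fp16 master weights; whether this project follows exactly that recipe is not spelled out here.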
Latest results
Standard WikiText-2 test perplexity (sliding window, max_length=2048, stride=512, full 245k token test split, scale_source="learned"):
| Checkpoint | qaf_update_step | tokens_seen | WT2 PPL | Notes |
|---|---|---|---|---|
| wave18BU_7280 | 7280 | 12.38M | 97.56 | current best (peak LR 1e-4, rawmix78m, pure LM) |
| wave18AC_3900 | 3900 | 14.32M | 141.52 | previous baseline (LR 5e-6, mix100m, γ=0.15, η=0.65) |
| wave18AN_5600 | 5600 | 4.06M | 148.52 | deprecated lineage; upload directory named after the older paragraph-PPL metric |
| Dense Qwen3-1.7B | – | – | ~12 | full-precision reference |
Notes on naming
Older uploads use an internal paragraph-sample PPL metric that we later discovered does not align with the standard WikiText-2 stream protocol. Those numbers (e.g. wiki114p2) are not directly comparable with the literature. All numbers in the table above use the standard sliding-window stream protocol, matching the BitNet/OneBit/BiLLM papers.
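For reference, a minimal sketch of that sliding-window protocol, assuming an HF-style causal LM and tokenizer (the project's actual eval script may differ in details):

```python
import torch
from datasets import load_dataset

@torch.no_grad()
def wikitext2_ppl(model, tokenizer, max_length=2048, stride=512, device="cuda"):
    """Standard WikiText-2 test perplexity with a sliding window.

    Each window scores only the tokens not already covered by the
    previous window (overlap positions are masked with -100).
    """
    test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
    ids = tokenizer("\n\n".join(test["text"]), return_tensors="pt").input_ids
    nll_sum, n_tokens, prev_end = 0.0, 0, 0
    for begin in range(0, ids.size(1), stride):
        end = min(begin + max_length, ids.size(1))
        trg_len = end - prev_end                      # new tokens to score
        input_ids = ids[:, begin:end].to(device)
        labels = input_ids.clone()
        labels[:, :-trg_len] = -100                   # mask the overlap
        loss = model(input_ids, labels=labels).loss   # mean NLL over scored tokens
        nll_sum += loss.item() * trg_len
        n_tokens += trg_len
        prev_end = end
        if end == ids.size(1):
            break
    return float(torch.exp(torch.tensor(nll_sum / n_tokens)))
```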
Repo layout
```
qtj1999/llm1bit/
├── README.md (this file)
├── wave18BU_update7280_best_wt2ppl97p56/   ← current best
│   ├── model_state_only.pt (3.5 GB, model weights only)
│   ├── training_state_meta.json (training metadata)
│   └── eval_wt2_standard.json (PPL eval result JSON)
├── wave18AC_update3900_best_wiki119p7/ (older, ~PPL 141 in standard metric)
│   └── training_state.pt (6.1 GB, full state)
└── wave18AN_update5600_best_wiki114p2/ (older, ~PPL 148 in standard metric)
    └── training_state.pt (6.1 GB, full state)
```
Loading
The checkpoints are saved as QAT trainer state, not in standard HF format. To use them, you need the project codebase (https://github.com/...), which constructs BinaryLinear modules and loads the weights into them. A standalone HF-format export will be added later.
The .pt files contain a model_state_dict keyed by Qwen3 module names, plus block_scales / block_scales_ema / block_scales_ptq_ema per BinaryLinear.
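Until the HF export lands, the checkpoint contents can at least be inspected with plain PyTorch; the model-construction step below is a hypothetical placeholder for the project's own builder:

```python
import torch

# Load on CPU; depending on the file, the state dict is either the top-level
# object or nested under "model_state_dict".
ckpt = torch.load(
    "wave18BU_update7280_best_wt2ppl97p56/model_state_only.pt",
    map_location="cpu",
)
state = ckpt.get("model_state_dict", ckpt)

# Keys follow Qwen3 module names, plus per-BinaryLinear extras:
# block_scales / block_scales_ema / block_scales_ptq_ema.
for k in list(state)[:8]:
    print(k, tuple(state[k].shape))

# Hypothetical: build the QAT model with the project codebase, then load.
# model = build_bonsai_qat_model("Qwen/Qwen3-1.7B")   # illustrative name
# model.load_state_dict(state, strict=True)
```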
Training recipe (wave18BU)
- Resume from the wave18AC_3900 → BS → BT → BU lineage
- Loss: pure LM CE (η=1.0, no instruction CE, no KD)
- Data: rawmix78m (78M tokens of mixed Wikipedia + multi-source raw text, contamination-filtered against WT2/MMLU/GSM8K, etc.)
- Optimizer: paged_adamw_8bit, peak LR ~1e-4 with warmup + cosine decay (see the sketch after this list)
- Batch: 2 GPUs × 1 × 16 grad_accum × 512 seq len = 16k tokens / update
- Total tokens trained at update 7280: ~12.4M
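A minimal sketch of the optimizer and schedule described above; the warmup length and total-step horizon are illustrative placeholders, and `model` stands in for the QAT-wrapped Qwen3 built by the project codebase:

```python
import torch
import bitsandbytes as bnb
from transformers import get_cosine_schedule_with_warmup

model = torch.nn.Linear(16, 16)  # placeholder for the QAT-wrapped Qwen3-1.7B

# 8-bit paged AdamW, peak LR ~1e-4, linear warmup then cosine decay.
optimizer = bnb.optim.PagedAdamW8bit(model.parameters(), lr=1e-4)
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=200,        # assumption: actual warmup length not stated above
    num_training_steps=8000,     # assumption: decay horizon, not the project's exact value
)

for step in range(3):            # one scheduler step per optimizer update
    # backward over the 16 grad-accumulation micro-batches would go here
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```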
Hardware
Trained on 2× NVIDIA RTX 5090 (32 GB each). Inference runs on a single 5090 in bf16, or as GGUF Q1_0 mixed (linear layers Q1_0, embedding/lm_head FP16) at ~500 MB total size.
Limitations
- 1-bit quantization is not lossless; expect a 5–10× PPL increase versus full-precision Qwen3-1.7B (here 97.56 vs. ~12, roughly 8×).
- The current checkpoint is trained primarily on raw text (LM CE); it has not been fine-tuned for instruction following or specific tasks, and generation quality is limited.
- Generation outputs may exhibit repetition and incoherence at this PPL range; instruction following, math, and code tasks are not validated.