BiPT-v3 — Binarized Point Transformer V3 (W1A1)

Backup of the headline checkpoint and reproduction artifacts for the BiPT (Binarized Point Transformer) project (IEEE TPAMI submission). This repository is a private archive of key information, saved before server maintenance and a restart; it is not a polished release.

Headline result (S3DIS Area-5)

Model	Bits (W/A)	mIoU	mAcc	allAcc
BiPT (this checkpoint)	W1A1	0.6805	0.7470	0.8992
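
When re-checking these numbers after a restore, it helps to have the three columns written out. The following is a minimal sketch under the usual semantic-segmentation definitions, assuming a confusion matrix with ground-truth rows and predicted columns. The function name is hypothetical, and the repository's evaluator may handle empty classes differently.

```python
def segmentation_metrics(conf):
    """Return (mIoU, mAcc, allAcc) from a square confusion matrix.

    conf[i][j] counts points whose ground-truth class is i and whose
    predicted class is j. Classes with no ground-truth points are skipped
    (an assumption; other evaluators may count them as zero instead).
    """
    n = len(conf)
    ious, accs = [], []
    for c in range(n):
        tp = conf[c][c]                            # correctly labelled points of class c
        gt = sum(conf[c])                          # all ground-truth points of class c
        pred = sum(conf[r][c] for r in range(n))   # all points predicted as class c
        if gt == 0:
            continue
        ious.append(tp / (gt + pred - tp))         # intersection over union
        accs.append(tp / gt)                       # per-class accuracy (recall)
    total = sum(sum(row) for row in conf)
    correct = sum(conf[c][c] for c in range(n))
    return sum(ious) / len(ious), sum(accs) / len(accs), correct / total


miou, macc, allacc = segmentation_metrics([[3, 1], [1, 5]])
```

mIoU and mAcc average over classes, so rare classes count as much as common ones, while allAcc is dominated by frequent classes. That is why allAcc (0.8992) sits well above mIoU (0.6805) in the table.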

Best validation mIoU of 0.6805 was reached on 2026-06-15 (epoch 17 of the final stage) and did not improve through epoch 26, so the run is treated as converged.
Backbone: Point Transformer V3 (PTv3). The run is named w1a1 (1-bit weights / 1-bit activations) and uses the project's binarization (SAB softmax-aware binarization, BiLinearLSR, SubLN, RelNorm, EMA-max).

Caveat for whoever restores this: the dumped RESOLVED_headline_config.py shows top-level quantize=False and a vanilla type='PT-v3m1' backbone — i.e. the 1-bit binarization is injected by the code path / model wrapper at runtime, not by a config flag in this dump.

Resolved (binarization entry point): the injection happens in code/pointcept_framework/pointcept/models/quantization/quant_utils.py → convert_ptv3_to_bi_ptv3(model), which recursively replaces every nn.Linear except modules matching ['embedding','stem','seg_head'] with BiLinearLSR (binary_layers.py). So the backbone PTv3 linears are W1A1; the input embedding, stem, and segmentation head stay full-precision — the standard "keep first/last layer FP" binarization convention. This is consistent with quantize=False in the config dump, because binarization is a model-surgery pass, not a config flag.
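
The surgery pass described above can be sketched as follows. This is an illustration only, not the code in quant_utils.py: ToyBinaryLinear is a hypothetical stand-in for BiLinearLSR (plain sign binarization with a mean-absolute-value scale and a clipped straight-through gradient), convert_linears is a hypothetical name, and matching the skip list by substring on the qualified module name is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

SKIP_KEYWORDS = ("embedding", "stem", "seg_head")  # skip list quoted in this README


class SignSTE(torch.autograd.Function):
    """sign() in the forward pass, clipped straight-through gradient backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # Map zero to +1 so every output is strictly -1 or +1.
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Pass the gradient through only where |x| <= 1.
        return grad_out * (x.abs() <= 1).to(grad_out.dtype)


class ToyBinaryLinear(nn.Linear):
    """Hypothetical W1A1 linear layer; NOT the project's BiLinearLSR."""

    def forward(self, x):
        # One scale per output channel keeps the binarized weights in range.
        scale = self.weight.abs().mean(dim=1, keepdim=True)
        w_bin = SignSTE.apply(self.weight) * scale
        return F.linear(SignSTE.apply(x), w_bin, self.bias)


def convert_linears(module, skip=SKIP_KEYWORDS, prefix=""):
    """Recursively swap nn.Linear for ToyBinaryLinear outside skipped subtrees."""
    for name, child in list(module.named_children()):
        full_name = f"{prefix}.{name}" if prefix else name
        if any(key in full_name for key in skip):
            continue  # leave this subtree in full precision
        if isinstance(child, nn.Linear) and not isinstance(child, ToyBinaryLinear):
            new = ToyBinaryLinear(child.in_features, child.out_features,
                                  bias=child.bias is not None)
            new.load_state_dict(child.state_dict())  # reuse the latent FP weights
            setattr(module, name, new)
        else:
            convert_linears(child, skip, full_name)
    return module


# Toy model with the same three kinds of submodule names as the skip list.
toy = nn.ModuleDict({
    "embedding": nn.Sequential(nn.Linear(6, 8)),
    "enc": nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8)),
    "seg_head": nn.Linear(8, 13),
})
convert_linears(toy)
```

Because the replacement happens on the instantiated model, a dumped config can still show quantize=False and a vanilla backbone type, which matches the caveat above.
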
Project method (across the full chain) uses dense superpoint-guided supervision: superpoint/edge/semantic contrastive losses + boundary CE reweighting, plus the superpoint construction (kNN + normal-dissimilarity edges + Felzenszwalb growing). These contrastive/edge terms are active in the earlier chain stages.
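
For orientation, the "Felzenszwalb growing" step named above can be illustrated with the generic Felzenszwalb-Huttenlocher merge rule on a weighted graph. This is a sketch of the textbook criterion, not the code in superpoint_ops/. It assumes edge weights are some normal-dissimilarity score on a kNN graph, and the function name and the constant k are hypothetical.

```python
def felzenszwalb_merge(num_nodes, edges, k=1.0):
    """Group nodes into components with the Felzenszwalb-Huttenlocher rule.

    edges: iterable of (weight, u, v) tuples. Returns one root label per node.
    """
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes  # largest edge weight inside each component

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for w, u, v in sorted(edges):  # process edges from most to least similar
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        # Merge only if the edge is no heavier than either component's
        # internal variation plus a size-dependent tolerance k / |C|.
        if w <= min(internal[ru] + k / size[ru], internal[rv] + k / size[rv]):
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
            internal[ru] = w  # edges are sorted, so w is the new maximum
    return [find(i) for i in range(num_nodes)]


# Two smooth patches (weight 0.1) joined by one sharp crease (weight 0.9).
labels = felzenszwalb_merge(
    6,
    [(0.1, 0, 1), (0.1, 1, 2), (0.9, 2, 3), (0.1, 3, 4), (0.1, 4, 5)],
    k=0.3,
)
```

In this toy graph the high-dissimilarity edge between nodes 2 and 3 is rejected, so the six points split into two superpoints.
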
Final/headline stage (...push_w1_warm20_30ep...) overrides model.criteria to just two terms (verified from the config):
- CrossEntropyLoss with per-class weights [0.65,0.65,0.85,8.0,5.0,2.9,4.2,0.95,0.85,1.6,1.1,1.7,1.45] and loss_weight=1.0; the rare classes are up-weighted (beam=8.0, column=5.0, door=4.2, window=2.9);
- LovaszLoss (multiclass, loss_weight=1.0).
Schedule for this stage: 30 epochs, eval_epoch=2, warm-started from the stage-3 model_last.pth.
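
The two loss terms above can be written in Pointcept's dict-style config roughly as below. This is a sketch for orientation rather than a copy of RESOLVED_headline_config.py: only the values quoted above come from the source, while the key names follow common Pointcept usage and the class order is the usual S3DIS ordering, both assumed here.

```python
# Usual S3DIS class order (assumption: the headline config uses the same order).
S3DIS_CLASSES = [
    "ceiling", "floor", "wall", "beam", "column", "window", "door",
    "table", "chair", "sofa", "bookcase", "board", "clutter",
]

# Per-class CE weights quoted in this README.
CE_WEIGHTS = [0.65, 0.65, 0.85, 8.0, 5.0, 2.9, 4.2, 0.95, 0.85, 1.6, 1.1, 1.7, 1.45]

# Two-term criteria list for the final stage (key names assumed, values from the README).
criteria = [
    dict(type="CrossEntropyLoss", weight=CE_WEIGHTS, loss_weight=1.0),
    dict(type="LovaszLoss", mode="multiclass", loss_weight=1.0),
]

# Sanity check: map weights back to class names and list the up-weighted ones.
weight_by_class = dict(zip(S3DIS_CLASSES, CE_WEIGHTS))
upweighted = {name: w for name, w in weight_by_class.items() if w > 2.0}
```

Under that class order, the entries above 2.0 are exactly beam, column, window, and door, which agrees with the rare-class weights listed above.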

Files

weights/bipt_w1a1_s3dis_best_6805.pth — headline W1A1 checkpoint (mIoU 0.6805).
configs/ — the full reproduction chain (Pointcept config format):
- pami_pycut_w1a1_rare_edge_spcontrast_semcontrast_4gpu_20260505_040617.py — chain-root base.
- pami_pycut_w1a1_chonggao_protected_1gpu_20260601.py — stage 1.
- pami_pycut_w1a1_stage3_sem_strong_protected_20260604.py — stage 2 (sem SupCon λ=0.01, 30 ep).
- pami_pycut_w1a1_weak_rare_ce2x_protected_20260609.py — stage 3 (2× rare-class CE, 20 ep).
- pami_pycut_w1a1_push_w1_warm20_30ep_protected_20260611.py — final/headline stage (30 ep).
- RESOLVED_headline_config.py — fully-resolved config dumped by the trainer for the headline run.
- pami_pycut_ptv3fp_dual_valfix_protected_20260605.py / ..._40ep_...py — FP32 PTv3 recipe-matched teacher configs (see note below).
logs/*.besttail.txt — best-mIoU / val-result tail of each stage's train.log.
code/ — core reproduction source (1.9 MB, no datasets/exp outputs):
- pointcept_framework/pointcept/models/quantization/ — the binarization core: binary_layers.py (BiLinearLSR 1-bit linear with learned-scale-rescaling + straight-through sign, plus a JIT-compiled bit-packed CUDA GEMM in backend/ used at inference; falls back to PyTorch simulation when CUDA build fails) and quant_utils.py::convert_ptv3_to_bi_ptv3 (the runtime PTv3→Bi-PTv3 surgery).
- pointcept_framework/pointcept/models/point_transformer_v3/ — the PTv3 backbone that the surgery binarizes.
- pointcept_framework/{pointcept,tools,configs/s3dis,configs/_base_} — the trimmed Pointcept library + tools + S3DIS configs needed to load and evaluate.
- superpoint_ops/ — superpoint construction (cut-pursuit + kNN/normal-edge graph) backing the superpoint-guided contrastive losses.
- min_repro_server_pack/ — smallest server-side eval recipe (MIN_REPRO_README.md, run_wc1_quick.sh, run_area5_eval.sh) for sanity-checking the checkpoint without retraining.
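
The bit-packed GEMM listed under binary_layers.py relies on a standard identity for 1-bit arithmetic: the dot product of two vectors with entries in {-1, +1} equals n minus twice the number of positions where they differ, so it reduces to XOR plus popcount on packed words. The pure-Python sketch below only illustrates that identity. It is not the repository's CUDA kernel, and both function names are hypothetical.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into an int; bit i is 1 when values[i] is +1."""
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word


def binary_dot(a_word, b_word, n):
    """Dot product of two packed +1/-1 vectors of length n via XOR and popcount."""
    mismatches = bin(a_word ^ b_word).count("1")  # positions where the signs differ
    return n - 2 * mismatches


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
packed_result = binary_dot(pack_signs(a), pack_signs(b), len(a))
reference = sum(x * y for x, y in zip(a, b))  # ordinary dot product for comparison
```

The PyTorch simulation fallback mentioned above should produce the same values as a packed kernel of this kind, just without the memory and speed benefit.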

Training recipe (warm-start chain)

The headline W1A1 model is the endpoint of a 4-stage warm-start chain, each stage resuming from the previous stage's best/last checkpoint:

stage1 chonggao (1gpu)
  -> stage2 sem_strong  (30 ep, semantic_contrastive_weight=0.01)
  -> stage3 rare_ce2x   (20 ep, 2x CE weight on beam/column/door/window + Lovasz)
  -> stage4 push_w1_warm20 (30 ep)  ==> best mIoU 0.6805
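
As a bookkeeping aid, the chain above can be expressed as a list in which each stage points at the previous stage's checkpoint. In this sketch only the stage order comes from the diagram. The stage directory names, the exp/<name>/model/ layout, the checkpoint file name, and the weight key are all assumptions for illustration, and stage 1's own starting point is left unspecified.

```python
# Stage order follows the diagram above; directory names are hypothetical.
STAGES = [
    "stage1_chonggao",
    "stage2_sem_strong",
    "stage3_rare_ce2x",
    "stage4_push_w1_warm20",
]


def warm_start_plan(stages, exp_root="exp", ckpt_name="model_last.pth"):
    """Return one dict per stage with the checkpoint it would warm-start from."""
    plan = []
    previous_ckpt = None  # stage 1's initial weights are not specified here
    for name in stages:
        plan.append({"name": name, "weight": previous_ckpt})
        # Assumed layout: each run saves its checkpoints under <exp_root>/<name>/model/.
        previous_ckpt = f"{exp_root}/{name}/model/{ckpt_name}"
    return plan


plan = warm_start_plan(STAGES)
```

Whether a given stage resumed from the best or the last checkpoint has to be read from its config; the README only confirms model_last.pth for the final stage.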

Compute side: Pointcept on the Haozhe Vepfs tree, launched via the poplab→Volcano run.py launcher. Dataset: S3DIS, Area-5 held out for validation.

Note on the FP32 teacher configs

pami_pycut_ptv3fp_dual_* are FP32 (quantize=False) PTv3 runs sharing the superpoint recipe. The archived 20-epoch single-stage FP32 run reached only ~0.444 mIoU because it used a fraction of the student's multi-stage training budget and a different warm-start point. It is not a fair recipe-matched teacher and should not be cited as one. A budget-matched multi-stage FP32 run is the correct comparison.
