BiPT-v3: Binarized Point Transformer V3 (W1A1)

Backup of the headline checkpoint and reproduction artifacts for the BiPT (Binarized Point Transformer) project (IEEE TPAMI submission). This repository is a private archive of key information before a server maintenance/restart; it is not a polished release.

Headline result (S3DIS Area-5)

Model                    Bits (W/A)   mIoU     mAcc     allAcc
BiPT (this checkpoint)   W1A1         0.6805   0.7470   0.8992
  • Best validation mIoU 0.6805 reached 2026-06-15 (epoch 17 of the final stage), stable through epoch 26 (no further improvement, so the run is treated as converged).
  • Backbone: Point Transformer V3 (PTv3). The run is named w1a1 (1-bit weights / 1-bit activations) and uses the project's binarization (SAB softmax-aware binarization, BiLinearLSR, SubLN, RelNorm, EMA-max).

    Caveat for whoever restores this: the dumped RESOLVED_headline_config.py shows top-level quantize=False and a vanilla type='PT-v3m1' backbone, i.e. the 1-bit binarization is injected by the code path / model wrapper at runtime, not by a config flag in this dump.

    Resolved (binarization entry point): the injection happens in code/pointcept_framework/pointcept/models/quantization/quant_utils.py → convert_ptv3_to_bi_ptv3(model), which recursively replaces every nn.Linear except modules matching ['embedding','stem','seg_head'] with BiLinearLSR (binary_layers.py). So the backbone PTv3 linears are W1A1; the input embedding, stem, and segmentation head stay full-precision, following the standard "keep first/last layer FP" binarization convention. This is consistent with quantize=False in the config dump, because binarization is a model-surgery pass, not a config flag.
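
    A model-surgery pass of this kind can be sketched as follows. This is a minimal illustration, not the archived convert_ptv3_to_bi_ptv3: BinaryLinearSketch is a hypothetical stand-in for BiLinearLSR (plain sign binarization with a straight-through gradient, no learned-scale rescaling), ToyNet is a made-up model, and substring matching on the dotted module path is an assumption about how the skip patterns are applied.

```python
import torch
import torch.nn as nn

SKIP_PATTERNS = ("embedding", "stem", "seg_head")  # modules kept full-precision


class SignSTE(torch.autograd.Function):
    """Forward: sign(x) in {-1, +1}. Backward: pass the gradient through unchanged."""

    @staticmethod
    def forward(ctx, x):
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output


class BinaryLinearSketch(nn.Linear):
    """Hypothetical stand-in for BiLinearLSR: 1-bit weights and 1-bit activations."""

    def forward(self, x):
        return nn.functional.linear(
            SignSTE.apply(x), SignSTE.apply(self.weight), self.bias
        )


def convert_linears(module: nn.Module, prefix: str = "") -> nn.Module:
    """Recursively replace nn.Linear children, skipping protected module paths."""
    for name, child in module.named_children():
        full_name = f"{prefix}.{name}" if prefix else name
        if any(pattern in full_name for pattern in SKIP_PATTERNS):
            continue  # assumption: substring match on the dotted path
        if isinstance(child, nn.Linear):
            new = BinaryLinearSketch(
                child.in_features, child.out_features, bias=child.bias is not None
            )
            new.load_state_dict(child.state_dict())  # keep the latent FP weights
            setattr(module, name, new)
        else:
            convert_linears(child, full_name)
    return module


class ToyNet(nn.Module):
    """Made-up model with a stem, one backbone block, and a 13-class head."""

    def __init__(self):
        super().__init__()
        self.stem = nn.Linear(6, 8)
        self.block = nn.Sequential(nn.Linear(8, 8), nn.GELU(), nn.Linear(8, 8))
        self.seg_head = nn.Linear(8, 13)

    def forward(self, x):
        return self.seg_head(self.block(self.stem(x)))


toy = convert_linears(ToyNet())
print(type(toy.stem).__name__, type(toy.block[0]).__name__, type(toy.seg_head).__name__)
```

    Because the swap happens on the instantiated model, a config dump taken before the pass still describes the unmodified backbone, which matches the quantize=False observation above.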

  • Project method (across the full chain) uses dense superpoint-guided supervision: superpoint/edge/semantic contrastive losses + boundary CE reweighting, plus the superpoint construction (kNN + normal-dissimilarity edges + Felzenszwalb growing). These contrastive/edge terms are active in the earlier chain stages.
  • Final/headline stage (...push_w1_warm20_30ep...) overrides model.criteria to just two terms (verified from the config):
    • CrossEntropyLoss with per-class weights [0.65,0.65,0.85,8.0,5.0,2.9,4.2,0.95,0.85,1.6,1.1,1.7,1.45] (rare classes beam=8.0 / column=5.0 / door=4.2 / window=2.9 up-weighted), loss_weight=1.0;
    • LovaszLoss (multiclass, loss_weight=1.0).
  • The final stage runs for 30 epochs with eval_epoch=2 and is warm-started from the stage-3 model_last.pth.
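
For reference, the two-term criteria override can be written in Pointcept's dict-based config style roughly as below. The weights are the values quoted above and the class order is the standard S3DIS 13-class order; the keyword names (weight, mode) and the 2.0 threshold used to list the up-weighted classes are assumptions for illustration and should be checked against RESOLVED_headline_config.py.

```python
# Standard S3DIS 13-class order, aligned index-by-index with the CE weights.
S3DIS_CLASSES = [
    "ceiling", "floor", "wall", "beam", "column", "window", "door",
    "table", "chair", "sofa", "bookcase", "board", "clutter",
]
CE_WEIGHTS = [0.65, 0.65, 0.85, 8.0, 5.0, 2.9, 4.2, 0.95, 0.85, 1.6, 1.1, 1.7, 1.45]

# Sketch of the final-stage override; keyword names are assumptions.
criteria = [
    dict(type="CrossEntropyLoss", weight=CE_WEIGHTS, loss_weight=1.0),
    dict(type="LovaszLoss", mode="multiclass", loss_weight=1.0),
]

weight_by_class = dict(zip(S3DIS_CLASSES, CE_WEIGHTS))

# Illustrative threshold: classes weighted above 2.0, heaviest first.
up_weighted = sorted(
    (name for name, w in weight_by_class.items() if w > 2.0),
    key=weight_by_class.get,
    reverse=True,
)
print(up_weighted)  # ['beam', 'column', 'door', 'window']
```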

Files

  • weights/bipt_w1a1_s3dis_best_6805.pth - headline W1A1 checkpoint (mIoU 0.6805).
  • configs/ - the full reproduction chain (Pointcept config format):
    • pami_pycut_w1a1_rare_edge_spcontrast_semcontrast_4gpu_20260505_040617.py - chain-root base.
    • pami_pycut_w1a1_chonggao_protected_1gpu_20260601.py - stage 1.
    • pami_pycut_w1a1_stage3_sem_strong_protected_20260604.py - stage 2 (sem SupCon λ=0.01, 30 ep).
    • pami_pycut_w1a1_weak_rare_ce2x_protected_20260609.py - stage 3 (2× rare-class CE, 20 ep).
    • pami_pycut_w1a1_push_w1_warm20_30ep_protected_20260611.py - final/headline stage (30 ep).
    • RESOLVED_headline_config.py - fully-resolved config dumped by the trainer for the headline run.
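    • pami_pycut_ptv3fp_dual_valfix_protected_20260605.py / ..._40ep_...py - FP32 PTv3 teacher configs that share the superpoint recipe (see the note on the FP32 teacher configs below; the archived run is not a fair recipe-matched teacher).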
  • logs/*.besttail.txt - best-mIoU / val-result tail of each stage's train.log.
  • code/ - core reproduction source (1.9 MB, no datasets/exp outputs):
    • pointcept_framework/pointcept/models/quantization/ - the binarization core: binary_layers.py (BiLinearLSR 1-bit linear with learned-scale-rescaling + straight-through sign, plus a JIT-compiled bit-packed CUDA GEMM in backend/ used at inference; falls back to PyTorch simulation when CUDA build fails) and quant_utils.py::convert_ptv3_to_bi_ptv3 (the runtime PTv3→Bi-PTv3 surgery).
    • pointcept_framework/pointcept/models/point_transformer_v3/ - the PTv3 backbone that the surgery binarizes.
    • pointcept_framework/{pointcept,tools,configs/s3dis,configs/_base_} - the trimmed Pointcept library + tools + S3DIS configs needed to load and evaluate.
    • superpoint_ops/ - superpoint construction (cut-pursuit + kNN/normal-edge graph) backing the superpoint-guided contrastive losses.
    • min_repro_server_pack/ - smallest server-side eval recipe (MIN_REPRO_README.md, run_wc1_quick.sh, run_area5_eval.sh) for sanity-checking the checkpoint without retraining.
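
Bit-packed binary GEMMs of the kind mentioned for backend/ typically rest on one identity: for two length-n vectors with entries in {-1, +1}, packed so that bit i is 1 where the entry is +1, the dot product equals n - 2 * popcount(a XOR b). The pure-Python sketch below illustrates that identity only; it is not the archived CUDA kernel, and the function names are hypothetical.

```python
def pack_signs(values):
    """Pack +1/-1 values into an int: bit i is set where values[i] is +1."""
    bits = 0
    for i, v in enumerate(values):
        if v > 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1,+1} vectors from their packed bits."""
    mismatches = bin(a_bits ^ b_bits).count("1")  # positions where signs differ
    return n - 2 * mismatches


def binary_matmul(rows, cols):
    """Multiply a {-1,+1} matrix (list of rows) by one given as a list of columns."""
    n = len(rows[0])
    packed_rows = [pack_signs(r) for r in rows]
    packed_cols = [pack_signs(c) for c in cols]
    return [[binary_dot(r, c, n) for c in packed_cols] for r in packed_rows]


a = [+1, -1, +1, +1]
b = [+1, +1, -1, +1]
print(binary_dot(pack_signs(a), pack_signs(b), len(a)))  # 0
```

Each matching sign contributes +1 and each mismatch contributes -1, so counting mismatches with XOR and popcount replaces n multiply-accumulates with a handful of integer operations per machine word.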

Training recipe (warm-start chain)

The headline W1A1 model is the endpoint of a 4-stage warm-start chain, each stage resuming from the previous stage's best/last checkpoint:

stage1 chonggao (1gpu)
  -> stage2 sem_strong  (30 ep, semantic_contrastive_weight=0.01)
  -> stage3 rare_ce2x   (20 ep, 2x CE weight on beam/column/door/window + Lovasz)
  -> stage4 push_w1_warm20 (30 ep)  ==> best mIoU 0.6805
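
The chain above can also be written down as data, which makes the hand-off between stages explicit. In this sketch the config filenames and epoch counts are taken from this README, while the stage labels, the exp/<label>/model/model_last.pth layout, and the choice of model_last.pth for every hand-off are assumptions (only the final stage is stated to start from the stage-3 model_last.pth). Stage 1's starting weights and epoch count are not stated here, so they are left as None.

```python
# (stage label, config file under configs/, epochs if stated in this README)
CHAIN = [
    ("stage1_chonggao", "pami_pycut_w1a1_chonggao_protected_1gpu_20260601.py", None),
    ("stage2_sem_strong", "pami_pycut_w1a1_stage3_sem_strong_protected_20260604.py", 30),
    ("stage3_rare_ce2x", "pami_pycut_w1a1_weak_rare_ce2x_protected_20260609.py", 20),
    ("stage4_push_w1_warm20", "pami_pycut_w1a1_push_w1_warm20_30ep_protected_20260611.py", 30),
]


def warm_start_plan(chain, exp_root="exp"):
    """Pair each stage with the checkpoint it would be warm-started from."""
    plan = []
    previous_ckpt = None  # stage 1: starting weights not stated in this README
    for label, config, epochs in chain:
        plan.append(
            {"stage": label, "config": config, "epochs": epochs, "weight": previous_ckpt}
        )
        # Hypothetical output layout; adjust to the real experiment directories.
        previous_ckpt = f"{exp_root}/{label}/model/model_last.pth"
    return plan


for step in warm_start_plan(CHAIN):
    print(step["stage"], "<-", step["weight"])
```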

Compute side: Pointcept on the Haozhe Vepfs tree, launched via the poplab→Volcano run.py launcher. Dataset: S3DIS, Area-5 held out for validation.

Note on the FP32 teacher configs

pami_pycut_ptv3fp_dual_* are FP32 (quantize=False) PTv3 runs sharing the superpoint recipe. The archived 20-epoch single-stage FP32 run reached only ~0.444 mIoU because it used a fraction of the student's multi-stage training budget and a different warm-start point. It is therefore not a fair recipe-matched teacher and should not be cited as one. A budget-matched multi-stage FP32 run is the correct comparison.
