sapiens2-pose-5b INT4-G128

Packed 4-bit derivative of facebook/sapiens2-pose-5b.

This artifact stores large floating-point weight tensors as symmetric per-group INT4 values with a group size of 128. Norms, biases, positional/RoPE tensors, and small tensors are kept in their source dtype. It is a storage format with a dequantizing runtime loader for the current official Sapiens2 code path, not an AWQ, GGUF, or NVFP4 LLM artifact.
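
The scheme described above can be illustrated with a minimal NumPy sketch of symmetric per-group INT4 quantization. The function names, the -7..7 integer range, the float32 scales, and the low-nibble-first packing order are all assumptions made for illustration; the actual on-disk layout is defined by load_sapiens2_int4.py and may differ.

```python
import numpy as np

GROUP_SIZE = 128  # group size stated for this artifact


def quantize_int4_symmetric(weights, group_size=GROUP_SIZE):
    """Hypothetical helper: pack a float array into INT4 with one scale per group."""
    flat = np.asarray(weights, dtype=np.float32).reshape(-1)
    if flat.size % group_size != 0:
        raise ValueError("tensor size must be a multiple of the group size")
    groups = flat.reshape(-1, group_size)
    # Symmetric scheme: no zero point; the largest magnitude in a group maps to +/-7.
    max_abs = np.abs(groups).max(axis=1, keepdims=True)
    scales = np.where(max_abs > 0, max_abs / 7.0, 1.0).astype(np.float32)
    q = np.clip(np.rint(groups / scales), -7, 7).astype(np.int8)
    # Shift to unsigned nibbles (1..15) and pack two values per byte (assumed order).
    nibbles = (q + 8).astype(np.uint8).reshape(-1)
    packed = (nibbles[0::2] | (nibbles[1::2] << 4)).astype(np.uint8)
    return packed, scales.reshape(-1)


def dequantize_int4_symmetric(packed, scales, group_size=GROUP_SIZE):
    """Hypothetical helper: unpack nibbles and rescale each group to float32."""
    q = np.empty(packed.size * 2, dtype=np.int8)
    q[0::2] = (packed & 0x0F).astype(np.int8) - 8
    q[1::2] = (packed >> 4).astype(np.int8) - 8
    groups = q.reshape(-1, group_size).astype(np.float32)
    return (groups * scales[:, None]).reshape(-1)
```

Under these assumptions, each restored value differs from the original by at most half of its group's scale, which is why a smaller group size generally trades extra scale storage for lower error.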

Files

  • facebook__sapiens2-pose-5b-int4-g128.safetensors: packed INT4 safetensors artifact.
  • load_sapiens2_int4.py: loader that reconstructs a PyTorch state dict for the official Sapiens2 model code.
  • config.json and preprocessor_config.json: copied from the source repo.
  • quantization_report.json: build report.

Quantization Report

  • Source revision: dfc4f28877b041f50e9c55663bf29ed38f9c18a1
  • Group size: 128
  • Source bytes: 20480794652
  • Artifact bytes: 2647416860
  • Compression ratio: 7.7361x
  • Tensors: 968
  • Quantized tensors: 344
  • Max per-tensor MAE in the dequantization smoke test: 0.00373528
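
As a quick arithmetic check, the compression ratio follows directly from the two byte counts in the report. The sketch below only reuses the numbers listed above; the helper name is illustrative.

```python
SOURCE_BYTES = 20_480_794_652    # "Source bytes" from the report
ARTIFACT_BYTES = 2_647_416_860   # "Artifact bytes" from the report
TOTAL_TENSORS = 968
QUANTIZED_TENSORS = 344


def compression_ratio(source_bytes, artifact_bytes):
    """Ratio of source checkpoint size to packed artifact size."""
    return source_bytes / artifact_bytes


ratio = compression_ratio(SOURCE_BYTES, ARTIFACT_BYTES)
quantized_share = QUANTIZED_TENSORS / TOTAL_TENSORS

print(f"compression ratio: {ratio:.4f}x")
print(f"share of tensors quantized: {quantized_share:.1%}")
```

Only about a third of the tensors are quantized by count, yet the ratio is close to 8x, which is consistent with the quantized tensors being the large weight matrices that dominate the checkpoint size.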

Fidelity Validation

The packed INT4 artifact was dequantized back to floating-point tensors and compared against the source checkpoint.

  • Validation gate: global floating-tensor similarity >= 90.00%
  • Result: PASS
  • Global floating-tensor similarity: 99.327831%
  • Minimum large-tensor cosine: 0.986056328
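
For readers who want to reproduce a comparison of this kind, the sketch below computes two common per-tensor metrics, mean absolute error and cosine similarity, between a source tensor and its dequantized counterpart. It is an assumption that the report uses these exact definitions; in particular, how the "global floating-tensor similarity" figure is aggregated is not stated here, so it is not reproduced.

```python
import numpy as np


def tensor_mae(source, restored):
    """Mean absolute error between two same-shaped tensors."""
    a = np.asarray(source, dtype=np.float64)
    b = np.asarray(restored, dtype=np.float64)
    return float(np.mean(np.abs(a - b)))


def tensor_cosine(source, restored):
    """Cosine similarity between two tensors, treated as flat vectors."""
    a = np.asarray(source, dtype=np.float64).reshape(-1)
    b = np.asarray(restored, dtype=np.float64).reshape(-1)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # Convention chosen for this sketch: two all-zero tensors count as identical.
        return 1.0 if not a.any() and not b.any() else 0.0
    return float(np.dot(a, b) / denom)
```

A validation loop would call both functions on every quantized tensor and track the maximum MAE and the minimum cosine across tensors.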

Loading

from load_sapiens2_int4 import load_state_dict

state_dict = load_state_dict("facebook__sapiens2-pose-5b-int4-g128.safetensors", device="cpu")
# Then instantiate the matching official Sapiens2 architecture and load:
# model.load_state_dict(state_dict, strict=True)
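
Because strict=True makes PyTorch raise a RuntimeError when keys are missing or unexpected, it can help to compare key sets before loading. The helper below is a hypothetical, framework-free sketch; it would be called with model.state_dict().keys() and state_dict.keys(), where model is whatever Sapiens2 module you instantiated.

```python
def diff_state_dict_keys(expected_keys, provided_keys):
    """Return (missing, unexpected) key lists, each sorted for stable output.

    missing:    keys the model expects but the loaded state dict lacks.
    unexpected: keys in the loaded state dict that the model does not define.
    """
    expected = set(expected_keys)
    provided = set(provided_keys)
    missing = sorted(expected - provided)
    unexpected = sorted(provided - expected)
    return missing, unexpected


# Example with made-up key names (not real Sapiens2 parameter names):
missing, unexpected = diff_state_dict_keys(
    ["backbone.weight", "head.weight", "head.bias"],
    ["backbone.weight", "head.weight", "extra.scale"],
)
print("missing:", missing)
print("unexpected:", unexpected)
```

If both lists are empty, the key sets match; shape mismatches can still cause a strict load to fail.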

Limitations

This is a verified packed-weight artifact with a dequantizing loader. It does not yet provide native INT4 CUDA kernels for Sapiens2. Runtime speedups would require a Sapiens2-specific kernel or export path and should be benchmarked separately.
