sapiens2-pose-5b INT4-G128

Packed 4-bit derivative of facebook/sapiens2-pose-5b.

This artifact stores large floating-point weight tensors as symmetric per-group INT4 values with a group size of 128. Norms, biases, positional/RoPE tensors, and small tensors are kept in their source dtype. It is a storage format with a runtime loader for the current official Sapiens2 code path, not an AWQ, GGUF, or NVFP4 LLM artifact.
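As a rough illustration of what symmetric per-group INT4 packing means, here is a minimal pure-Python sketch for one group of 128 values. It is not the code in load_sapiens2_int4.py: the function names, the code range of [-7, 7], and the low-nibble-first byte layout are all assumptions made for this example, and the actual on-disk layout is whatever the shipped loader expects.

```python
GROUP_SIZE = 128  # group size stated in this card


def quantize_group(values):
    """Symmetric INT4: one scale per group, integer codes clamped to [-7, 7].

    The [-7, 7] range is an assumption; a real packer might use [-8, 7].
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(v / scale))) for v in values]
    return scale, codes


def pack_nibbles(codes):
    """Pack two signed 4-bit codes per byte, low nibble first (assumed order)."""
    if len(codes) % 2:
        raise ValueError("need an even number of codes to pack")
    packed = bytearray()
    for lo, hi in zip(codes[0::2], codes[1::2]):
        packed.append((lo & 0xF) | ((hi & 0xF) << 4))
    return bytes(packed)


def unpack_nibbles(packed):
    """Invert pack_nibbles, sign-extending each 4-bit value."""
    codes = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            codes.append(nibble - 16 if nibble >= 8 else nibble)
    return codes


def dequantize_group(scale, packed):
    """Reconstruct approximate floating-point values for one group."""
    return [scale * code for code in unpack_nibbles(packed)]
```

Under these assumptions a group of 128 weights occupies 64 bytes plus one scale, and each reconstructed value differs from the original by at most half the group's scale.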

Files

facebook__sapiens2-pose-5b-int4-g128.safetensors: packed INT4 safetensors artifact.
load_sapiens2_int4.py: loader that reconstructs a PyTorch state dict for the official Sapiens2 model code.
config.json and preprocessor_config.json: copied from the source repo.
quantization_report.json: build report.

Quantization Report

Source revision: dfc4f28877b041f50e9c55663bf29ed38f9c18a1
Group size: 128
Source bytes: 20480794652
Artifact bytes: 2647416860
Compression ratio: 7.7361x
Tensors: 968
Quantized tensors: 344
Max per-tensor MAE in dequantization smoke test: 0.00373528
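The compression ratio can be rechecked from the two byte counts in the report. The short sketch below does that arithmetic. The `effective_bits_per_weight` helper is hypothetical and only illustrates how per-group scales add overhead on top of 4 bits; the 16-bit scale width used in the example call is an assumption, not something this card states.

```python
# Byte counts copied from the report above.
source_bytes = 20_480_794_652
artifact_bytes = 2_647_416_860

ratio = source_bytes / artifact_bytes
print(f"Compression ratio: {ratio:.4f}x")  # prints "Compression ratio: 7.7361x"


def effective_bits_per_weight(group_size, scale_bits):
    """Bits per quantized weight once one scale per group is included.

    Hypothetical helper: the scale width is an assumption.
    """
    return 4 + scale_bits / group_size


# Example: group size 128 with an assumed 16-bit scale per group.
bits = effective_bits_per_weight(128, 16)
```

The measured ratio also reflects the tensors kept in their source dtype, so it is not expected to match the per-weight figure exactly.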

Fidelity Validation

The packed INT4 artifact was dequantized back to floating-point tensors and compared against the source checkpoint.

Validation gate: global floating-tensor similarity >= 90.00%
Result: PASS
Global floating-tensor similarity: 99.327831%
Minimum large-tensor cosine: 0.986056328
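For readers who want to run a similar comparison themselves, the following is a minimal sketch of per-tensor cosine similarity and mean absolute error on flat lists of floats, plus a threshold check in the style of the gate above. This card does not define how the global floating-tensor similarity is aggregated, so the sketch does not try to reproduce that number; the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Treat two all-zero tensors as identical, otherwise as dissimilar.
        return 1.0 if norm_a == norm_b else 0.0
    return dot / (norm_a * norm_b)


def mean_abs_error(a, b):
    """Mean absolute elementwise difference between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def passes_gate(similarity_percent, threshold_percent=90.0):
    """Return True when a similarity score meets the validation threshold."""
    return similarity_percent >= threshold_percent
```

In practice each source tensor and its dequantized counterpart would be flattened and compared pairwise, with the minimum cosine and maximum MAE tracked across tensors.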

Loading

from load_sapiens2_int4 import load_state_dict

state_dict = load_state_dict("facebook__sapiens2-pose-5b-int4-g128.safetensors", device="cpu")
# Then instantiate the matching official Sapiens2 architecture and load:
# model.load_state_dict(state_dict, strict=True)

Limitations

This is a verified packed-weight artifact with a dequantizing loader. It does not yet provide native INT4 CUDA kernels for Sapiens2. Runtime speedups would require a Sapiens2-specific kernel or export path and should be benchmarked separately.

