sapiens2-pose-0.8b INT4-G128

Packed 4-bit derivative of facebook/sapiens2-pose-0.8b.

This artifact stores large floating-point weight tensors as symmetric per-group INT4 values with a group size of 128. Norms, biases, positional/RoPE tensors, and small tensors are kept in their source dtype. It is a storage and runtime-loader quantization for the current official Sapiens2 code path, not an AWQ, GGUF, or NVFP4 LLM artifact.
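
To make the scheme concrete, here is a minimal pure-Python sketch of symmetric per-group INT4 quantization for one group of 128 values. It is illustrative only: the code range of [-7, 7], the low-nibble-first byte layout, and all function names are assumptions, and the actual layout used by load_sapiens2_int4.py may differ.

```python
import random

GROUP_SIZE = 128  # group size stated for this artifact
QMAX = 7          # assumed symmetric code range [-7, 7]

def quantize_group(values):
    """Return (codes, scale) for one group using a single symmetric scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / QMAX if max_abs > 0 else 1.0
    codes = [max(-QMAX, min(QMAX, round(v / scale))) for v in values]
    return codes, scale

def pack_nibbles(codes):
    """Pack two signed 4-bit codes per byte, low nibble first (assumed layout)."""
    out = bytearray()
    for lo, hi in zip(codes[0::2], codes[1::2]):
        out.append((lo & 0x0F) | ((hi & 0x0F) << 4))
    return bytes(out)

def unpack_nibbles(packed):
    """Inverse of pack_nibbles, sign-extending each 4-bit value."""
    codes = []
    for byte in packed:
        for nibble in (byte & 0x0F, byte >> 4):
            codes.append(nibble - 16 if nibble > 7 else nibble)
    return codes

def dequantize_group(codes, scale):
    """Map integer codes back to floating-point values."""
    return [c * scale for c in codes]

# Round-trip one synthetic group of weights.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(GROUP_SIZE)]
codes, scale = quantize_group(weights)
packed = pack_nibbles(codes)
restored = dequantize_group(unpack_nibbles(packed), scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

In this sketch, 128 weights pack into 64 bytes plus one scale, and the round-trip error of each value is bounded by half the group scale.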

Files

facebook__sapiens2-pose-0.8b-int4-g128.safetensors: packed INT4 safetensors artifact.
load_sapiens2_int4.py: loader that reconstructs a PyTorch state dict for the official Sapiens2 model code.
config.json and preprocessor_config.json: copied from the source repo.
quantization_report.json: build report.

Quantization Report

Source revision: 2d5be15b2cdc2bac5d9ea8a61849c3ba2c52f42a
Group size: 128
Source bytes: 3394819360
Artifact bytes: 439862416
Compression ratio: 7.7179x
Tensors: 560
Quantized tensors: 200
Max per-tensor MAE during dequantization smoke test: 0.00429335
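
The compression ratio follows directly from the two byte counts above. A short check, using only the figures from this report (the variable names are illustrative):

```python
source_bytes = 3394819360    # "Source bytes" from the report
artifact_bytes = 439862416   # "Artifact bytes" from the report

ratio = source_bytes / artifact_bytes
saved_fraction = 1 - artifact_bytes / source_bytes

print(f"{ratio:.4f}x smaller, {saved_fraction:.1%} of the source size saved")
```

The ratio stays a little below the ideal for pure 4-bit storage because 360 of the 560 tensors are left unquantized and each quantized group carries its own scale.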

Fidelity Validation

The packed INT4 artifact was dequantized back to floating-point tensors and compared against the source checkpoint.

Validation gate: global floating-tensor similarity >= 90.00%
Result: PASS
Global floating-tensor similarity: 99.425898%
Minimum large-tensor cosine: 0.974259198
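
The exact definition of "global floating-tensor similarity" is not given in this card. As a rough sketch of the kind of per-tensor check involved, the following computes cosine similarity and mean absolute error between a source tensor and its dequantized counterpart, both flattened to lists. The sample values are made up for illustration, and using cosine similarity against the 90% gate is an assumption.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length flattened tensors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mean_absolute_error(a, b):
    """Average absolute element-wise difference."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

GATE = 0.90  # validation gate from this card, expressed as a fraction

# Hypothetical values standing in for a source tensor and its dequantized copy.
source = [0.10, -0.25, 0.40, 0.05]
dequantized = [0.11, -0.23, 0.40, 0.06]

cos = cosine_similarity(source, dequantized)
mae = mean_absolute_error(source, dequantized)
passed = cos >= GATE
```

A real check would run this over every floating-point tensor in the checkpoint and aggregate the results, for example by taking the minimum cosine over the large tensors as reported above.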

Loading

from load_sapiens2_int4 import load_state_dict

state_dict = load_state_dict("facebook__sapiens2-pose-0.8b-int4-g128.safetensors", device="cpu")
# Then instantiate the matching official Sapiens2 architecture and load:
# model.load_state_dict(state_dict, strict=True)

Limitations

This is a verified packed-weight artifact with a dequantizing loader. It does not yet provide native INT4 CUDA kernels for Sapiens2. Runtime speedups would require a Sapiens2-specific kernel or export path and should be benchmarked separately.

