⚡ Introduction
Sapiens-Lite is our optimized "inference-only" solution, offering:
- Up to 4x faster inference
- Minimal dependencies
- Negligible accuracy loss
🚀 Getting Started
Set the sapiens_lite code root.
```shell
export SAPIENS_LITE_ROOT=$SAPIENS_ROOT/lite
```

We support lite-inference for multiple GPU architectures, primarily in two modes.
- `MODE=torchscript`: All GPUs with PyTorch 2.2+. Inference at `float32`; slower, but closest to original model performance.
- `MODE=bfloat16`: Optimized mode for A100 GPUs with PyTorch 2.3. Uses FlashAttention for accelerated inference. Coming soon!
Note to Windows users: please use the Python scripts in `./demo` instead of `./scripts`.

Please download the checkpoints from Hugging Face.
Checkpoints are suffixed with "_$MODE.pt2".
You can be selective about only downloading the checkpoints of interest.
Set `$SAPIENS_LITE_CHECKPOINT_ROOT` to the path of `sapiens_lite_host/$MODE`. Checkpoint directory structure:

```
sapiens_lite_host/
├── torchscript
│   ├── pretrain/
│   │   └── checkpoints/
│   │       ├── sapiens_0.3b/
│   │       ├── sapiens_0.6b/
│   │       ├── sapiens_1b/
│   │       └── sapiens_2b/
│   ├── pose/
│   ├── seg/
│   ├── depth/
│   └── normal/
└── bfloat16
    ├── pretrain/
    ├── pose/
    ├── seg/
    ├── depth/
    └── normal/
```
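Putting the suffix rule and the directory layout together, the expected location of a checkpoint can be assembled programmatically. This is an illustrative sketch only: the helper name is made up, and it assumes every task directory mirrors the `checkpoints/<model>/` layout shown for `pretrain`.

```python
def lite_checkpoint_path(checkpoint_root: str, task: str, model: str, mode: str) -> str:
    """Build the expected path of a Sapiens-Lite checkpoint (hypothetical helper).

    checkpoint_root: value of $SAPIENS_LITE_CHECKPOINT_ROOT, i.e. sapiens_lite_host/$MODE
    task:  one of "pretrain", "pose", "seg", "depth", "normal"
    model: e.g. "sapiens_1b"
    mode:  "torchscript" or "bfloat16" -- checkpoints are suffixed with "_$MODE.pt2"
    """
    # Assumption: each task directory contains checkpoints/<model>/ like pretrain does.
    return f"{checkpoint_root}/{task}/checkpoints/{model}/{model}_{mode}.pt2"

print(lite_checkpoint_path("sapiens_lite_host/torchscript", "pretrain",
                           "sapiens_1b", "torchscript"))
# sapiens_lite_host/torchscript/pretrain/checkpoints/sapiens_1b/sapiens_1b_torchscript.pt2
```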
🔧 Installation
Set up the minimal sapiens_lite conda environment (pytorch >= 2.2):

```shell
conda create -n sapiens_lite python=3.10
conda activate sapiens_lite
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install opencv-python tqdm json-tricks
```
🔥 Sapiens-Lite Inference
Note: For inference in bfloat16 mode:
- Outputs may show slight variations from the original `float32` predictions.
- The first model run will autotune the model and print the log; subsequent runs automatically load the tuned model.
- Due to `torch.compile` warmup iterations, you'll observe better speedups with a larger number of images, thanks to amortization.
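The amortization point can be made concrete with a little arithmetic: the one-time autotune/warmup cost is spread across every image processed, so the effective per-image time approaches the steady-state time as the image count grows. The numbers below are invented for illustration, not measurements.

```python
def effective_per_image_ms(warmup_ms: float, per_image_ms: float, num_images: int) -> float:
    """Average cost per image once a one-time torch.compile warmup is amortized."""
    return (warmup_ms + per_image_ms * num_images) / num_images

# Hypothetical numbers: 30 s warmup, then 20 ms per image in steady state.
for n in (10, 100, 1000):
    print(n, effective_per_image_ms(30_000, 20, n))
# 10 3020.0
# 100 320.0
# 1000 50.0
```

With only 10 images the warmup dominates; at 1000 images the average is already close to the steady-state 20 ms, which is why larger batches of images show better speedups.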
Available tasks:
- Pose estimation (`pose`)
- Segmentation (`seg`)
- Depth estimation (`depth`)
- Surface-normal estimation (`normal`)
⚙️ Converting Models to Lite
Obtain a `torch.ExportedProgram` or a TorchScript module from an existing Sapiens model checkpoint. Note: this requires the full-install `sapiens` conda environment.
```shell
cd $SAPIENS_ROOT/scripts/[pretrain,pose,seg]/optimize/local
./[feature_extracter,keypoints*,seg,depth,normal]_optimizer.sh
```
For inference:
- Use `demo.AdhocImageDataset` wrapped with a `DataLoader` for image fetching and preprocessing.
- Use the `WorkerPool` class for multiprocessing capabilities in tasks like saving predictions and visualizations.
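The pipeline shape described above, batched loading feeding inference on the main loop while saving is offloaded to a pool, can be sketched with standard-library stand-ins. `AdhocImageDataset` and `WorkerPool` are the repo's own classes; the functions below are placeholders that only illustrate the pattern, not their actual signatures.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_batches(paths, batch_size=2):
    """Stand-in for demo.AdhocImageDataset wrapped in a DataLoader: yields batches."""
    for i in range(0, len(paths), batch_size):
        yield paths[i:i + batch_size]

def run_model(batch):
    """Stand-in for lite-model inference on a batch."""
    return [f"pred:{p}" for p in batch]

def save_prediction(pred):
    """Stand-in for the save/visualization work handed to WorkerPool."""
    return f"saved {pred}"

# Inference stays on the main loop; saving runs concurrently in the pool,
# so the loop is never blocked on disk I/O.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = []
    for batch in fake_batches(["a.jpg", "b.jpg", "c.jpg"]):
        for pred in run_model(batch):
            futures.append(pool.submit(save_prediction, pred))
    results = [f.result() for f in futures]

print(results)
# ['saved pred:a.jpg', 'saved pred:b.jpg', 'saved pred:c.jpg']
```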