---
license: mit
tags:
  - pose-estimation
  - 6d-pose
  - vision-transformer
  - spacecraft
  - space
  - vit
datasets:
  - SPEED
pipeline_tag: image-classification
---

FastPose-ViT: Pretrained Weights

Pretrained weights for FastPose-ViT, a Vision Transformer pipeline for real-time 6D spacecraft pose estimation.

Available Weights

All models are trained on the SPEED dataset.

| File | Model | Task | Input resolution |
|---|---|---|---|
| `vit_b_16_384.pth` | ViT-B/16-384 | Pose estimation (6D) | 384x384 |
| `vit_b_16.pth` | ViT-B/16 | Pose estimation (6D) | 224x224 |
| `small.pth` | LW-DETR Small | Object detection (bbox) | 512x512 |

Usage

Clone the repository:

git clone https://github.com/PierreAncey/FastPose-ViT.git
cd FastPose-ViT

Download the weights and place them in a `weights/` directory inside the cloned repository.
Run evaluation on the SPEED dataset:

DATASET=SPEED_FIXED && \
python3 src/evaluate.py \
  --model_weights weights/vit_b_16_384.pth \
  --rotation_format matrix \
  --num_hidden_layers 0 \
  --hidden_layer_dim 0 \
  --nb_class_tokens 1 \
  --batch_size 8 \
  --vit_model vit_b_16_384 \
  --dataset SPEED \
  --dataset_root_dir $DATASET \
  --num_workers 8 \
  --merge_outputs \
  --no_mlp

Run the object detector:

DATASET=SPEED_FIXED && \
python3 object_detector/evaluate.py \
  --dataset_root_dir $DATASET \
  --model_variant small \
  --model_weights weights/small.pth

Model Details

Pose estimator: ViT backbone that directly regresses the 6-DoF pose (rotation matrix and translation vector). Rotations use the 6D continuous representation, which is mapped to a valid rotation matrix by Gram-Schmidt orthogonalization.
Object detector: LW-DETR (Lightweight DETR) fine-tuned from COCO-pretrained weights for single-class spacecraft detection. Provides bounding boxes as preprocessing for the pose estimator.
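
The Gram-Schmidt step mentioned for the pose estimator can be illustrated with a minimal sketch. This is not the repository's code: the function name `rotation_from_6d`, the use of plain Python lists instead of PyTorch tensors, and the convention that the two predicted 3-vectors become the first two columns of the matrix are all assumptions made for illustration.

```python
import math


def _normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm < 1e-12:
        raise ValueError("cannot normalize a near-zero vector")
    return [x / norm for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def rotation_from_6d(rep6d):
    """Map a 6-vector (a1, a2) to a 3x3 rotation matrix (hypothetical helper).

    Assumption: a1 and a2 are treated as unnormalized first and second
    columns; the actual repository may use a row convention instead.
    """
    a1, a2 = list(rep6d[:3]), list(rep6d[3:])
    b1 = _normalize(a1)
    # Remove the component of a2 along b1, then normalize (Gram-Schmidt).
    proj = _dot(b1, a2)
    b2 = _normalize([y - proj * x for x, y in zip(b1, a2)])
    # The third column is fixed by the right-handed cross product.
    b3 = _cross(b1, b2)
    # Assemble a row-major matrix whose columns are b1, b2, b3.
    return [[b1[i], b2[i], b3[i]] for i in range(3)]


if __name__ == "__main__":
    for row in rotation_from_6d([0.3, -1.2, 0.5, 2.0, 0.1, -0.7]):
        print(row)
```

Because the third column comes from a cross product of two orthonormal vectors, the result is orthonormal with determinant +1 for any input whose two halves are not parallel, which is why a network can regress the six numbers freely without a separate constraint.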

Citation

@InProceedings{Ancey_2026_WACV,
    author    = {Ancey, Pierre and Price, Andrew and Javed, Saqib and Salzmann, Mathieu},
    title     = {FastPose-ViT: A Vision Transformer for Real-Time Spacecraft Pose Estimation},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {March},
    year      = {2026},
    pages     = {7873-7882}
}

License

MIT License. See the repository for details.