---
license: mit
tags:
- pose-estimation
- 6d-pose
- vision-transformer
- spacecraft
- space
- vit
datasets:
- SPEED
pipeline_tag: image-classification
---

# FastPose-ViT: Pretrained Weights

Pretrained weights for **[FastPose-ViT](https://github.com/PierreAncey/FastPose-ViT)**, a Vision Transformer pipeline for real-time 6D spacecraft pose estimation.

## Available Weights

All models are trained on the **SPEED** dataset.

| File | Model | Task | Input Resolution |
|------|-------|------|------------------|
| `vit_b_16_384.pth` | ViT-B/16-384 | Pose estimation (6D) | 384x384 |
| `vit_b_16.pth` | ViT-B/16 | Pose estimation (6D) | 224x224 |
| `small.pth` | LW-DETR Small | Object detection (bbox) | 512x512 |

## Usage

1. Clone the repository:

   ```bash
   git clone https://github.com/PierreAncey/FastPose-ViT.git
   cd FastPose-ViT
   ```

2. Download weights and place them in a `weights/` directory.

3. Run evaluation on the SPEED dataset:

   ```bash
   DATASET=SPEED_FIXED && \
   python3 src/evaluate.py \
       --model_weights weights/vit_b_16_384.pth \
       --rotation_format matrix \
       --num_hidden_layers 0 \
       --hidden_layer_dim 0 \
       --nb_class_tokens 1 \
       --batch_size 8 \
       --vit_model vit_b_16_384 \
       --dataset SPEED \
       --dataset_root_dir $DATASET \
       --num_workers 8 \
       --merge_outputs \
       --no_mlp
   ```

4. Run the object detector:

   ```bash
   DATASET=SPEED_FIXED && \
   python3 object_detector/evaluate.py \
       --dataset_root_dir $DATASET \
       --model_variant small \
       --model_weights weights/small.pth
   ```

## Model Details

- **Pose estimator**: ViT backbone with direct 6D pose regression (rotation matrix + translation vector). Uses 6D continuous rotation representation with Gram-Schmidt orthogonalization.
- **Object detector**: LW-DETR (Lightweight DETR) fine-tuned from COCO-pretrained weights for single-class spacecraft detection. Provides bounding boxes as preprocessing for the pose estimator.
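
The Gram-Schmidt step behind the 6D continuous rotation representation can be illustrated with a short, dependency-free sketch. It is not taken from the FastPose-ViT code: the function name `rotation_6d_to_matrix` is hypothetical, and the convention used here (the six values are two 3-vectors that become the first two columns of the matrix) is an assumption that may differ from the repository's layout.

```python
import math


def _normalize(v):
    """Return v scaled to unit length; reject zero-length input."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return [x / norm for x in v]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def rotation_6d_to_matrix(six_d):
    """Map six raw values (a1, a2) to a 3x3 rotation matrix.

    Hypothetical helper; assumes a1 and a2 are the un-normalized first
    and second columns of the rotation matrix.
    """
    if len(six_d) != 6:
        raise ValueError("expected exactly 6 values")
    a1, a2 = list(six_d[:3]), list(six_d[3:])

    # Gram-Schmidt: normalize a1, then remove its component from a2.
    b1 = _normalize(a1)
    proj = _dot(b1, a2)
    b2 = _normalize([x - proj * y for x, y in zip(a2, b1)])

    # The third column is fixed by the right-hand rule.
    b3 = _cross(b1, b2)

    # Assemble the matrix row by row, with b1, b2, b3 as columns.
    return [[b1[i], b2[i], b3[i]] for i in range(3)]
```

For example, `rotation_6d_to_matrix([2.0, 0.0, 0.0, 1.0, 3.0, 0.0])` yields the identity matrix: the first vector is rescaled to the x-axis, and the second loses its x-component before being rescaled to the y-axis. Because the output is orthonormal by construction, a network can regress six unconstrained values and still produce a valid rotation.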

## Citation

```bibtex
@InProceedings{Ancey_2026_WACV,
    author    = {Ancey, Pierre and Price, Andrew and Javed, Saqib and Salzmann, Mathieu},
    title     = {FastPose-ViT: A Vision Transformer for Real-Time Spacecraft Pose Estimation},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {March},
    year      = {2026},
    pages     = {7873-7882}
}
```

## License

MIT License. See the [repository](https://github.com/PierreAncey/FastPose-ViT) for details.