SDSTrack — RGB-E Checkpoint for VisEvent

This repository contains the RGB-E (RGB-Event) checkpoint for SDSTrack, a self-distillation symmetric adapter learning tracker for multi-modal single object tracking (CVPR 2024).

Model Details

Attribute	Value
Tracker	SDSTrack (Self-Distillation Symmetric Adapter Learning)
Backbone	ViT-B (Vision Transformer, base size) with MAE pretraining
Modality	RGB-E (RGB + Event camera)
Dataset	VisEvent
Config	`cvpr2024_rgbe`
Training epochs	50
Batch size	16
Learning rate	1e-4
Upstream commit	`822d985` (SDSTrack@main)

Reproduction Results

This checkpoint was independently reproduced as part of the EvTrack project (Pattern Recognition course design, Topic #65).

Corrected metrics (MATLAB-equivalent protocol, absent frames excluded):

Metric	Paper (CVPR 2024)	Reproduction	Delta (percentage points)
Success AUC	~0.597	0.5829	-1.4
Precision @ 20px	~0.767	0.7506	-1.6
SR @ 0.50	—	0.6929	—

Evaluation details:

319/320 VisEvent test sequences evaluated
1 sequence (00331_UAV_outdoor5) excluded because the target is absent in the first frame
See EvTrack experiments/sdstrack for full reproduction docs
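
For orientation, the three metrics in the table above can be computed per sequence roughly as follows. This is a minimal pure-Python sketch, not the EvTrack or MATLAB toolkit code. It assumes boxes are `(x, y, w, h)` in pixels, 21 overlap thresholds from 0 to 1 in steps of 0.05, strict `>` comparisons on overlap, and that frames flagged as absent are simply dropped. All function names are illustrative.

```python
import math


def iou_xywh(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a, b):
    """Euclidean distance in pixels between the centers of two boxes."""
    return math.hypot((a[0] + a[2] / 2) - (b[0] + b[2] / 2),
                      (a[1] + a[3] / 2) - (b[1] + b[3] / 2))


def sequence_metrics(pred, gt, absent):
    """Success AUC, precision at 20 px, and SR at 0.50 for one sequence.

    Frames whose entry in `absent` is truthy are skipped entirely.
    """
    kept = [(p, g) for p, g, skip in zip(pred, gt, absent) if not skip]
    if not kept:
        raise ValueError("no frames left after excluding absent ones")
    ious = [iou_xywh(p, g) for p, g in kept]
    errors = [center_error(p, g) for p, g in kept]
    n = len(kept)

    # Assumed threshold grid: 0, 0.05, ..., 1.0 (21 values).
    thresholds = [i / 20 for i in range(21)]
    success_curve = [sum(v > t for v in ious) / n for t in thresholds]
    return {
        "success_auc": sum(success_curve) / len(success_curve),
        "precision_20px": sum(e <= 20.0 for e in errors) / n,
        "sr_050": sum(v > 0.5 for v in ious) / n,
    }
```

How per-sequence values are combined into the dataset-level numbers (for example, a plain mean over the 319 evaluated sequences) is not specified here; check the EvTrack reproduction docs before comparing against the table.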

Files

File	Size	Description
`SDSTrack_cvpr2024_rgbe.pth.tar`	~490 MB	Trained checkpoint for RGB-E evaluation on VisEvent
`results/vis_event_test/`	~13 MB	Tracker predictions (320 `.txt` files, one per sequence)

Checksums

Algorithm	Hash
SHA256	`b573dec59e9537204efbc131dccae047e27aeb41a26af7fbd4af222c8eaf0b74`
MD5	`bd0c98b7a2ea898d8cfdc3942158b9fa`

Verify with:

sha256sum SDSTrack_cvpr2024_rgbe.pth.tar
md5sum SDSTrack_cvpr2024_rgbe.pth.tar
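
Where `sha256sum` is not available (for example on Windows), the same check can be done with Python's standard library. This is a small sketch that assumes the checkpoint sits in the current directory; the helper names are illustrative, and the expected digest is copied from the table above.

```python
import hashlib
from pathlib import Path

# Expected digest copied from the Checksums table.
EXPECTED_SHA256 = "b573dec59e9537204efbc131dccae047e27aeb41a26af7fbd4af222c8eaf0b74"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True when the file's digest equals the expected digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    checkpoint = Path("SDSTrack_cvpr2024_rgbe.pth.tar")
    if checkpoint.exists():
        print("SHA256 OK" if matches_expected(checkpoint) else "SHA256 MISMATCH")
    else:
        print(f"{checkpoint} not found")
```

Reading in chunks keeps memory use flat for the ~490 MB file.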

Usage

Loading the checkpoint in Python

from huggingface_hub import hf_hub_download
import torch

checkpoint_path = hf_hub_download(
    repo_id="krisspy39/sdstrack-rgbe",
    filename="SDSTrack_cvpr2024_rgbe.pth.tar",
    repo_type="model"
)

checkpoint = torch.load(checkpoint_path, map_location="cpu", weights_only=False)
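
The layout of the loaded object is not documented in this card, so it can help to inspect it before wiring it into a model. The sketch below assumes `checkpoint` is a dict that either is a state dict itself or nests one under a key such as `net`, `state_dict`, or `model`. Those key names are guesses, not confirmed by this repository, and `describe_checkpoint` is a hypothetical helper.

```python
def describe_checkpoint(checkpoint, state_keys=("net", "state_dict", "model")):
    """Summarize a loaded checkpoint without assuming a fixed layout.

    `state_keys` lists candidate names for the nested weights dict; they are
    assumptions, not names confirmed by this repository.
    """
    if not isinstance(checkpoint, dict):
        return {"type": type(checkpoint).__name__}

    summary = {"top_level_keys": sorted(str(k) for k in checkpoint)}

    # Use the first candidate key that holds a dict; otherwise treat the
    # whole object as the state dict.
    state = checkpoint
    for key in state_keys:
        if isinstance(checkpoint.get(key), dict):
            state = checkpoint[key]
            summary["state_dict_key"] = key
            break

    # Count entries that look like tensors (anything exposing numel()).
    tensors = {k: v for k, v in state.items() if hasattr(v, "numel")}
    summary["num_tensors"] = len(tensors)
    summary["num_parameters"] = sum(int(v.numel()) for v in tensors.values())
    return summary
```

Calling `print(describe_checkpoint(checkpoint))` after the `torch.load` line shows the top-level keys and a parameter count.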

Evaluation with upstream code

Clone SDSTrack:

git clone https://github.com/hoqolo/SDSTrack.git
cd SDSTrack

Download the pretrained OSTrack foundation model to ./pretrained/vitb_256_mae_ce_32x4_ep300/OSTrack_ep0300.pth.tar
Symlink or copy this checkpoint to ./models/SDSTrack_cvpr2024_rgbe.pth.tar
Run evaluation:

python ./RGBE_workspace/test_rgbe_mgpus.py \
  --script_name sdstrack \
  --num_gpus 1 \
  --threads 4 \
  --epoch 50 \
  --yaml_name cvpr2024_rgbe

Note: The upstream code requires PyTorch 1.11 + Python 3.8. For PyTorch 2.x compatibility patches, see EvTrack/sdstrack_eval.py.
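
Before launching the evaluation command, a quick pre-flight check can confirm that the files are where the steps above place them. This is an illustrative sketch to run from the SDSTrack repository root; the paths are copied from the steps above, and `missing_files` is a hypothetical helper, not part of the upstream code.

```python
from pathlib import Path

# Relative paths taken from the setup steps above.
REQUIRED_FILES = (
    "pretrained/vitb_256_mae_ce_32x4_ep300/OSTrack_ep0300.pth.tar",
    "models/SDSTrack_cvpr2024_rgbe.pth.tar",
    "RGBE_workspace/test_rgbe_mgpus.py",
)


def missing_files(repo_root=".", required=REQUIRED_FILES):
    """Return the required paths that do not exist under repo_root."""
    root = Path(repo_root)
    # is_file() follows symlinks, so a symlinked checkpoint counts as present.
    return [rel for rel in required if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing before evaluation:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected files are in place.")
```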

Dataset

This checkpoint is trained and evaluated on VisEvent, a large-scale RGB-Event single object tracking benchmark.

Train: 120 sequences
Test: 320 sequences
Data format: Each sequence contains vis_imgs/ (RGB frames), event_imgs/ (event frames), groundtruth.txt, and absent_label.txt

The VisEvent dataset is also available as a webdataset on Hugging Face: krisspy39/visevent
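
Given the sequence layout listed above, the annotation files can be read with a few lines of standard-library Python. The sketch makes several assumptions that should be checked against the actual data: `groundtruth.txt` holds one `x,y,w,h` box per line, `absent_label.txt` holds one integer flag per line, and the value `1` marks an absent target (exposed as the `absent_value` parameter in case the convention is the reverse). The function names are illustrative.

```python
import re
from pathlib import Path


def read_boxes(path):
    """Parse one box per line; accepts comma, tab, or space separators."""
    boxes = []
    for line in Path(path).read_text().splitlines():
        if line.strip():
            boxes.append(tuple(float(v) for v in re.split(r"[,\s]+", line.strip())))
    return boxes


def read_absent_flags(path, absent_value=1):
    """Parse one integer per line; True where the value equals absent_value."""
    return [int(float(line)) == absent_value
            for line in Path(path).read_text().splitlines() if line.strip()]


def load_annotations(sequence_dir, absent_value=1):
    """Return (boxes, absent_flags) for one sequence directory."""
    seq = Path(sequence_dir)
    boxes = read_boxes(seq / "groundtruth.txt")
    absent = read_absent_flags(seq / "absent_label.txt", absent_value)
    if len(boxes) != len(absent):
        raise ValueError(f"{seq.name}: {len(boxes)} boxes vs {len(absent)} absent flags")
    return boxes, absent
```

Raising on a length mismatch is a design choice for this sketch: it surfaces misaligned annotations early instead of silently truncating them.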

Citation

If you use this model or the SDSTrack tracker, please cite:

@inproceedings{hou2024sdstrack,
  title={Self-Distillation Symmetric Adapter Learning for Multi-Modal Object Tracking},
  author={Hou, Xiaojun and others},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}

License

This checkpoint is provided for research purposes. Please refer to the original SDSTrack repository for licensing terms.

Acknowledgments

Original SDSTrack implementation by hoqolo
VisEvent dataset by wangxiao5791509
This checkpoint was reproduced as part of a university Pattern Recognition course project (Topic #65: Event-camera-based object tracking)
