---
license: cc-by-4.0
library_name: pytorch
tags:
- computer-vision
- object-tracking
- spiking-neural-networks
- visual-streaming-perception
- energy-efficient
- cvpr-2025
pipeline_tag: object-detection
---

# ViStream: Law-of-Charge-Conservation Inspired Spiking Neural Network for Visual Streaming Perception

**ViStream** is an energy-efficient framework for Visual Streaming Perception (VSP) that leverages Spiking Neural Networks (SNNs) with a Law of Charge Conservation (LoCC) property.

## Model Details

### Model Description

- **Developed by:** Kang You, Ziling Wei, Jing Yan, Boning Zhang, Qinghai Guo, Yaoyu Zhang, Zhezhi He
- **Model type:** Spiking Neural Network for Visual Streaming Perception
- **Framework:** PyTorch
- **License:** CC-BY-4.0
- **Paper:** [CVPR 2025](https://openaccess.thecvf.com/content/CVPR2025/papers/You_VISTREAM_Improving_Computation_Efficiency_of_Visual_Streaming_Perception_via_Law-of-Charge-Conservation_CVPR_2025_paper.pdf)
- **Repository:** [GitHub](https://github.com/Intelligent-Computing-Research-Group/ViStream)

### Model Architecture

ViStream introduces two key innovations:

1. **Law of Charge Conservation (LoCC)** property in ST-BIF neurons
2. **Differential Encoding (DiffEncode)** scheme for temporal optimization

The framework substantially reduces computation while maintaining accuracy equivalent to its ANN counterpart.
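The intuition behind differential encoding can be shown with a toy example: for a linear layer, feeding only frame-to-frame *differences* and accumulating the outputs over time reproduces the dense per-frame outputs, so sparse change-driven computation loses nothing. This is an illustrative sketch only; it uses a plain `torch.nn.Linear`, not the actual ST-BIF neuron or DiffEncode implementation:

```python
import torch

torch.manual_seed(0)

# A linear stand-in for one network layer (no bias, so it is purely linear).
layer = torch.nn.Linear(8, 4, bias=False)

frames = torch.randn(5, 8)  # a short stream of 5 "frames"

# Dense baseline: process every frame in full.
dense_out = layer(frames)

# Differential encoding: send the first frame, then only the changes,
# and accumulate the layer outputs over time.
diffs = torch.cat([frames[:1], frames[1:] - frames[:-1]], dim=0)
acc = torch.cumsum(layer(diffs), dim=0)

# Because the layer is linear, accumulated differential outputs match
# the dense outputs — the charge-conservation intuition.
print(torch.allclose(dense_out, acc, atol=1e-5))  # True
```

When consecutive frames are similar, the differences are near zero, which is where the computational savings come from in a spiking implementation.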
## Uses

### Direct Use

ViStream can be used directly for:

- **Multiple Object Tracking (MOT)**
- **Single Object Tracking (SOT)**
- **Video Object Segmentation (VOS)**
- **Multiple Object Tracking and Segmentation (MOTS)**
- **Pose Tracking**

### Downstream Use

The model can be fine-tuned for visual streaming perception tasks in:

- Autonomous driving
- UAV navigation
- AR/VR applications
- Real-time surveillance

## Bias, Risks, and Limitations

### Limitations

- Requires hardware-specific optimization to realize the full energy savings
- Performance may vary with frame rate
- Limited to visual perception tasks

### Recommendations

- Test thoroughly on the target hardware before deployment
- Consider the computational constraints of edge devices
- Validate performance on domain-specific datasets

## How to Get Started with the Model

```python
from huggingface_hub import hf_hub_download
import torch

# Download the checkpoint
checkpoint_path = hf_hub_download(
    repo_id="AndyBlocker/ViStream",
    filename="checkpoint-90.pth"
)

# Load the checkpoint (instantiating the model requires the ViStream implementation)
checkpoint = torch.load(checkpoint_path, map_location='cpu')
```

For complete usage examples, see the [GitHub repository](https://github.com/Intelligent-Computing-Research-Group/ViStream).

## Training Details

### Training Data

The model was trained on multiple datasets covering visual streaming perception tasks, including object tracking, video object segmentation, and pose tracking.

### Training Procedure

- Framework: PyTorch
- Optimization: energy-efficient SNN training with the Law of Charge Conservation
- Architecture: ResNet-based backbone with spike quantization layers

## Evaluation

The model achieves competitive accuracy across multiple visual streaming perception tasks while delivering significant energy-efficiency gains over traditional ANN-based approaches.
Detailed evaluation results are available in the [CVPR 2025 paper](https://openaccess.thecvf.com/content/CVPR2025/papers/You_VISTREAM_Improving_Computation_Efficiency_of_Visual_Streaming_Perception_via_Law-of-Charge-Conservation_CVPR_2025_paper.pdf).

## Model Card Authors

Kang You, Ziling Wei, Jing Yan, Boning Zhang, Qinghai Guo, Yaoyu Zhang, Zhezhi He

## Model Card Contact

For questions about this model, please open an issue in the [GitHub repository](https://github.com/Intelligent-Computing-Research-Group/ViStream).

## Citation

```bibtex
@inproceedings{you2025vistream,
  title={VISTREAM: Improving Computation Efficiency of Visual Streaming Perception via Law-of-Charge-Conservation Inspired Spiking Neural Network},
  author={You, Kang and Wei, Ziling and Yan, Jing and Zhang, Boning and Guo, Qinghai and Zhang, Yaoyu and He, Zhezhi},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={8796--8805},
  year={2025}
}
```