---
license: apache-2.0
language:
- en
tags:
- computer-vision
- animal-pose-estimation
- multi-object-tracking
- occlusion-handling
- image-segmentation
- inpainting
- deep-learning
- video-analysis
arxiv: 2512.07712
---
# UnCageNet
**UnCageNet** is a computer vision framework for robust **animal tracking and pose estimation in caged environments**, where occlusions caused by cage bars significantly degrade the performance of existing methods.
This repository provides the **official implementation** of the paper:
> **UnCageNet: Tracking and Pose Estimation of Caged Animal**
> Sayak Dutta, Harish Katti, Shashikant Verma, Shanmuganathan Raman
> arXiv: https://arxiv.org/abs/2512.07712
🔗 **Code:** https://github.com/itz-sayak/UnCageNet
---
## 🔍 Method Overview
UnCageNet introduces a **three-stage preprocessing pipeline** that improves downstream tracking and pose estimation under structured occlusions:
1. **Cage Segmentation**
   - Gabor-enhanced ResNet-UNet
   - Orientation-aware filters (72 directional kernels)
   - Accurate detection of cage bar structures
2. **Cage Inpainting**
   - Content-aware reconstruction using **CRFill**
   - Removes structured occlusions while preserving animal appearance
3. **Downstream Evaluation**
   - Standard pose estimation and tracking models (e.g., STEP, ViTPose)
   - Applied on “uncaged” frames for fair performance comparison
This pipeline enables performance **comparable to uncaged environments**, despite heavy occlusions.
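
To make the "72 directional kernels" in the segmentation stage concrete, here is a minimal NumPy sketch of an orientation-aware Gabor filter bank. It is an illustration only, not the repository's implementation: the even spacing of the 72 orientations over 180 degrees, the kernel size, `sigma`, `wavelength`, and `gamma`, and the function names `gabor_kernel` and `gabor_bank` are all assumptions made for this example.

```python
import numpy as np


def gabor_kernel(theta, ksize=15, sigma=3.0, wavelength=6.0, gamma=0.5):
    """Real (cosine) Gabor kernel whose carrier varies along angle theta (radians).

    All default parameter values are illustrative assumptions.
    """
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    # Rotate the coordinate frame so x_r points along the carrier direction.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength)
    kernel = envelope * carrier
    # Subtract the mean so flat image regions produce no response.
    return kernel - kernel.mean()


def gabor_bank(n_orientations=72, **kwargs):
    """Stack of kernels at evenly spaced orientations in [0, pi) (assumed spacing)."""
    thetas = np.arange(n_orientations) * np.pi / n_orientations
    bank = np.stack([gabor_kernel(t, **kwargs) for t in thetas])
    return bank, thetas


# Probe the bank with a synthetic patch of vertical stripes (a stand-in for bars).
bank, thetas = gabor_bank()
half = bank.shape[-1] // 2
cols = np.arange(-half, half + 1)
patch = np.tile(np.cos(2.0 * np.pi * cols / 6.0), (bank.shape[-1], 1))
responses = np.abs((bank * patch).sum(axis=(1, 2)))
best_deg = np.degrees(thetas[np.argmax(responses)])
print(f"strongest response at {best_deg:.1f} degrees")
```

With 72 orientations spread over 180 degrees, neighbouring kernels differ by 2.5 degrees, which would let a segmentation network pick up bars at nearly any tilt. How the actual model combines such responses with the ResNet-UNet is described in the paper.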
---
## 📊 Experimental Highlights
- Significant improvement in:
  - Keypoint detection accuracy
  - Trajectory consistency
- Robust performance across:
  - Severe occlusion patterns
  - Long video sequences
- Plug-and-play compatibility with existing tracking and pose models
(Refer to the paper for full quantitative results.)
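
The plug-and-play claim amounts to running segmentation and inpainting as a preprocessing step in front of an unmodified downstream model. The sketch below shows that composition under loud assumptions: `uncage`, `with_uncaging`, and the three `toy_*` functions are hypothetical names invented here, the threshold "segmenter" and row-wise interpolation "inpainter" are trivial stand-ins for the Gabor-enhanced ResNet-UNet and CRFill, and the brightest-pixel "detector" stands in for a real pose or tracking model.

```python
import numpy as np


def uncage(frame, segment, inpaint):
    """Run the two preprocessing stages: bar mask, then fill the masked pixels."""
    mask = segment(frame)  # boolean H x W array, True where a bar is detected
    return inpaint(frame, mask)


def with_uncaging(model, segment, inpaint):
    """Wrap any frame -> prediction callable so it sees uncaged frames."""
    def wrapped(frame):
        return model(uncage(frame, segment, inpaint))
    return wrapped


# Toy stand-ins (assumptions for illustration, NOT the paper's models).
def toy_segment(frame):
    return frame > 0.9  # treat near-white pixels as cage bars


def toy_inpaint(frame, mask):
    out = frame.copy()
    cols = np.arange(frame.shape[1])
    for r in range(frame.shape[0]):
        known = ~mask[r]
        if known.any() and mask[r].any():
            # Fill bar pixels by linear interpolation along the row.
            out[r, mask[r]] = np.interp(cols[mask[r]], cols[known], frame[r, known])
    return out


def toy_model(frame):
    # "Detector": location (row, col) of the brightest pixel.
    return np.unravel_index(np.argmax(frame), frame.shape)


# Synthetic frame: dim background, one bright "animal" pixel, two vertical bars.
frame = np.full((8, 8), 0.2)
frame[3, 5] = 0.8
frame[:, [2, 6]] = 1.0

raw_hit = tuple(int(i) for i in toy_model(frame))
tracked = with_uncaging(toy_model, toy_segment, toy_inpaint)
clean_hit = tuple(int(i) for i in tracked(frame))
print(raw_hit, clean_hit)
```

In this toy frame the unwrapped detector locks onto a bar, while the wrapped one finds the bright pixel at row 3, column 5. The wrapper never touches the model's internals, which is the property the bullet above refers to.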
---
## 💡 Intended Use
UnCageNet is intended for:
- Animal behavior analysis
- Zoological and veterinary monitoring
- Laboratory animal studies
- Long-term tracking in constrained environments
---
## ⚠️ Limitations
- Assumes **structured occlusions** (e.g., cage bars)
- Performance may degrade for:
  - Highly deformable or unstructured occluders
  - Extremely low-resolution video
- Not trained for arbitrary object categories beyond animals
---
## 📄 Citation
If you use this work, please cite:
```bibtex
@article{dutta2025uncagenet,
  title   = {UnCageNet: Tracking and Pose Estimation of Caged Animal},
  author  = {Dutta, Sayak and Katti, Harish and Verma, Shashikant and Raman, Shanmuganathan},
  journal = {arXiv preprint arXiv:2512.07712},
  year    = {2025}
}
```