---
license: mit
language:
- en
tags:
- Visual Odometry
- Deep Learning
- Computer Vision
---
# CycleVO
<div align="center">





**Part of the BodySLAM Framework for Endoscopic Surgical Applications**
[Paper](https://arxiv.org/abs/2408.03078) | [GitHub](https://github.com/GuidoManni/BodySLAM)
</div>
---
## Overview
CycleVO is an unsupervised monocular pose estimation model designed to robustly estimate the relative camera pose between consecutive frames from endoscopic video. It addresses challenges such as low-texture surfaces and significant illumination variations common in surgical environments.
<div align="center">
<img src="CycleVO_architecture.png" alt="CycleVO Architecture Diagram" width="80%"/>
</div>
## ✨ Key Features
- **Unsupervised Learning via Cycle Consistency**: Inspired by CycleGAN and InfoGAN
- **⚡ Competitive Performance and Speed**: Low inference time compared to state-of-the-art methods
- **Easy Integration with SLAM Pipelines**: Provides ready-to-use motion matrices
## Model Details
CycleVO learns to estimate the relative motion (i.e., camera pose) between consecutive endoscopic frames. The model predicts a motion matrix M = [R, t<sub>unscaled</sub>, 1, 0] using a generator encoder architecture augmented with a pose estimation tail.
| **Developed by** | Guido Manni, Clemente Lauretti, Francesco Prata, Rocco Papalia, Loredana Zollo, Paolo Soda |
|:-----------------|:--------------------------------------------------------------------------------------------|
| **Model Type** | Unsupervised Monocular Visual Odometry / Relative Camera Pose Estimation |
| **License** | MIT |
| **Training** | From scratch using a large-scale internal endoscopic dataset |
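
The motion matrix described above can be chained frame by frame to obtain a camera trajectory. The sketch below is illustrative only: it assumes the standard 4×4 homogeneous layout (rotation R in the top-left block, t<sub>unscaled</sub> in the last column, bottom row `0 0 0 1`) and assumes each matrix maps coordinates of the current frame into the previous one. The function names (`make_motion_matrix`, `accumulate_trajectory`, `rot_z`) are hypothetical helpers, not part of the CycleVO code base, and the toy rotation and translation stand in for real model output.

```python
import math


def make_motion_matrix(R, t):
    """Assemble a 4x4 homogeneous matrix from a 3x3 rotation R and a 3-vector t."""
    return [
        [R[0][0], R[0][1], R[0][2], t[0]],
        [R[1][0], R[1][1], R[1][2], t[1]],
        [R[2][0], R[2][1], R[2][2], t[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(A, B):
    """Plain 4x4 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def accumulate_trajectory(relative_motions):
    """Chain frame-to-frame motions into poses expressed in the first camera frame.

    Assumption: each M maps coordinates of frame k into frame k-1, so poses
    compose by right-multiplication.
    """
    pose = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    poses = [pose]
    for M in relative_motions:
        pose = matmul4(pose, M)
        poses.append(pose)
    return poses


def rot_z(theta):
    """Rotation by theta radians about the z axis (toy input for the demo)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Toy relative motion: turn 90 degrees about z, then move one (unscaled) unit along x.
step = make_motion_matrix(rot_z(math.pi / 2), [1.0, 0.0, 0.0])
trajectory = accumulate_trajectory([step, step])
final_position = [row[3] for row in trajectory[-1][:3]]
print(final_position)
```

After two such steps the camera sits at approximately (1, 1, 0) in the first frame's coordinates. Because the translation is unscaled, the trajectory is only defined up to a global scale factor.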
## Getting Started
For complete documentation, please refer to the [GitHub repository](https://github.com/GuidoManni/BodySLAM).
## Use Cases
### ✅ Ideal Applications
- **Surgical Navigation**: Real-time guidance during minimally invasive procedures
- **3D Reconstruction**: Enhanced mapping of surgical scenes
- **Depth Perception**: Accurate pose estimates to complement monocular depth predictors
### ❌ Out-of-Scope Applications
- General-purpose visual odometry without proper domain adaptation
## Training Details
- **Dataset**: 300+ hours of endoscopic videos from 100 patients (gastroscopy and prostatectomy)
- **Preprocessing**: Frame extraction with 128×128 pixel center crop
- **Loss Function**: Combined adversarial, image cycle consistency, and pose cycle consistency losses
- **Optimizer**: Adam with standard learning rate schedules
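
The 128×128 center crop listed under preprocessing can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the crop is taken directly from the extracted frame without prior resizing, and it represents a frame as a nested list of rows so the example has no dependencies. `center_crop_box` and `center_crop` are hypothetical names.

```python
def center_crop_box(width, height, size=128):
    """Return (left, top, right, bottom) of a centered size x size window."""
    if width < size or height < size:
        raise ValueError(f"frame {width}x{height} is smaller than the {size}x{size} crop")
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size


def center_crop(frame, size=128):
    """Crop a frame given as a list of rows (height x width) around its center."""
    height = len(frame)
    width = len(frame[0])
    left, top, right, bottom = center_crop_box(width, height, size)
    return [row[left:right] for row in frame[top:bottom]]


# Toy 640x480 frame whose "pixels" store their own (row, column) coordinates.
frame = [[(y, x) for x in range(640)] for y in range(480)]
patch = center_crop(frame)
print(len(patch), len(patch[0]), patch[0][0])
```

The box uses the (left, upper, right, lower) ordering, so it can be passed unchanged to image libraries that accept crop boxes in that form.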
## 🛡️ Limitations & Recommendations
- **Inherent Scale Ambiguity**: Common in monocular systems
- **Domain Specificity**: Trained solely on endoscopic data
- **Clinical Deployment**: Requires thorough validation and clinical trials
**We recommend**:
- Validating the model thoroughly in your target environment
- Integrating additional sensors when possible
- Collaborating with clinical experts before surgical deployment
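
As an illustration of how an additional sensor could help with scale ambiguity, the sketch below fits a single global scale factor that best maps unscaled translations onto metric reference translations in the least-squares sense. It is an assumption-laden example, not a method described by the authors: the reference translations are imagined to come from some external source (for example tracked instrument motion), and `estimate_scale` is a hypothetical helper.

```python
def estimate_scale(unscaled, reference):
    """Least-squares scale s minimizing sum ||s * t_unscaled - t_reference||^2.

    Both arguments are sequences of 3-vectors for the same frame pairs.
    """
    if len(unscaled) != len(reference):
        raise ValueError("both sequences must cover the same frame pairs")
    numerator = sum(u * r for tu, tr in zip(unscaled, reference)
                    for u, r in zip(tu, tr))
    denominator = sum(u * u for tu in unscaled for u in tu)
    if denominator == 0:
        raise ValueError("unscaled translations are all zero; scale is undefined")
    return numerator / denominator


# Toy data: the reference motion is exactly 2.5 times the unscaled estimate.
unscaled = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
reference = [(2.5, 0.0, 0.0), (0.0, 5.0, 0.0)]
scale = estimate_scale(unscaled, reference)
print(scale)
```

A single global factor is the simplest choice. If the scale drifts over a long sequence, fitting it over a sliding window may work better, at the cost of noisier estimates.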
## Citation
```bibtex
@misc{manni2024bodyslamgeneralizedmonocularvisual,
title={BodySLAM: A Generalized Monocular Visual SLAM Framework for Surgical Applications},
author={G. Manni and C. Lauretti and F. Prata and R. Papalia and L. Zollo and P. Soda},
year={2024},
eprint={2408.03078},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2408.03078}
}
```
## Glossary
- **Cycle Consistency Loss**: Enforces agreement between original and reconstructed inputs after transformations
- **Motion Matrix (M)**: Composed of rotation (R) and unscaled translation vector (t<sub>unscaled</sub>)
- **ATE/RTE/RRE**: Absolute Trajectory Error, Relative Trajectory Error, Relative Rotation Error
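
Two of the terms above can be made concrete with a small sketch. The first function measures how far a forward motion composed with a backward motion deviates from the identity, which is one plausible way to express pose cycle consistency. The second computes the angle between an estimated and a reference rotation, a common basis for rotation error metrics. The exact loss and metric definitions used for CycleVO are not given in this card, so the formulas, the 4×4 homogeneous layout, and all function names here are assumptions.

```python
import math


def matmul(A, B):
    """Product of two square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose3(R):
    return [[R[j][i] for j in range(3)] for i in range(3)]


def invert_rigid(M):
    """Invert a 4x4 rigid transform: [R t; 0 1]^-1 = [R^T, -R^T t; 0 1]."""
    Rt = transpose3([row[:3] for row in M[:3]])
    t = [M[i][3] for i in range(3)]
    new_t = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [new_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def pose_cycle_residual(M_ab, M_ba):
    """Mean absolute deviation of M_ab @ M_ba from the 4x4 identity."""
    P = matmul(M_ab, M_ba)
    return sum(abs(P[i][j] - (1.0 if i == j else 0.0))
               for i in range(4) for j in range(4)) / 16.0


def rotation_error_deg(R_est, R_ref):
    """Angle in degrees of the rotation taking R_est to R_ref."""
    R_err = matmul(transpose3(R_est), R_ref)
    trace = R_err[0][0] + R_err[1][1] + R_err[2][2]
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp rounding noise
    return math.degrees(math.acos(cos_angle))


def rot_z(deg):
    """Rotation about the z axis (toy input for the demo)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Toy forward motion and its exact inverse as the backward motion.
R = rot_z(30.0)
M_ab = [R[0] + [1.0], R[1] + [2.0], R[2] + [3.0], [0.0, 0.0, 0.0, 1.0]]
M_ba = invert_rigid(M_ab)
print(pose_cycle_residual(M_ab, M_ba))
print(rotation_error_deg(rot_z(30.0), rot_z(40.0)))
```

With an exact inverse the cycle residual is zero up to floating-point rounding, and the two toy rotations differ by about 10 degrees.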
## Contact
For questions or further information, please contact:
**Guido Manni** - [guido.manni@unicampus.it](mailto:guido.manni@unicampus.it)
---