---
language:
- en
pipeline_tag: depth-estimation
---
# [ECE1508 Final Project] Joint Learning of Exposure Patterns and Stereo Depth from Coded Snapshots

This project introduces a novel end-to-end learning approach that jointly addresses two traditionally separate computer vision problems: Snapshot Compressed Imaging (SCI) decoding and dynamic stereo depth estimation. The framework adapts the [DynamicStereo](https://github.com/facebookresearch/dynamic_stereo) repository and was trained on the [DynamicReplica](https://github.com/facebookresearch/dynamic_stereo) dataset.
## Dataset
The [DynamicReplica](https://github.com/facebookresearch/dynamic_stereo) dataset consists of 145,200 *stereo* frames (524 videos) with humans and animals in motion.
### Download the Dynamic Replica dataset
Due to the enormous size of the original dataset, we created the `links_lite.json` file to enable quick testing by downloading just a small portion of the dataset.
```
python ./scripts/download_dynamic_replica.py --link_list_file links_lite.json --download_folder ./dynamic_replica_data --download_splits test train valid real
```
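After the download finishes, a quick sanity check can confirm that files actually landed in the target folder. The sketch below only counts files and bytes under `./dynamic_replica_data`; the helper name `summarize_download` is hypothetical and makes no assumption about how the dataset is laid out inside that folder.

```python
from pathlib import Path


def summarize_download(folder):
    """Return (file_count, total_bytes) for all files under `folder`.

    Returns (0, 0) if the folder does not exist yet.
    """
    root = Path(folder)
    if not root.is_dir():
        return 0, 0
    files = [p for p in root.rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


# Example usage (the folder name matches --download_folder above):
#   count, size = summarize_download("./dynamic_replica_data")
#   print(f"{count} files, {size / 1e9:.2f} GB")
```

A result of `(0, 0)` suggests the download did not run or wrote to a different folder.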
To download the full dataset, please visit [the original site](https://github.com/facebookresearch/dynamic_stereo) created by Meta.
## Installation
To set up and run the project, please follow these steps.
### Set up the root for all source files:
```
git clone https://github.com/kungchuking/E2E_SCSI.git
cd E2E_SCSI
```
### Create a conda env:
```
conda create -n dynamicstereo python=3.8
conda activate dynamicstereo
```
### Install requirements
```
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"
pip install -r requirements.txt
```
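To confirm that the pinned packages were installed into the active environment, a minimal check along these lines may help. It relies only on the standard-library `importlib.metadata` (available from Python 3.8); the helper names `installed_version` and `report` are hypothetical, and the package list is an assumption based on the install commands above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages=("torch", "torchvision", "torchaudio", "pytorch3d")):
    """Map each distribution name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in packages}


# Example usage:
#   for name, version in report().items():
#       print(f"{name}: {version or 'NOT INSTALLED'}")
```

If the install succeeded, `torch` should report a version starting with `1.12.1`.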
## Evaluation
To download the pre-trained model weights (checkpoints), please follow the instructions below.
### Command Line Download
You can use the following commands to create the required directory and download the primary checkpoint directly from the Hugging Face repository:
```
mkdir dynamicstereo_sf_dr
wget -O dynamicstereo_sf_dr/model_dynamic-stereo_050895.pth "https://huggingface.co/kungchuking/E2E_SCSI/resolve/main/dynamicstereo_sf_dr/model_dynamic-stereo_050895.pth"
```
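On systems without `wget`, the same download can be sketched in Python with the standard library. The function name `fetch_checkpoint` is hypothetical; the URL and target folder are the ones given above, and the download is skipped when a non-empty file is already in place.

```python
from pathlib import Path
from urllib.request import urlretrieve

CHECKPOINT_URL = (
    "https://huggingface.co/kungchuking/E2E_SCSI/resolve/main/"
    "dynamicstereo_sf_dr/model_dynamic-stereo_050895.pth"
)


def fetch_checkpoint(url=CHECKPOINT_URL, folder="dynamicstereo_sf_dr"):
    """Download the checkpoint into `folder` unless it is already there."""
    dest = Path(folder) / url.rsplit("/", 1)[-1]
    if dest.is_file() and dest.stat().st_size > 0:
        return dest  # already downloaded, nothing to do
    dest.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(url, str(dest))
    return dest


# Example usage:
#   path = fetch_checkpoint()
#   print(f"Checkpoint available at {path}")
```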
### Manual Download
Alternatively, you can download the checkpoint manually from [this link](https://huggingface.co/kungchuking/E2E_SCSI/resolve/main/dynamicstereo_sf_dr/model_dynamic-stereo_050895.pth). Ensure the downloaded file is placed in the required path: `./dynamicstereo_sf_dr/`.
### Evaluation Notebook
For detailed instructions on how to evaluate the model, please refer to the dedicated [evaluation notebook](https://huggingface.co/kungchuking/E2E_SCSI/blob/main/notebooks/evaluate.ipynb).
### Evaluation and Validation
To run the final evaluation on the DynamicReplica test set, navigate to the `evaluation` directory and run the following Python script:
```
cd evaluation
python evaluate.py
```
## Training
### Hardware and Memory Requirements
Training the model requires a GPU with at least 50 GB of memory.
* **Memory Adjustment**: If your GPU memory is limited, you may decrease the `image_size` and/or the `sample_len` parameters.
* **Resolution Note**: The chosen `image_size` of 480x640 corresponds to the native resolution of the custom-designed coded-exposure camera used for our research.
* **Compression Impact**: Reducing `sample_len` inherently decreases the effective compression ratio of the Snapshot Compressed Imaging (SCI) process.

Before starting training, you must download the Dynamic Replica dataset.
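
The trade-off described in the notes above can be illustrated with a rough back-of-the-envelope sketch. It is not taken from the training code: the helper names are hypothetical, and it assumes float32 RGB stereo input and that all `sample_len` frames of a clip are coded into a single snapshot per view. It only shows how input size and compression ratio scale, not the actual GPU memory used by the model, which is dominated by activations.

```python
def input_bytes(sample_len, image_size=(480, 640), channels=3, views=2, bytes_per_value=4):
    """Bytes occupied by one raw input clip (assumed float32, RGB, stereo pair)."""
    height, width = image_size
    return sample_len * views * channels * height * width * bytes_per_value


def compression_ratio(sample_len, snapshots=1):
    """Frames coded per snapshot (assumes the whole clip maps to `snapshots` coded images)."""
    return sample_len / snapshots


# Example: compare a few clip lengths at the 480x640 resolution mentioned above
# (the clip lengths are illustrative values only).
for frames in (2, 4, 8):
    megabytes = input_bytes(frames) / 1e6
    print(f"sample_len={frames}: {megabytes:.1f} MB input, {compression_ratio(frames):.0f}:1 compression")
```

Under these assumptions, halving `sample_len` halves both the input clip size and the compression ratio, while lowering `image_size` reduces memory without changing the ratio.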
### Execution
If you are running on a Linux machine, use the provided shell script for training:
```
./train.csh
```
For other operating systems, open the `./train.csh` file and run the command it contains manually.
## License
Portions of the project are available under separate license terms: [DynamicStereo](https://github.com/facebookresearch/dynamic_stereo) is licensed under CC-BY-NC, [RAFT-Stereo](https://github.com/princeton-vl/RAFT-Stereo) is licensed under the MIT license, and [LoFTR](https://github.com/zju3dv/LoFTR) and [CREStereo](https://github.com/megvii-research/CREStereo) are licensed under the Apache 2.0 license.