	---
	language:
	- en
	pipeline_tag: depth-estimation
	---
	# [ECE1508 Final Project] Joint Learning of Exposure Patterns and Stereo Depth from Coded Snapshots

	![Overview](https://github.com/kungchuking/E2E_SCSI/raw/master/images/overview.gif)

	This project introduces a novel end-to-end learning approach that jointly addresses two traditionally separate computer vision problems: Snapshot Compressed Imaging (SCI) decoding and dynamic stereo depth estimation. The framework is adapted from the [DynamicStereo](https://github.com/facebookresearch/dynamic_stereo) repository and was trained on the [DynamicReplica](https://github.com/facebookresearch/dynamic_stereo) dataset.
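
For readers new to SCI, the sketch below illustrates the standard coded-exposure forward model: several video frames are multiplied element-wise by per-frame exposure masks and summed into a single snapshot. It is a toy illustration in plain Python, not code from this repository. The function name `coded_snapshot` and the hand-written binary masks are assumptions made for the example; in this project the exposure patterns are learned jointly with the depth network.

```python
def coded_snapshot(frames, masks):
    """Collapse T frames into one snapshot: y[i][j] = sum_t masks[t][i][j] * frames[t][i][j]."""
    if len(frames) != len(masks):
        raise ValueError("need one exposure mask per frame")
    height, width = len(frames[0]), len(frames[0][0])
    snapshot = [[0.0] * width for _ in range(height)]
    for frame, mask in zip(frames, masks):
        for i in range(height):
            for j in range(width):
                # A pixel contributes only while its exposure mask is open (non-zero).
                snapshot[i][j] += mask[i][j] * frame[i][j]
    return snapshot


# Two 2x2 frames with complementary binary exposure patterns (toy values).
frames = [[[1.0, 2.0], [3.0, 4.0]],
          [[5.0, 6.0], [7.0, 8.0]]]
masks = [[[1, 0], [0, 1]],
         [[0, 1], [1, 0]]]
snapshot = coded_snapshot(frames, masks)
print(snapshot)  # [[1.0, 6.0], [7.0, 4.0]]
```

Each snapshot pixel here comes from exactly one of the two frames, so decoding has to recover two frames from a single measurement. That is the ill-posed step the network is trained to solve.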

	## Dataset
	The [DynamicReplica](https://github.com/facebookresearch/dynamic_stereo) dataset consists of 145,200 stereo frames (524 videos) with humans and animals in motion.

	### Download the Dynamic Replica dataset
	Because the full dataset is very large, we provide a `links_lite.json` file that downloads only a small portion of it for quick testing:

	```
	python ./scripts/download_dynamic_replica.py --link_list_file links_lite.json --download_folder ./dynamic_replica_data --download_splits test train valid real
	```

	To download the full dataset, please visit [the original site](https://github.com/facebookresearch/dynamic_stereo) created by Meta.

	## Installation
	To set up and run the project, please follow these steps.

	### Set up the root for all source files:
	```
	git clone https://github.com/kungchuking/E2E_SCSI.git
	cd E2E_SCSI
	```
	### Create a conda env:
	```
	conda create -n dynamicstereo python=3.8
	conda activate dynamicstereo
	```
	### Install requirements
	```
	pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
	pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"
	pip install -r requirements.txt
	```
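
After installation, a quick sanity check of the pinned packages can save a failed training run later. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8) to compare installed versions against the pins in the `pip install` command above. The helper names `installed_version` and `check_pins` are hypothetical and not part of this repository.

```python
from importlib import metadata

# Versions pinned by the pip command above; local tags such as "+cu113" are ignored.
PINNED = {"torch": "1.12.1", "torchvision": "0.13.1", "torchaudio": "0.12.1"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pinned=PINNED):
    """Map each package to True when its installed version matches the pin."""
    report = {}
    for package, wanted in pinned.items():
        found = installed_version(package)
        # Strip a local version tag ("1.12.1+cu113" -> "1.12.1") before comparing.
        report[package] = found is not None and found.split("+")[0] == wanted
    return report


if __name__ == "__main__":
    for package, ok in check_pins().items():
        print(f"{package}: {'ok' if ok else 'missing or different version'}")
```

Run it inside the activated `dynamicstereo` environment. A mismatch does not necessarily mean the code will fail, only that the environment differs from the one described here.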

	## Evaluation
	To download the pre-trained model weights (checkpoints), please follow the instructions below.

	### Command Line Download
	You can use the following commands to create the required directory and download the primary checkpoint directly from the Hugging Face repository:
	```
	mkdir dynamicstereo_sf_dr
	wget -O dynamicstereo_sf_dr/model_dynamic-stereo_050895.pth "https://huggingface.co/kungchuking/E2E_SCSI/resolve/main/dynamicstereo_sf_dr/model_dynamic-stereo_050895.pth"
	```
	### Manual Download
	Alternatively, you can download the checkpoint manually from [this link](https://huggingface.co/kungchuking/E2E_SCSI/resolve/main/dynamicstereo_sf_dr/model_dynamic-stereo_050895.pth). Make sure the downloaded file is placed in the required directory: `./dynamicstereo_sf_dr/`.
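
Whichever download route is used, it may be worth confirming that the checkpoint landed in the expected place before running evaluation, since an interrupted download can leave an empty file behind. The following is a minimal sketch that assumes it is run from the repository root. The helper name `checkpoint_ready` is hypothetical, and the check only covers existence and a non-zero size, not file integrity.

```python
from pathlib import Path

# Path taken from the download commands above, relative to the repository root.
CHECKPOINT = Path("dynamicstereo_sf_dr") / "model_dynamic-stereo_050895.pth"


def checkpoint_ready(path=CHECKPOINT):
    """Return True if the checkpoint exists at the expected path and is non-empty."""
    path = Path(path)
    return path.is_file() and path.stat().st_size > 0


print(f"{CHECKPOINT}: {'found' if checkpoint_ready() else 'missing or empty'}")
```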

	### Evaluation Notebook
	For detailed instructions on how to evaluate the model, please refer to the dedicated [evaluation notebook](https://huggingface.co/kungchuking/E2E_SCSI/blob/main/notebooks/evaluate.ipynb).

	### Evaluation and Validation
	To run the final evaluation on the DynamicReplica test set, navigate to the `evaluation` directory and run the following Python script:
	```
	cd evaluation
	python evaluate.py
	```

	## Training
	### Hardware and Memory Requirements
	Training the model requires a GPU with at least 50 GB of memory.
	* Memory Adjustment: If your GPU memory is limited, you may decrease the `image_size` and/or the `sample_len` parameters.
	* Resolution Note: The chosen `image_size` of 480x640 corresponds to the native resolution of the custom-designed coded-exposure camera used for our research.
	* Compression Impact: Reducing `sample_len` also lowers the effective compression ratio of the Snapshot Compressed Imaging (SCI) process.

	Before starting training, you must download the Dynamic Replica dataset (see the Dataset section above).
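
To make the memory and compression trade-off above concrete, the sketch below computes how the number of pixels per training clip and the compression ratio change with `sample_len`. It assumes that one coded snapshot integrates `sample_len` frames, so the effective compression ratio is `sample_len`:1. The `sample_len` values of 8 and 4 are hypothetical examples, not the project's defaults, and `clip_budget` is an illustrative helper rather than a function from this repository.

```python
def clip_budget(image_size=(480, 640), sample_len=8):
    """Return (pixels per clip, assumed compression ratio) for one camera view.

    Assumption: one coded snapshot integrates `sample_len` frames, so the
    effective compression ratio is sample_len : 1.
    """
    height, width = image_size
    pixels_per_clip = height * width * sample_len
    compression_ratio = sample_len  # frames folded into a single snapshot
    return pixels_per_clip, compression_ratio


full = clip_budget((480, 640), sample_len=8)
reduced = clip_budget((480, 640), sample_len=4)
print(full)     # (2457600, 8)
reduced_pixels, reduced_ratio = reduced
print(reduced)  # (1228800, 4)
```

Halving `sample_len` halves the pixels per clip, which is a rough proxy for activation memory, but it also halves the assumed compression ratio. Reducing `image_size` instead keeps the ratio but moves away from the camera's native resolution.
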
	### Execution
	If you are running on a Linux machine, use the provided shell script for training:
	```
	./train.csh
	```
	On other operating systems, open the `./train.csh` file and run the command it contains manually.

	## License
	Portions of the project are available under separate license terms: [DynamicStereo](https://github.com/facebookresearch/dynamic_stereo) is licensed under CC-BY-NC, [RAFT-Stereo](https://github.com/princeton-vl/RAFT-Stereo) is licensed under the MIT license, [LoFTR](https://github.com/zju3dv/LoFTR) and [CREStereo](https://github.com/megvii-research/CREStereo) are licensed under the Apache 2.0 license.