Commit 23c8f66 (parent: f0ba1d7): Update README

README.md (changed)
# [ECE1508 Final Project] Joint Learning of Exposure Patterns and Stereo Depth from Coded Snapshots



This project introduces a novel, end-to-end learning approach that jointly addresses two traditionally separate computer vision challenges: Snapshot Compressed Imaging (SCI) decoding and dynamic stereo depth estimation. The framework is an adaptation of the [DynamicStereo](https://github.com/facebookresearch/dynamic_stereo) repository and was trained using the [DynamicReplica](https://github.com/facebookresearch/dynamic_stereo) dataset.
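The coded-snapshot idea behind SCI can be sketched in a few lines. This is an illustrative assumption, not the repository's actual code: each snapshot is modeled as a per-pixel, exposure-pattern-gated sum of several video frames, and the end-to-end network learns both the exposure patterns and how to recover stereo depth from such measurements.

```python
import numpy as np

# Illustrative sketch (not repo code): a coded snapshot B is the sum of
# T consecutive frames X_t, each gated by a binary exposure pattern C_t.
rng = np.random.default_rng(0)
T, H, W = 5, 480, 640                    # frames per snapshot; 480x640 sensor
X = rng.random((T, H, W))                # incoming video frames (one eye)
C = rng.integers(0, 2, size=(T, H, W))   # learnable per-pixel binary exposure patterns
B = (C * X).sum(axis=0)                  # single coded measurement per pixel

# T frames are compressed into one measurement, i.e. a T:1 compression ratio.
assert B.shape == (H, W)
```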
## Dataset

The [DynamicReplica](https://github.com/facebookresearch/dynamic_stereo) dataset consists of 145,200 *stereo* frames (524 videos) featuring humans and animals in motion.
### Download the Dynamic Replica dataset

Due to the enormous size of the original dataset, we created the `links_lite.json` file to enable quick testing by downloading just a small portion of the dataset (pass it to `./scripts/download_dynamic_replica.py` via `--link_list_file links_lite.json`).

To download the full dataset, please visit [the original site](https://github.com/facebookresearch/dynamic_stereo) created by Meta.
## Installation

To set up and run the project, please follow these steps.

### Set up the root for all source files:
```
git clone https://github.com/kungchuking/E2E_SCSI.git
cd E2E_SCSI
```
### Create a conda env:
```
conda activate dynamicstereo
```
### Install requirements
```
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"
pip install -r requirements.txt
```
## Evaluation

To download the pre-trained model weights (checkpoints), please follow the instructions below.
### Command Line Download

You can use the following commands to create the required directory and download the primary checkpoint directly from the Hugging Face repository:
```
mkdir dynamicstereo_sf_dr
wget -O dynamicstereo_sf_dr/model_dynamic-stereo_030537.pth "https://huggingface.co/kungchuking/E2E_SCSI/resolve/main/dynamicstereo_sf_dr/model_dynamic-stereo_030537.pth"
```
### Manual Download

Alternatively, you can download the checkpoint manually by clicking the [link](https://huggingface.co/kungchuking/E2E_SCSI/resolve/main/dynamicstereo_sf_dr/model_dynamic-stereo_030537.pth). Ensure the downloaded file is placed in the required path: `./dynamicstereo_sf_dr/`.
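Whichever route you take, a quick sanity check can confirm the checkpoint is in place before evaluation. This is a sketch, not part of the repository: it assumes PyTorch from the installation step and only the path used in the commands above.

```python
from pathlib import Path

# Path used by the download commands above.
CKPT = Path("dynamicstereo_sf_dr/model_dynamic-stereo_030537.pth")

def checkpoint_ready(path: Path, min_bytes: int = 1_000_000) -> bool:
    """True if the file exists and is plausibly a full download (not an HTML error page)."""
    return path.is_file() and path.stat().st_size >= min_bytes

if checkpoint_ready(CKPT):
    import torch  # load on CPU so the check works without a GPU
    state = torch.load(CKPT, map_location="cpu")
    print(f"checkpoint OK: {len(state)} top-level entries")
else:
    print("checkpoint missing or incomplete; rerun the wget command above")
```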
### Evaluation Notebook

For detailed instructions on how to evaluate the model, please refer to the dedicated [evaluation notebook](https://github.com/kungchuking/E2E_SCSI/blob/master/notebooks/evaluate.ipynb).
## Training

### Hardware and Memory Requirements

Training the model requires a GPU with at least 50 GB of memory.

* **Memory Adjustment**: If your GPU memory is limited, you may decrease the `image_size` and/or `sample_len` parameters.
* **Resolution Note**: The chosen `image_size` of 480x640 corresponds to the native resolution of the custom-designed coded-exposure camera used for our research.
* **Compression Impact**: Reducing `sample_len` will inherently decrease the effective compression ratio of the Snapshot Compressed Imaging (SCI) process.

Before starting training, make sure you have downloaded the Dynamic Replica dataset.
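These trade-offs can be made concrete with a rough back-of-the-envelope helper. The proportionality and the default values below are illustrative assumptions, not a measured profile of this model: per-sample memory is taken to scale with `H * W * sample_len`, and the SCI compression ratio is simply `sample_len` frames per coded snapshot.

```python
def rel_memory(image_size=(480, 640), sample_len=5, base=(480, 640, 5)):
    """Per-sample memory relative to an assumed 480x640, sample_len=5 baseline."""
    h, w, t = base
    return (image_size[0] * image_size[1] * sample_len) / (h * w * t)

def compression_ratio(sample_len: int) -> int:
    """Frames captured per coded measurement (sample_len : 1)."""
    return sample_len

# Shrinking the resolution and sample length cuts memory, but a shorter
# sample also lowers the effective compression ratio.
assert rel_memory((240, 320), 3) < rel_memory()
assert compression_ratio(3) < compression_ratio(5)
```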
### Execution

If you are running on a Linux machine, use the provided shell script to start training:
```
./train.csh
```
On other operating systems, open `./train.csh` and copy and run the training command manually.
## License

Portions of the project are available under separate license terms: [DynamicStereo](https://github.com/facebookresearch/dynamic_stereo) is licensed under CC-BY-NC; [RAFT-Stereo](https://github.com/princeton-vl/RAFT-Stereo) is licensed under the MIT license; and [LoFTR](https://github.com/zju3dv/LoFTR) and [CREStereo](https://github.com/megvii-research/CREStereo) are licensed under the Apache 2.0 license.