Spaces: updated requirements

Browse files:
- README.md +2 -87
- requirements.txt +2 -2

README.md CHANGED
@@ -1,97 +1,12 @@

Previous content:
<div align="center">
<a href="#"><img src='https://img.shields.io/badge/-Paper-00629B?style=flat&logo=ieee&logoColor=white' alt='arXiv'></a>
<a href='https://realistic3d-miun.github.io/PVSDNet/'><img src='https://img.shields.io/badge/Project_Page-Website-green?logo=googlechrome&logoColor=white' alt='Project Page'></a>
<a href='https://huggingface.co/spaces/3ZadeSSG/PVSDNet'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Demo_(Coming_Soon)-blue'></a>
</div>

# PVSDNet: Joint Depth Prediction and View Synthesis via Shared Latent Spaces in Real-Time

[Video](https://youtu.be/49s2UPvRA6I)

# 1. PVSDNet - Joint Depth and View

**Note:** Will be added soon.

## 1.A. Normal Inference (Recommended for minimal setup)

**Note:** Will be added soon.

## 1.B. Faster Inference (For best possible FPS)

**Note:** Will be added soon.

# 2. PVSDNet Depth-Only Model

This model is a variant of the original PVSDNet model that predicts only depth, not the target views. The model core is the same, except that the rendering network and the positional encoding are removed.

* Download the checkpoints from the following table and place them in the `checkpoint_onnx` directory.

| Model | Size | Checkpoint |
|-----------------|--------|----------------|
| PVSDNet-Depth-Only | 1.11 GB | [Download](https://huggingface.co/3ZadeSSG/PVSDNet-Depth-Only/resolve/main/depth_only_model.pth) |
| PVSDNet-Depth-Only-Lite | 279 MB | [Download](https://huggingface.co/3ZadeSSG/PVSDNet-Depth-Only/resolve/main/depth_only_lite_model.pth) |
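The two checkpoints above can also be fetched programmatically from the URLs in the table. A minimal sketch (the helper names are ours, not the repository's; downloading requires network access):

```python
from pathlib import Path
from urllib.request import urlretrieve

BASE_URL = "https://huggingface.co/3ZadeSSG/PVSDNet-Depth-Only/resolve/main"
CHECKPOINTS = ["depth_only_model.pth", "depth_only_lite_model.pth"]

def checkpoint_url(filename: str) -> str:
    # Build the direct-download URL for a checkpoint hosted on the Hub.
    return f"{BASE_URL}/{filename}"

def fetch_checkpoints(dest: str = "checkpoint_onnx") -> None:
    # Download every checkpoint into the directory the README expects.
    Path(dest).mkdir(parents=True, exist_ok=True)
    for name in CHECKPOINTS:
        target = Path(dest) / name
        if not target.exists():  # skip files that are already present
            urlretrieve(checkpoint_url(name), target)
```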

## 2.A. Normal Inference (Recommended for minimal setup)

## 2.B. Faster Inference (For best possible FPS)

You need to set up your own TRT engine for this.

* Make sure you modify `depth_only_parameters` to set the resolution you need. By default we have kept it at `384x384`.

* Run `export_onnx_depth.py` to convert the PyTorch models into ONNX:
```
python export_onnx_depth.py
```

* Create the TRT engine directory:
```
mkdir TRT_Engine
```
* Build the TRT engines from the generated ONNX files (by default located in `checkpoint_onnx`):
```
trtexec --onnx=./checkpoint_onnx/depth_only_model.onnx --saveEngine=./TRT_Engine/depth_only_model_fp16.engine --fp16
```
```
trtexec --onnx=./checkpoint_onnx/depth_only_lite_model.onnx --saveEngine=./TRT_Engine/depth_only_lite_model_fp16.engine --fp16
```
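Since the two invocations differ only in file names, they can be generated from a small helper instead of typed by hand. A sketch (the helper name is ours; actually running the command still requires TensorRT's `trtexec` binary on PATH):

```python
import shlex

def trtexec_cmd(onnx_path: str, engine_path: str, fp16: bool = True) -> list:
    # Assemble the trtexec invocation shown above as an argument list.
    cmd = ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    if fp16:
        cmd.append("--fp16")
    return cmd

for name in ("depth_only_model", "depth_only_lite_model"):
    cmd = trtexec_cmd(f"./checkpoint_onnx/{name}.onnx",
                      f"./TRT_Engine/{name}_fp16.engine")
    print(shlex.join(cmd))  # or pass to subprocess.run(cmd, check=True)
```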

## 2.C. Predicting on Depth Datasets using Multi-Resolution Fusion

We run the scripts inside the `depth_dataset_predictor` directory. There are two sample images for each dataset to test the code.

* First, build the TRT engine for each dataset, since we use multi-resolution fusion:
```
python depth_dataset_predictor/build_trt_<dataset_name>.py
```
* Then run the prediction script:
```
python depth_dataset_predictor/predict_<dataset_name>_TensorRT.py
```

| Dataset | Step 1 | Step 2 |
|---|---|---|
| ETH3D | `python depth_dataset_predictor/build_trt_ETH3D.py` | `python depth_dataset_predictor/predict_ETH3D_TensorRT.py` |
| Sintel | `python depth_dataset_predictor/build_trt_Sintel.py` | `python depth_dataset_predictor/predict_Sintel_TensorRT.py` |
| KITTI | `python depth_dataset_predictor/build_trt_KITTI.py` | `python depth_dataset_predictor/predict_KITTI_TensorRT.py` |
| DIODE | `python depth_dataset_predictor/build_trt_DIODE.py` | `python depth_dataset_predictor/predict_DIODE_TensorRT.py` |
| NYU | `python depth_dataset_predictor/build_trt_NYU.py` | `python depth_dataset_predictor/predict_NYU_TensorRT.py` |

## 2.D. Predicting on 1080p In-The-Wild Images/Videos using Multi-Resolution Fusion

Similar to the datasets above, we can use multi-resolution fusion to predict on 1080p in-the-wild images and videos.

* First, build the TRT engine:
```
python depth_in_wild_predictor/build_trt_1080p.py
```
* Then run the prediction script for images:
```
python depth_in_wild_predictor/predict_1080p_TensorRT.py
```
OR run the prediction script for videos:
```
python depth_in_wild_predictor/predict_video_1080p_TensorRT.py
```

#### Note
* For any other resolution, you can modify the resolutions in the scripts above to suit your needs. We have kept the default at 1080p for this example.
* We recommend 3-6 resolutions for best results, but you can use 1-2 smaller resolutions when working with low-resolution images/videos, since the receptive field of the network can handle that without any issues.
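The multi-resolution fusion scheme described above can be sketched independently of TensorRT: run the network at several input scales, resize each prediction back to a common resolution, and average. A minimal NumPy sketch with a stand-in `predict_depth` function (hypothetical, not the repository's code):

```python
import numpy as np

def predict_depth(image):
    # Stand-in for the TRT engine call: one depth value per pixel.
    return image.mean(axis=-1)

def resize_nearest(arr, h, w):
    # Nearest-neighbour resize over the first two axes; a real
    # pipeline would use bilinear interpolation.
    rows = np.arange(h) * arr.shape[0] // h
    cols = np.arange(w) * arr.shape[1] // w
    return arr[rows][:, cols]

def fuse_multi_resolution(image, scales=(0.5, 1.0, 1.5)):
    # Predict at several input resolutions and average the resized maps.
    h, w = image.shape[:2]
    fused = np.zeros((h, w), dtype=np.float64)
    for s in scales:
        sh, sw = max(1, round(h * s)), max(1, round(w * s))
        depth = predict_depth(resize_nearest(image, sh, sw))
        fused += resize_nearest(depth, h, w)
    return fused / len(scales)

depth = fuse_multi_resolution(np.random.rand(64, 64, 3))
```

The recommendation of 3-6 resolutions corresponds to the length of `scales` here; fewer, smaller scales suffice for low-resolution inputs.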

New content:

<div align="center">
<a href='https://realistic3d-miun.github.io/PVSDNet/'><img src='https://img.shields.io/badge/Project_Page-Website-green?logo=googlechrome&logoColor=white' alt='Project Page'></a>
</div>

# PVSDNet: Joint Depth Prediction and View Synthesis via Shared Latent Spaces in Real-Time

Head to our [Project Page](https://realistic3d-miun.github.io/PVSDNet/) for more details, supplementary materials, and the full code.

* This space only contains the PVSDNet-Depth-Only model.

requirements.txt CHANGED
@@ -1,6 +1,6 @@

```
numpy
-torch==2.9.1
-torchvision==0.24.1
+torch==2.9.1
+torchvision==0.24.1
pytorch-msssim==1.0.0
pytorchvideo==0.1.5
gradio==6.2.0
```