Add pipeline tag: image-to-3d

#1 opened by nielsr (HF Staff)

Files changed (1): README.md (+66, −65)

README.md
---
license: mit
pipeline_tag: image-to-3d
---

# AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views

[![Project Website](https://img.shields.io/badge/AnySplat-Website-4CAF50?logo=googlechrome&logoColor=white)](https://city-super.github.io/anysplat/)
[![Paper](https://img.shields.io/badge/arXiv-Paper-b31b1b?logo=arxiv&logoColor=b31b1b)](https://arxiv.org/pdf/2505.23716)
[![GitHub Repo](https://img.shields.io/badge/GitHub-Code-FFD700?logo=github)](https://github.com/OpenRobotLab/AnySplat)
[![Hugging Face Model](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue)](https://huggingface.co/lhjiang/anysplat)

## Quick Start

See the [GitHub repository](https://github.com/OpenRobotLab/AnySplat) for installation instructions.

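A typical setup might look like the following; the exact dependency list and any CUDA-specific steps are defined in the repository itself, so treat this as a sketch rather than the official procedure:

```shell
# Clone the repository and install its Python dependencies
git clone https://github.com/OpenRobotLab/AnySplat.git
cd AnySplat
pip install -r requirements.txt
```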
The model can then be used as follows:

```python
from pathlib import Path
import os
import sys

import torch

# Make the repository root importable (assumes this script lives one
# directory below the cloned AnySplat repository root)
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from src.misc.image_io import save_interpolated_video
from src.model.model.anysplat import AnySplat
from src.utils.image import process_image

# Load the model from Hugging Face
model = AnySplat.from_pretrained("lhjiang/anysplat")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
model.eval()
for param in model.parameters():
    param.requires_grad = False

# Load and preprocess example images (replace with your own image paths)
image_names = ["path/to/imageA.png", "path/to/imageB.png", "path/to/imageC.png"]
images = [process_image(image_name) for image_name in image_names]
images = torch.stack(images, dim=0).unsqueeze(0).to(device)  # [1, K, 3, 448, 448]
b, v, _, h, w = images.shape

# Run inference, mapping the images from [-1, 1] to [0, 1]
gaussians, pred_context_pose = model.inference((images + 1) * 0.5)

pred_all_extrinsic = pred_context_pose["extrinsic"]
pred_all_intrinsic = pred_context_pose["intrinsic"]

# Render and save an interpolated camera-path video
image_folder = Path("outputs")  # output directory; choose your own
save_interpolated_video(pred_all_extrinsic, pred_all_intrinsic, b, h, w, gaussians, image_folder, model.decoder)
```
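The pose tensors returned above are ordinary PyTorch tensors, so they can be detached and converted to NumPy for downstream tools. A small self-contained sketch of that pattern, using placeholder tensors since the exact shapes depend on the model (the `[B, V, 4, 4]` / `[B, V, 3, 3]` layouts below are assumptions for illustration):

```python
import torch

# Placeholder stand-ins for pred_context_pose["extrinsic"] / ["intrinsic"];
# in practice these come from model.inference(...) above.
pred_all_extrinsic = torch.eye(4).expand(1, 3, 4, 4)  # assumed [B, V, 4, 4] camera poses
pred_all_intrinsic = torch.eye(3).expand(1, 3, 3, 3)  # assumed [B, V, 3, 3] intrinsics

# Detach from the autograd graph, move to CPU, and convert to NumPy
extrinsic_np = pred_all_extrinsic.detach().cpu().numpy()
intrinsic_np = pred_all_intrinsic.detach().cpu().numpy()
print(extrinsic_np.shape)  # (1, 3, 4, 4)
```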

## Citation

```bibtex
@article{jiang2025anysplat,
  title={AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views},
  author={Jiang, Lihan and Mao, Yucheng and Xu, Linning and Lu, Tao and Ren, Kerui and Jin, Yichen and Xu, Xudong and Yu, Mulin and Pang, Jiangmiao and Zhao, Feng and others},
  journal={arXiv preprint arXiv:2505.23716},
  year={2025}
}
```

## License

The code and models are licensed under the [MIT License](LICENSE).