README.md, commit 6e7b09c (verified) by david-hall-csiro, parent 34015ed: Updated README
<div align="center">
<h1>WildCross: A Cross-Modal Large Scale Benchmark for Place Recognition and Metric Depth Estimation in Natural Environments</h1>

[**Joshua Knights**](https://scholar.google.com/citations?user=RxbGr2EAAAAJ&hl=en)<sup>1,2</sup> · **Joseph Reid**<sup>1</sup> · [**Kaushik Roy**](https://bit0123.github.io/)<sup>1</sup>
<br>
[**David Hall**](https://scholar.google.com/citations?user=dosODoQAAAAJ&hl=en)<sup>1</sup> · [**Mark Cox**](https://scholar.google.com/citations?user=Bk3UD4EAAAAJ&hl=en)<sup>1</sup> · [**Peyman Moghadam**](https://scholar.google.com.au/citations?user=QAVcuWUAAAAJ&hl=en)<sup>1,2</sup>

<sup>1</sup>DATA61, CSIRO&emsp;&emsp;&emsp;<sup>2</sup>Queensland University of Technology
<br>
<a href='https://doi.org/10.25919/5fmy-yg37'><img src='https://img.shields.io/badge/Dataset_Download-WildCross-blue'></a>
</div>

This repository contains pre-trained checkpoints for a variety of tasks on the **WildCross benchmark**.

![teaser](./teaser.png)

## WildCross Overview
We introduce **WildCross**, a large-scale benchmark for cross-modal place recognition and metric depth estimation in natural environments. The dataset comprises over 476K sequential RGB frames with semi-dense depth and surface normal annotations, each aligned with accurate 6DoF poses and synchronized dense lidar submaps.

We conduct comprehensive experiments on visual, lidar, and cross-modal place recognition, as well as metric depth estimation, demonstrating the value of WildCross as a challenging benchmark for multi-modal robotic perception tasks.

This Hugging Face repository contains the model weights needed to replicate all experiments outlined in the [**original paper**](https://arxiv.org/pdf/2603.01475).

## Data Download Instructions
Our dataset can be downloaded through the [**CSIRO Data Access Portal**](https://doi.org/10.25919/5fmy-yg37). Detailed download instructions are provided in the README file on the data access portal page.

## Training and Benchmarking
Here we provide pre-trained checkpoints for a variety of tasks on WildCross. Instructions for using these checkpoints for training or evaluation can be found in the [WildCross GitHub repository](https://github.com/csiro-robotics/WildCross).
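Since every checkpoint lives in this Hugging Face repository, individual files can also be fetched programmatically with the `huggingface_hub` library. A minimal sketch, using the DepthAnythingV2-vits checkpoint path from the table below as the example file:

```python
from huggingface_hub import hf_hub_download

# Fetch one checkpoint file from this repository into the local
# HF cache and return its path. The filename matches the
# DepthAnythingV2-vits entry in the Metric Depth Estimation table.
ckpt_path = hf_hub_download(
    repo_id="CSIRORobotics/WildCross",
    filename="DepthAnythingV2/finetuned/vits.pth",
)
```

The same call with any other `filename` from the tables below retrieves the corresponding weights.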

### Visual Place Recognition
WildCross supports visual relocalization with sequential RGB imagery across challenging revisits, including reverse-direction traversals and long-term appearance changes. The benchmark includes cross-fold train/test splits for robust evaluation of generalization and in-domain adaptation.

For each model below we provide the weights for the original pre-trained model as well as models fine-tuned on our different data splits.

#### Checkpoints
| Model | Checkpoint Folder |
|------------|------------|
| NetVLAD | [Link](https://huggingface.co/CSIRORobotics/WildCross/tree/main/VPR/NetVLAD) |
| SALAD | [Link](https://huggingface.co/CSIRORobotics/WildCross/tree/main/VPR/SALAD) |
| BoQ | [Link](https://huggingface.co/CSIRORobotics/WildCross/tree/main/VPR/BoQ) |

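The models above each produce one global descriptor per image, and retrieval quality is commonly summarised as Recall@1 over nearest-neighbour matches. A toy NumPy sketch of that evaluation protocol (the descriptors, positions, and 25 m success radius are illustrative, not the benchmark's official settings):

```python
import numpy as np

def recall_at_1(query_desc, db_desc, query_pos, db_pos, radius=25.0):
    """Fraction of queries whose top-1 retrieved database entry lies
    within `radius` metres of the query's ground-truth position."""
    # L2-normalise so the dot product is cosine similarity.
    q = query_desc / np.linalg.norm(query_desc, axis=1, keepdims=True)
    d = db_desc / np.linalg.norm(db_desc, axis=1, keepdims=True)
    top1 = np.argmax(q @ d.T, axis=1)                 # best match per query
    dist = np.linalg.norm(query_pos - db_pos[top1], axis=1)
    return float(np.mean(dist <= radius))

# Toy example: 2 database places, 2 queries whose descriptors
# match the correct place.
db_desc = np.array([[1.0, 0.0], [0.0, 1.0]])
db_pos = np.array([[0.0, 0.0], [100.0, 0.0]])
query_desc = np.array([[0.9, 0.1], [0.1, 0.9]])
query_pos = np.array([[3.0, 4.0], [101.0, 1.0]])
print(recall_at_1(query_desc, db_desc, query_pos, db_pos))  # 1.0
```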
### LiDAR Place Recognition
WildCross is an extension of the original [Wild-Places](https://csiro-robotics.github.io/Wild-Places/) dataset for LiDAR place recognition, extending its evaluation setup with new splits of the original data. For LiDAR place recognition (LPR), code for training and evaluation can be found on the WildCross branch of the original [Wild-Places repository](https://github.com/csiro-robotics/Wild-Places/tree/WildCross_splits).

For each model below we provide model weights fine-tuned on our new data splits.

#### Checkpoints
| Model | Checkpoint Folder |
|------------|------------|
| LoGG3D-Net | [Link](https://huggingface.co/CSIRORobotics/WildCross/tree/main/LPR/LoGG3DNet) |
| MinkLoc3Dv2 | [Link](https://huggingface.co/CSIRORobotics/WildCross/tree/main/LPR/MinkLoc3Dv2) |
| HOTFormerLoc | [Link](https://huggingface.co/CSIRORobotics/WildCross/tree/main/LPR/HotFormerLoc) |

### Cross-Modal Place Recognition
Cross-modal place recognition (CMPR) in WildCross evaluates retrieval across sensing modalities, such as image-to-lidar localization. The synchronized RGB frames, accurate poses, and dense lidar submaps provide a strong testbed for cross-modal representation learning.

The checkpoints below provide Lip-Loc CMPR model weights using different backbones, fine-tuned on our different data splits.

#### Checkpoints
| Model | Checkpoint Folder |
|------------|------------|
| Lip-Loc (ResNet50) | [Link](https://huggingface.co/CSIRORobotics/WildCross/tree/main/crossmodal/resnet50) |
| Lip-Loc (Dino-v2) | [Link](https://huggingface.co/CSIRORobotics/WildCross/tree/main/crossmodal/dinov2) |
| Lip-Loc (Dino-v3) | [Link](https://huggingface.co/CSIRORobotics/WildCross/tree/main/crossmodal/dinov3) |

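Lip-Loc-style cross-modal models train the image and lidar encoders with a CLIP-style symmetric contrastive objective, so each image embedding scores highest against the lidar embedding of the same place. A toy NumPy sketch of that objective (batch size and temperature are illustrative, not Lip-Loc's actual hyper-parameters):

```python
import numpy as np

def log_softmax(x):
    x = x - x.max(axis=1, keepdims=True)  # subtract row max for stability
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def clip_style_loss(img_emb, lidar_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of matched image/lidar pairs:
    row i of each matrix embeds the same place."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    pc = lidar_emb / np.linalg.norm(lidar_emb, axis=1, keepdims=True)
    logits = img @ pc.T / temperature     # pairwise cosine similarities
    idx = np.arange(len(logits))          # diagonal entries are positives
    # Cross-entropy in both retrieval directions (image->lidar, lidar->image).
    i2l = -log_softmax(logits)[idx, idx].mean()
    l2i = -log_softmax(logits.T)[idx, idx].mean()
    return (i2l + l2i) / 2

# Perfectly aligned embeddings give a lower loss than shuffled ones.
aligned = np.eye(3)
loss_good = clip_style_loss(aligned, aligned)
loss_bad = clip_style_loss(aligned, np.roll(aligned, 1, axis=0))
```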
### Metric Depth Estimation
WildCross provides semi-dense metric depth and surface normal annotations for every frame, generated from accumulated global point clouds, accurate camera poses, and visibility filtering to remove occluded points. This supports training and benchmarking depth models in natural environments, where current methods face substantial domain-shift challenges.

The checkpoints below provide model weights for different DepthAnythingV2 models fine-tuned on WildCross data.

#### Checkpoints
| Model | Checkpoint File |
|------------|------------|
| DepthAnythingV2-vits | [Link](https://huggingface.co/CSIRORobotics/WildCross/resolve/main/DepthAnythingV2/finetuned/vits.pth) |
| DepthAnythingV2-vitb | [Link](https://huggingface.co/CSIRORobotics/WildCross/resolve/main/DepthAnythingV2/finetuned/vitb.pth) |
| DepthAnythingV2-vitl | [Link](https://huggingface.co/CSIRORobotics/WildCross/resolve/main/DepthAnythingV2/finetuned/vitl.pth) |

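Because the ground truth is semi-dense, depth predictions are normally scored only on pixels that carry a lidar-derived annotation, using errors such as AbsRel and RMSE. A minimal NumPy sketch of that masked scoring step (the arrays and the zero-means-invalid convention are illustrative, not the benchmark's exact evaluation code):

```python
import numpy as np

def depth_metrics(pred, gt):
    """AbsRel and RMSE over valid pixels only. Here `gt` uses 0 to mark
    pixels with no depth annotation (semi-dense ground truth)."""
    mask = gt > 0                           # score only annotated pixels
    p, g = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(p - g) / g)    # mean absolute relative error
    rmse = np.sqrt(np.mean((p - g) ** 2))   # root-mean-square error, metres
    return abs_rel, rmse

gt = np.array([[10.0, 0.0], [5.0, 20.0]])    # 0 = no ground truth
pred = np.array([[11.0, 3.0], [5.0, 18.0]])  # invalid pixel is ignored
abs_rel, rmse = depth_metrics(pred, gt)
```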

## BibTeX
If you find this repository useful or use the WildCross dataset in your work, please cite us using the following:
```
@inproceedings{wildcross2026,
  title={{WildCross: A Cross-Modal Large Scale Benchmark for Place Recognition and Metric Depth Estimation in Natural Environments}},
  author={Knights, Joshua and Reid, Joseph and Roy, Kaushik and Hall, David and Cox, Mark and Moghadam, Peyman},
  booktitle={Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year={2026}
}
```