license: mit
pipeline_tag: image-to-image

VISION: Prompting Ocean Vertical Velocity Reconstruction from Incomplete Observations

This model is presented in the paper VISION: Prompting Ocean Vertical Velocity Reconstruction from Incomplete Observations.

Abstract

Reconstructing subsurface ocean dynamics, such as vertical velocity fields, from incomplete surface observations poses a critical challenge in Earth science, a field long hampered by the lack of standardized, analysis-ready benchmarks. To systematically address this issue and catalyze research, we first build and release KD48, a high-resolution ocean dynamics benchmark derived from petascale simulations and curated with expert-driven denoising. Building on this benchmark, we introduce VISION, a novel reconstruction paradigm based on Dynamic Prompting designed to tackle the core problem of missing data in real-world observations. The essence of VISION lies in its ability to generate a visual prompt on-the-fly from any available subset of observations, which encodes both data availability and the ocean's physical state. More importantly, we design a State-conditioned Prompting module that efficiently injects this prompt into a universal backbone, endowed with geometry- and scale-aware operators, to guide its adaptive adjustment of computational strategies. This mechanism enables VISION to precisely handle the challenges posed by varying input combinations. Extensive experiments on the KD48 benchmark demonstrate that VISION not only substantially outperforms state-of-the-art models but also exhibits strong generalization under extreme data missing scenarios. By providing a high-quality benchmark and a robust model, our work establishes a solid infrastructure for ocean science research under data uncertainty.

Code

The official code is available in the GitHub repository: https://github.com/YuanGao-YG/VISION

Quick Start

Installation

CUDA version: 11.8

# git clone this repository
git clone https://github.com/YuanGao-YG/VISION.git
cd VISION

# create new anaconda env
conda env create -f environment.yml
conda activate vision

Inference

Pre-trained weights and test data are available on Hugging Face. Download them and place them in the same location.
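Before running inference, it can help to confirm that the downloaded files are where the scripts can find them. The following is a minimal sketch, assuming "the same location" means a single directory; the file names in the usage comment are hypothetical placeholders, not the actual asset names from the Hugging Face repository.

```python
from pathlib import Path


def missing_assets(directory, expected_names):
    """Return the expected file names that are not present in `directory`."""
    root = Path(directory)
    # Keep only the names that do not resolve to an existing regular file.
    return sorted(name for name in expected_names if not (root / name).is_file())


# Usage (placeholder names; substitute the files you actually downloaded):
#   missing = missing_assets(".", ["weights_placeholder.pth", "test_data_placeholder.npy"])
#   if missing:
#       raise SystemExit(f"Missing files: {missing}")
```

An empty list means every expected file was found in the directory.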

Inference using Incomplete Observations (SSH)

python inference_io_ssh_vision.py

Inference using Incomplete Observations (SSH + U + V)

python inference_io_ssh_u_v_vision.py

Inference using Complete Observations (SSH + U + V + B)

python inference_co_ssh_u_v_b_vision.py
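
The three commands above differ only in which observation variables are available. As a convenience, the choice can be wrapped in a small launcher. This is a sketch, not part of the official code: the script names are taken from this README, while `script_for`, `run_inference`, and the variable-set keys are illustrative names introduced here. It assumes it is run from the repository root.

```python
import subprocess
import sys

# Script names as listed in this README; the set-based keys are an
# illustrative convention, not something defined by the repository.
SCRIPTS = {
    frozenset({"SSH"}): "inference_io_ssh_vision.py",
    frozenset({"SSH", "U", "V"}): "inference_io_ssh_u_v_vision.py",
    frozenset({"SSH", "U", "V", "B"}): "inference_co_ssh_u_v_b_vision.py",
}


def script_for(variables):
    """Return the inference script matching the available observation variables."""
    key = frozenset(v.upper() for v in variables)
    try:
        return SCRIPTS[key]
    except KeyError:
        raise ValueError(f"no inference script for variables {sorted(key)}") from None


def run_inference(variables, dry_run=False):
    """Build the command for the matching script; execute it unless dry_run is set."""
    cmd = [sys.executable, script_for(variables)]
    if not dry_run:
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run(cmd, check=True)
    return cmd
```

For example, `run_inference(["SSH", "U", "V"])` launches the SSH + U + V script with the current Python interpreter, and `dry_run=True` returns the command without executing it. Variable combinations not covered by the three scripts raise a `ValueError`.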

Citation

If you find our work useful, please consider citing it:

@article{gao2025vision,
  title={VISION: Prompting Ocean Vertical Velocity Reconstruction from Incomplete Observations},
  author={Gao, Yuan and Wu, Hao and Wen, Qingsong and Wang, Kun and Wu, Xian and Huang, Xiaomeng},
  journal={arXiv preprint arXiv:2509.21477},
  year={2025},
  eprint={2509.21477},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2509.21477}
}