---
license: mit
pipeline_tag: image-feature-extraction
---

# GreenHyperSpectra: A multi-source hyperspectral dataset for global vegetation trait prediction

This repository contains models and data associated with the paper **[GreenHyperSpectra: A multi-source hyperspectral dataset for global vegetation trait prediction](https://huggingface.co/papers/2507.06806)**.

GreenHyperSpectra introduces a pretraining dataset of real-world cross-sensor and cross-ecosystem hyperspectral samples. This dataset is designed to benchmark trait prediction using semi- and self-supervised methods. The work demonstrates how leveraging GreenHyperSpectra can lead to label-efficient multi-output regression models that outperform state-of-the-art supervised baselines, significantly improving the learning of spectral representations for plant trait prediction.

All code and data for this project are available at the official GitHub repository: [https://github.com/GreenHyperSpectra/GreenHyperSpectra](https://github.com/GreenHyperSpectra/GreenHyperSpectra)