Inquiry regarding AIFS model usage on Hugging Face: grid output, soil moisture and ensemble std
I am very impressed with the capabilities of this model and am exploring its use for my research. However, I have a few questions regarding its implementation and output, and I would be grateful for your guidance.
My questions are as follows:
Ensemble Std: For the ensemble version of the AIFS model, I would like to understand the standard procedure for accessing or calculating the ensemble std of the predictions. Could you please provide some guidance or point me to the relevant documentation on how best to achieve this within the Hugging Face ecosystem?
Output of Soil Moisture Data: I am interested in obtaining soil moisture data from the AIFS model, and I have noticed a potential discrepancy in the available documentation. The page on the ECMWF website detailing the open data catalogue (https://www.ecmwf.int/en/forecasts/datasets/set-ix) seems to categorize soil moisture as a prognostic variable, but I only got 100 fields from the demo, and neither swv nor vsw is among them. Could you please clarify if and how it is possible to directly output soil moisture data from the AIFS model?
Grid Transformation: I have been working with the provided notebooks, and it appears that the latitude and longitude data of the model output do not align with a regular 0.25-degree grid. I was expecting the final output to be on a standard grid (like the file shown on your OpenData Portal). Is there a recommended procedure or a guide for transforming the output coordinates to a standard 0.25-degree grid? Any pointers to documentation or examples would be highly appreciated.
By the way, the notebook in ecmwf/aifs-single-1.0 is not working because of a conflict between the PyTorch and CUDA versions: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationESs
Thank you for your time and for developing this powerful tool for the community. I look forward to your response.
Hi,
1/ Calculating the ensemble std of the predictions is out of scope for the demo here on Hugging Face, which only aims to show you how to run the model for your own research purposes. To do it, you will need to run the model repeatedly to generate the number of ensemble members you are after, and over a significant period of time to produce an accurate sampling. Calculating the std with your analysis tool of choice is then up to you. However, be aware that running from the same initial conditions will produce a reduced spread, so you will need to retrieve the initial conditions of the various ensemble members from opendata.
2/ You seem to be looking at the AIFS-Single docs. https://www.ecmwf.int/en/forecasts/dataset/set-x is the AIFS-ENS page and correctly shows no soil moisture. Currently the AIFS-ENS does not produce soil moisture.
3/ The AIFS models run at N320, a reduced Gaussian grid. If you want data on the regular 0.25 x 0.25 degree lat/lon grid, you will need to regrid it. The notebook shows an example of regridding from 0.25 to N320, so simply reverse the process:
    values = ekr.interpolate(values, {"grid": "N320"}, {"grid": (0.25, 0.25)})
4/ That error looks like an issue on your side with an incompatibility between torch and CUDA. Make sure you are using torch==2.4.0 and an up-to-date CUDA.
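For point 1/, once the members have been generated, the std calculation itself could look something like the sketch below. This is only an illustration, not part of the demo notebook: it assumes you have already stacked one variable at one lead time into a NumPy array of shape (n_members, n_grid_points), and the function name ensemble_spread and the random stand-in data are hypothetical.

```python
import numpy as np


def ensemble_spread(members: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return (mean, std) across the member axis.

    members: array of shape (n_members, n_grid_points) holding one
    variable at one lead time. This layout is an assumption made for
    the example; the notebook does not produce it for you.
    """
    members = np.asarray(members, dtype=np.float64)
    if members.ndim != 2 or members.shape[0] < 2:
        raise ValueError(
            "expected shape (n_members, n_grid_points) with at least 2 members"
        )
    mean = members.mean(axis=0)
    # ddof=1 gives the sample standard deviation across members.
    std = members.std(axis=0, ddof=1)
    return mean, std


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in data: 8 fake members on 5 grid points, not real model output.
    fake_members = 280.0 + rng.normal(scale=1.5, size=(8, 5))
    mean, std = ensemble_spread(fake_members)
    print(mean.shape, std.shape)
```

The result is one std value per grid point, so it stays on the same N320 grid as the members and can be regridded as in point 3/ if needed.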
As I have not heard from you, I trust this has resolved your issue.
I will close it now, feel free to reopen if you need more assistance.