peterdudfield committed on
Commit b74423e · 1 Parent(s): 7b19f32

Delete experiments

Files changed (28)
  1. experiments/india/001_v1/india_pv_wind.md +0 -69
  2. experiments/india/002_wind_meteomatics/india_windnet_v2.md +0 -46
  3. experiments/india/003_wind_plevels/MAE.png +0 -3
  4. experiments/india/003_wind_plevels/MAEvstimesteps.png +0 -3
  5. experiments/india/003_wind_plevels/p10.png +0 -3
  6. experiments/india/003_wind_plevels/p50.png +0 -3
  7. experiments/india/003_wind_plevels/plevel.md +0 -54
  8. experiments/india/004_n_training_samples/log-plot.py +0 -14
  9. experiments/india/004_n_training_samples/mae_samples.png +0 -0
  10. experiments/india/004_n_training_samples/mae_step.png +0 -3
  11. experiments/india/004_n_training_samples/readme.md +0 -48
  12. experiments/india/005_extra_nwp_variables/mae_steps.png +0 -3
  13. experiments/india/005_extra_nwp_variables/mae_steps_grouped.png +0 -3
  14. experiments/india/005_extra_nwp_variables/readmd.md +0 -55
  15. experiments/india/006_da_only/bad.png +0 -3
  16. experiments/india/006_da_only/da_only.md +0 -37
  17. experiments/india/006_da_only/good.png +0 -3
  18. experiments/india/006_da_only/mae_steps.png +0 -3
  19. experiments/india/007_different_seeds/mae_all_steps.png +0 -3
  20. experiments/india/007_different_seeds/mae_steps.png +0 -3
  21. experiments/india/007_different_seeds/readme.md +0 -33
  22. experiments/india/008_coarse4/mae_step.png +0 -3
  23. experiments/india/008_coarse4/mae_step_smooth.png +0 -3
  24. experiments/india/008_coarse4/readme.md +0 -77
  25. experiments/mae_analysis.py +0 -152
  26. experiments/uk/011 - Extending forecast to 36 hours (updated ECMWF data)/PVNEt_national_XG_comparison.png +0 -3
  27. experiments/uk/011 - Extending forecast to 36 hours (updated ECMWF data)/PVNet_day_ahead.md +0 -22
  28. experiments/uk/011 - Extending forecast to 36 hours (updated ECMWF data)/PVNets_comparison.png +0 -3
experiments/india/001_v1/india_pv_wind.md DELETED
@@ -1,69 +0,0 @@
- # PVNet for Wind and PV Sites in India
-
- ## PVNet for sites
-
- ### Data
-
- We use PV generation data for India from April 2019 to Nov 2022 for training
- and Dec 2022 to Nov 2023 for validation. This uses only ECMWF NWP data and PV generation history.
-
- The forecast is every 15 minutes, out to 48 hours, for PV generation.
-
- The input NWP data is hourly, and 32x32 pixels (corresponding to around 320 km x 320 km) around a central
- point in NW India.
-
- [WandB Link](https://wandb.ai/openclimatefix/pvnet_india2.1/runs/o4xpvzrc)
-
- ### Results
-
- Overall MAE is 4.9% on the validation set, and the forecasts look good overall.
-
- ![batch_idx_1_all_892_2ca7e12db5de2cf2e244](https://github.com/openclimatefix/PVNet/assets/7170359/07e8199a-11b5-4400-9897-37b7738a4f39)
-
- ![W B Chart 05_02_2024, 10_07_12_pvnet](https://github.com/openclimatefix/PVNet/assets/7170359/abaefdc1-dedd-4a12-8a26-afaf36d7786b)
-
- ## WindNet
-
- ### April-29-2024 WindNet v1 Production Model
-
- [WandB Link](https://wandb.ai/openclimatefix/india/runs/5llq8iw6)
-
- Improvements: a larger input size (64x64) and a 7 hour delay for ECMWF NWP inputs, to match production.
- A new, much more efficient encoder for NWP allows for more filters and layers with fewer parameters.
- The 64x64 input size corresponds to 6.4 degrees x 6.4 degrees, which is around 700 km x 700 km. This allows the
- model to see the wind over the wind generation sites, which seems to be the biggest reason for the improvement in the model.
-
- MAE is 7.6%, with real improvements on the production side of things.
-
- There were other experiments with slightly different numbers of filters, model parameters and the like, but generally no
- improvements were seen.
-
- ## WindNet v1 Results
-
- ### Data
-
- We use wind generation data for India from April 2019 to Nov 2022 for training
- and Dec 2022 to Nov 2023 for validation. This uses only ECMWF data and wind generation history.
-
- The forecast is every 15 minutes, out to 48 hours, for wind generation.
-
- The input NWP data is hourly, and 32x32 pixels (corresponding to around 320 km x 320 km) around a central
- point in NW India. Note: the majority of the wind generation is likely not covered in the 320 km x 320 km area.
-
- [WandB Link](https://wandb.ai/openclimatefix/pvnet_india2.1/runs/otdx7axx)
-
- ### Results
-
- ![W B Chart 05_02_2024, 10_05_19](https://github.com/openclimatefix/PVNet/assets/7170359/6a8cd9c5-bdfe-41ab-996d-37fd1be2a07c)
-
- ![W B Chart 05_02_2024, 10_06_51_windnet](https://github.com/openclimatefix/PVNet/assets/7170359/77554ef0-4411-4432-af95-8530aef4a701)
-
- ![batch_idx_1_all_1730_379a9f881a7f01153f98](https://github.com/openclimatefix/PVNet/assets/7170359/243d9f3e-4cb9-405e-80c5-40c6c218c17f)
-
- MAE is around 10% overall, although the model doesn't seem to do very well on the ramps up and down.
experiments/india/002_wind_meteomatics/india_windnet_v2.md DELETED
@@ -1,46 +0,0 @@
- ### WindNet v2 Meteomatics + ECMWF Model
-
- [WandB Link](https://wandb.ai/openclimatefix/india/runs/v3mja33d)
-
- This newest experiment uses Meteomatics data in addition to ECMWF data. The Meteomatics data is at specific locations corresponding
- to the generation sites we know about. It is smartly downscaled ECMWF data, down to 15 minutes and at the few height levels we are
- interested in, primarily 10m, 100m, and 200m. The Meteomatics data is a semi-reanalysis, with each block of 6 hours coming from one forecast run.
- For example, in one day, hours 00-06 are from the 00 forecast run, and hours 06-12 are from the 06 forecast run. This is important to note:
- it is not a true reanalysis, but it also can't exactly match the live data, as any forecast steps beyond 6 hours are thrown away.
- This means these results should be taken as a best-case or better-than-best-case scenario, as every 6 hours, observations from the future
- are incorporated into the Meteomatics input data via the next NWP model run.
-
- For the purposes of WindNet, Meteomatics data is treated as sensor data that extends into the future.
- The model encodes the sensor information the same way as the historical PV, wind, and GSP generation, and has
- a simple, single attention head to encode the information. This is then concatenated with the rest of the data, as in
- previous experiments.
-
- This model also has an even larger ECMWF input size of 81x81 pixels, corresponding to around 810 km x 810 km.
- ![Screenshot_20240430_082855](https://github.com/openclimatefix/PVNet/assets/7170359/6981a088-8664-474b-bfea-c94c777fc119)
-
- MAE is 7.0% on the validation set, showing a slight improvement over the previous model.
-
- Comparison with the production model:
-
- | Timestep | Prod MAE % | No Meteomatics MAE % | Meteomatics MAE % |
- | --- | --- | --- | --- |
- | 0-0 minutes | 7.586 | 5.920 | 2.475 |
- | 15-15 minutes | 8.021 | 5.809 | 2.968 |
- | 30-45 minutes | 7.233 | 5.742 | 3.472 |
- | 45-60 minutes | 7.187 | 5.698 | 3.804 |
- | 60-120 minutes | 7.231 | 5.816 | 4.650 |
- | 120-240 minutes | 7.287 | 6.080 | 6.028 |
- | 240-360 minutes | 7.319 | 6.375 | 6.738 |
- | 360-480 minutes | 7.285 | 6.638 | 6.964 |
- | 480-720 minutes | 7.143 | 6.747 | 6.906 |
- | 720-1440 minutes | 7.380 | 7.207 | 6.962 |
- | 1440-2880 minutes | 7.904 | 7.507 | 7.507 |
-
- ![mae_per_timestep](https://github.com/openclimatefix/PVNet/assets/7170359/e3c942e8-65c6-4b95-8c51-f25d43e7a082)
-
- Example plot
-
- ![Screenshot_20240430_082937](https://github.com/openclimatefix/PVNet/assets/7170359/88db342e-bf82-414e-8255-5ad4af659fb8)
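The 6-hour block structure described above determines which forecast run supplies each Meteomatics value. A minimal sketch of that layout (the helper below is hypothetical, not part of the pipeline): hours 00-06 come from the 00 run, 06-12 from the 06 run, and so on, so the lead time of any value stays under 6 hours.

```python
from datetime import datetime


def meteomatics_block_run(ts: datetime) -> tuple[int, int]:
    """Return (forecast_run_hour, lead_time_hours) for a timestamp,
    assuming the semi-reanalysis layout described above: each 6 hour
    block of the day comes from the forecast run at its start."""
    run_hour = (ts.hour // 6) * 6  # 0, 6, 12 or 18
    return run_hour, ts.hour - run_hour
```

This makes the better-than-best-case caveat concrete: at hour 5 of a block the input was initialised only 5 hours earlier, whereas a live forecast keeps using a single run over its full horizon.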
experiments/india/003_wind_plevels/MAE.png DELETED

Git LFS Details

  • SHA256: b06d6f85c2ee708e9555969afd622353b950a744f604d6c31d3c32d9b1543c23
  • Pointer size: 131 Bytes
  • Size of remote file: 174 kB
experiments/india/003_wind_plevels/MAEvstimesteps.png DELETED

Git LFS Details

  • SHA256: 3646fe682b4d13b2e00d68cf6d19dec9d00e6c56cc4d3995c3903920b35b8707
  • Pointer size: 131 Bytes
  • Size of remote file: 219 kB
experiments/india/003_wind_plevels/p10.png DELETED

Git LFS Details

  • SHA256: cce6f27ce1bafc89e9b5cb75cc2dad7c1053bea931ea4f5dfa5a1ef404d1042b
  • Pointer size: 131 Bytes
  • Size of remote file: 150 kB
experiments/india/003_wind_plevels/p50.png DELETED

Git LFS Details

  • SHA256: ceae23a3f91f6bc56cf688bdbcaf5172f1a54736e412c5f0e80d8c056f7d9754
  • Pointer size: 131 Bytes
  • Size of remote file: 229 kB
experiments/india/003_wind_plevels/plevel.md DELETED
@@ -1,54 +0,0 @@
- # Running WindNet for RUVNL for different plevels
-
- https://wandb.ai/openclimatefix/india/runs/5llq8iw6 is the current production run.
- It has 7 plevels and a small patch size.
-
- ## Experiments
-
- 1. Only use plevel 50 (orange)
- https://wandb.ai/openclimatefix/india/runs/ziudzweq/
-
- 2. Use plevels of [2, 10, 25, 50, 75, 90, 98]. This is what is already used. (green)
- https://wandb.ai/openclimatefix/india/runs/xdlew7ib
-
- 3. Use plevels of [1, 2, 10, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 98, 99] (brown)
- https://wandb.ai/openclimatefix/india/runs/pcr2zsrc
-
- ## Training
-
- Each epoch took about ~4 hours, so the training runs took several days.
-
- TODO add number of samples
-
- ## Results
-
- The MAE results show that using plevel 50 only gives better results.
- ![](MAE.png "MAE")
-
- The p50 results are about the same.
- ![](p50.png "p50")
-
- We can see that for p10 the results are not right, as they should converge to 0.1.
- ![](p10.png "p10")
-
- Interestingly, the more plevels you have, the better the results are before 4 hours,
- but the fewer plevels you have, the better the results for >= 8 hours.
-
- | Timestep | P50 only MAE % | 7 plevels MAE % | 15 plevels MAE % | 7 plevels small patch MAE % |
- | --- | --- | --- | --- | --- |
- | 0-0 minutes | 5.416 | 5.920 | 3.933 | 7.586 |
- | 15-15 minutes | 5.458 | 5.809 | 4.003 | 8.021 |
- | 30-45 minutes | 5.525 | 5.742 | 4.442 | 7.233 |
- | 45-60 minutes | 5.595 | 5.698 | 4.772 | 7.187 |
- | 60-120 minutes | 5.890 | 5.816 | 5.307 | 7.231 |
- | 120-240 minutes | 6.423 | 6.080 | 6.275 | 7.287 |
- | 240-360 minutes | 6.608 | 6.375 | 6.707 | 7.319 |
- | 360-480 minutes | 6.728 | 6.638 | 6.904 | 7.285 |
- | 480-720 minutes | 6.634 | 6.747 | 6.872 | 7.143 |
- | 720-1440 minutes | 6.940 | 7.207 | 7.176 | 7.380 |
- | 1440-2880 minutes | 7.446 | 7.507 | 7.735 | 7.904 |
-
- ![](MAEvstimesteps.png "MAEvstimesteps")
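The p10 calibration claim above (the exceedance fraction should converge to 0.1) can be checked directly: count how often the observed value falls below the p10 forecast. A minimal sketch with synthetic data (all names are illustrative, not from the training code):

```python
import numpy as np


def empirical_coverage(y_true: np.ndarray, p10_pred: np.ndarray) -> float:
    """Fraction of observations below the p10 forecast; a well
    calibrated 10th-percentile output gives roughly 0.1."""
    return float(np.mean(y_true < p10_pred))


# Synthetic check: the 10th percentile of N(0, 1) is about -1.2816,
# so using it as a constant "p10 forecast" should give coverage near 0.1.
rng = np.random.default_rng(0)
y = rng.normal(size=100_000)
coverage = empirical_coverage(y, np.full_like(y, -1.2816))
```

Plotting this fraction per forecast step would show where the p10 curve above drifts away from 0.1.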
experiments/india/004_n_training_samples/log-plot.py DELETED
@@ -1,14 +0,0 @@
- """Small script to make an MAE vs number of batches plot."""
-
- import pandas as pd
- import plotly.graph_objects as go
-
- data = [[100, 7.779], [300, 7.441], [1000, 7.181], [3000, 7.180], [6711, 7.151]]
- df = pd.DataFrame(data, columns=["n_samples", "MAE [%]"])
-
- fig = go.Figure()
- fig.add_trace(go.Scatter(x=df["n_samples"], y=df["MAE [%]"], mode="lines+markers"))
- fig.update_layout(title="MAE % for N samples", xaxis_title="N Samples", yaxis_title="MAE %")
- # use a log scale on the x axis
- fig.update_xaxes(type="log")
- fig.show(renderer="browser")
experiments/india/004_n_training_samples/mae_samples.png DELETED
Binary file (77 kB)
 
experiments/india/004_n_training_samples/mae_step.png DELETED

Git LFS Details

  • SHA256: 3a3180a382e4b2c1534524f92a633d488912475a1e8a4effb0b28caf44368834
  • Pointer size: 131 Bytes
  • Size of remote file: 325 kB
experiments/india/004_n_training_samples/readme.md DELETED
@@ -1,48 +0,0 @@
- # N samples experiments
-
- Kicked off an experiment that limits training to N samples.
- This is done by adding `limit_train_batches` to the `trainer/default.yaml`.
-
- I checked that when limiting the batches, the same batches are shown to the model each epoch.
-
- ## Experiments
-
- The original run uses 6711 batches.
-
- - 100: 3p6scx2r
- - 300: am46tno1
- - 1000: u04xlb6p
- - 3000: p11lhreo
-
- ## Results
-
- Overall
-
- | Experiment | MAE % |
- |------------|-------|
- | 100 | 7.779 |
- | 300 | 7.441 |
- | 1000 | 7.181 |
- | 3000 | 7.180 |
- | 6711 | 7.151 |
-
- Results by timestep
-
- | Timestep | 100 MAE % | 300 MAE % | 1000 MAE % | 3000 MAE % | 6711 MAE % |
- | --- | --- | --- | --- | --- | --- |
- | 0-0 minutes | 7.985 | 7.453 | 7.155 | 5.553 | 5.920 |
- | 15-15 minutes | 7.953 | 7.055 | 6.923 | 5.453 | 5.809 |
- | 30-45 minutes | 8.043 | 7.172 | 6.907 | 5.764 | 5.742 |
- | 45-60 minutes | 7.850 | 7.070 | 6.790 | 5.815 | 5.698 |
- | 60-120 minutes | 7.698 | 6.809 | 6.597 | 5.890 | 5.816 |
- | 120-240 minutes | 7.355 | 6.629 | 6.495 | 6.221 | 6.080 |
- | 240-360 minutes | 7.230 | 6.729 | 6.559 | 6.541 | 6.375 |
- | 360-480 minutes | 7.415 | 6.997 | 6.770 | 6.855 | 6.638 |
- | 480-720 minutes | 7.258 | 7.037 | 6.668 | 6.876 | 6.747 |
- | 720-1440 minutes | 7.659 | 7.362 | 7.038 | 7.142 | 7.207 |
- | 1440-2880 minutes | 8.027 | 7.745 | 7.518 | 7.535 | 7.507 |
-
- ![](mae_step.png "mae_steps")
-
- ![](mae_samples.png "mae_samples")
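The batch-limiting mechanism described above maps onto PyTorch Lightning's `limit_train_batches` trainer flag; a sketch of the config change (the exact keys and layout of the real `trainer/default.yaml` may differ):

```yaml
# trainer/default.yaml (sketch; exact layout of the real file may differ)
limit_train_batches: 1000  # cap batches per epoch; omit to use all 6711
```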
experiments/india/005_extra_nwp_variables/mae_steps.png DELETED

Git LFS Details

  • SHA256: 0ef7f7af4dafe38aac5a5df6cc74acc606cb4f0a1a9fc78972b09d68dd7574ad
  • Pointer size: 131 Bytes
  • Size of remote file: 215 kB
experiments/india/005_extra_nwp_variables/mae_steps_grouped.png DELETED

Git LFS Details

  • SHA256: 547d3aafbb1658602fe03ea1677589de4e208467756e9ce9cd1d8727f364dffa
  • Pointer size: 131 Bytes
  • Size of remote file: 133 kB
experiments/india/005_extra_nwp_variables/readmd.md DELETED
@@ -1,55 +0,0 @@
- # Adding extra NWP variables
-
- I wanted to run WindNet while testing some new NWP variables from ECMWF.
-
- General conclusion, although more experiments could be done:
- the current NWP variables are about right.
- Adding lots of variables makes it worse,
- and taking some away also makes it worse.
-
- ## Bugs
-
- Ran into a problem where some samples had
- `d.__getitem__('nwp-ecmwf__init_time_utc').values` of size 50, where it should be just one value. I removed these examples. This might
-
- ## Experiments
-
- The number of samples was 8000 when training.
-
- ### 15 variables
- Run WindNet with `'hcc', 'lcc', 'mcc', 'prate', 'sde', 'sr', 't2m', 'tcc', 'u10',
- 'v10', 'u100', 'v100', 'u200', 'v200', 'dlwrf', 'dswrf'`.
-
- The experiment on wandb is [here](https://wandb.ai/openclimatefix/india/runs/k91rdffo)
-
- ### 7 variables
- Run WindNet with the original 7 variables:
- `t2m, u10, u100, u200, v10, v100, v200`
-
- The experiment on wandb is [here](https://wandb.ai/openclimatefix/india/runs/miszfep5)
-
- ### 3 variables
- Run WindNet with only `t, u10, v100`
-
- The experiment on wandb is [here](https://wandb.ai/openclimatefix/india/runs/22v3a39g)
-
- ## Results
-
- | Timestep | 15 MAE % | 7 MAE % | 3 MAE % |
- | --- | --- | --- | --- |
- | 0-0 minutes | 7.450 | 6.623 | 7.529 |
- | 15-15 minutes | 7.348 | 6.441 | 7.408 |
- | 30-45 minutes | 7.242 | 6.544 | 7.294 |
- | 45-60 minutes | 7.134 | 6.567 | 7.185 |
- | 60-120 minutes | 7.058 | 6.295 | 7.009 |
- | 120-240 minutes | 6.965 | 6.290 | 6.800 |
- | 240-360 minutes | 6.807 | 6.374 | 6.580 |
- | 360-480 minutes | 6.749 | 6.482 | 6.548 |
- | 480-720 minutes | 6.892 | 6.686 | 6.685 |
- | 720-1440 minutes | 7.020 | 6.756 | 6.780 |
- | 1440-2880 minutes | 7.445 | 7.095 | 7.214 |
-
- ![](mae_steps_grouped.png "mae_steps")
-
- The raw data is here:
- ![](mae_steps.png "mae_steps")
experiments/india/006_da_only/bad.png DELETED

Git LFS Details

  • SHA256: 37cbbf51e7fa7dceb8b2074419267b4bde8186ddcd40b4a49c085735fdf72e43
  • Pointer size: 131 Bytes
  • Size of remote file: 358 kB
experiments/india/006_da_only/da_only.md DELETED
@@ -1,37 +0,0 @@
- ## DA forecasts only
-
- The idea was to create a day-ahead (DA) only forecast for WindNet.
- We hoped this would bring down the DA MAE values.
-
- We do this by not forecasting the first X hours.
-
- Unfortunately, it does not look like ignoring the first X hours makes the DA forecast better.
-
- ## Experiments
-
- 1. Baseline - [here](https://wandb.ai/openclimatefix/india/runs/miszfep5)
- 2. Ignore first 6 hours - [here](https://wandb.ai/openclimatefix/india/runs/uosk0qug)
- 3. Ignore first 12 hours - [here](https://wandb.ai/openclimatefix/india/runs/s9cnn4ei)
-
- ## Results
-
- | Timestep | all MAE % | 6 MAE % | 12 MAE % |
- | --- | --- | --- | --- |
- | 0-0 minutes | nan | nan | nan |
- | 15-15 minutes | nan | nan | nan |
- | 30-45 minutes | 0.065 | nan | nan |
- | 45-60 minutes | 0.066 | nan | nan |
- | 60-120 minutes | 0.063 | nan | nan |
- | 120-240 minutes | 0.063 | nan | nan |
- | 240-360 minutes | 0.064 | nan | nan |
- | 360-480 minutes | 0.065 | 0.068 | nan |
- | 480-720 minutes | 0.067 | 0.065 | nan |
- | 720-1440 minutes | 0.068 | 0.065 | 0.065 |
- | 1440-2880 minutes | 0.071 | 0.071 | 0.071 |
-
- ![](mae_steps.png "mae_steps")
-
- Here are two examples from the 6 hour ignore model, one that it forecast well and one that it didn't:
-
- ![](bad.png "bad")
- ![](good.png "good")
experiments/india/006_da_only/good.png DELETED

Git LFS Details

  • SHA256: 5f4b6a11ac1560dbea1214ce381602b9eab7334a74110052dda072f0f53c3de8
  • Pointer size: 131 Bytes
  • Size of remote file: 424 kB
experiments/india/006_da_only/mae_steps.png DELETED

Git LFS Details

  • SHA256: 5ca49fbc24530c3d75d0ec5cd2ba6345082c1747a600143afc40faf7bade0cd6
  • Pointer size: 131 Bytes
  • Size of remote file: 122 kB
experiments/india/007_different_seeds/mae_all_steps.png DELETED

Git LFS Details

  • SHA256: b06eaa2f75d645185bea5b874d6020bae3bccd7de25ec519cf348cde511f27c6
  • Pointer size: 131 Bytes
  • Size of remote file: 203 kB
experiments/india/007_different_seeds/mae_steps.png DELETED

Git LFS Details

  • SHA256: 3adfaa5394e9f45c684812e47e385c25d1796a6c772d04f4e7a3cbcbeffafda3
  • Pointer size: 131 Bytes
  • Size of remote file: 130 kB
experiments/india/007_different_seeds/readme.md DELETED
@@ -1,33 +0,0 @@
- # Training models with different seeds
-
- We want to see the effect of training a model with different seeds.
-
- We can see that the results for different seeds can vary by 0.5%,
- with some models being better at different time horizons than others.
-
- ## Experiments
- - seed 1 - [miszfep5](https://wandb.ai/openclimatefix/india/runs/miszfep5)
- - seed 2 - [cxshv2q4](https://wandb.ai/openclimatefix/india/runs/cxshv2q4)
- - seed 3 - [m46wdrr7](https://wandb.ai/openclimatefix/india/runs/m46wdrr7)
-
- These were trained with 1000 batches, and 300 batches for validation.
-
- ## Results
-
- | Timestep | s1 MAE % | s2 MAE % | s3 MAE % |
- | --- | --- | --- | --- |
- | 0-0 minutes | 0.066 | 0.061 | 0.066 |
- | 15-15 minutes | 0.064 | 0.058 | 0.064 |
- | 30-45 minutes | 0.065 | 0.060 | 0.063 |
- | 45-60 minutes | 0.066 | 0.060 | 0.063 |
- | 60-120 minutes | 0.063 | 0.060 | 0.063 |
- | 120-240 minutes | 0.063 | 0.063 | 0.065 |
- | 240-360 minutes | 0.064 | 0.066 | 0.065 |
- | 360-480 minutes | 0.065 | 0.066 | 0.066 |
- | 480-720 minutes | 0.067 | 0.066 | 0.065 |
- | 720-1440 minutes | 0.068 | 0.068 | 0.066 |
- | 1440-2880 minutes | 0.071 | 0.072 | 0.071 |
-
- ![](mae_steps.png "mae_steps")
-
- ![](mae_all_steps.png "mae_steps")
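The roughly 0.5% seed-to-seed spread can be summarised per horizon as the range across seeds; a minimal sketch using the first two rows of the table above:

```python
import numpy as np

# MAE for seeds s1, s2, s3 at the first two timestep groups (from the table above)
seed_mae = np.array([
    [0.066, 0.061, 0.066],  # 0-0 minutes
    [0.064, 0.058, 0.064],  # 15-15 minutes
])
# Per-horizon spread between the best and worst seed
spread = seed_mae.max(axis=1) - seed_mae.min(axis=1)
```

Here the spread is 0.005-0.006 in fractional units, i.e. 0.5-0.6 percentage points, matching the variation noted above.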
experiments/india/008_coarse4/mae_step.png DELETED

Git LFS Details

  • SHA256: 52e85df6c2ed7865e0f6f412ae47e7e5f0a1b12550b72702ebe7e166dec53636
  • Pointer size: 131 Bytes
  • Size of remote file: 179 kB
experiments/india/008_coarse4/mae_step_smooth.png DELETED

Git LFS Details

  • SHA256: 38e2772ac0c28684a10f8fc98fc55afc0401a403d025ac6e3d97a9d328ab8624
  • Pointer size: 131 Bytes
  • Size of remote file: 140 kB
experiments/india/008_coarse4/readme.md DELETED
@@ -1,77 +0,0 @@
- # Coarser data and more examples
-
- We downsampled the ECMWF data from 0.05 degrees to 0.2 degrees.
- In previous experiments we used a 0.1 resolution, as this is the same as the live ECMWF data.
-
- By reducing the resolution we can increase the number of samples we have to train on.
- We used 41408 samples to train and 10352 samples to validate.
- This is approximately 5 times more samples than in the previous experiments.
-
- ## Experiments
-
- ### b8_s1
- Batch size 8, with 0.2 degree NWP data.
- https://wandb.ai/openclimatefix/india/runs/w85hftb6
-
- ### b8_s2
- Batch size 8, different seed, with 0.2 degree NWP data.
- https://wandb.ai/openclimatefix/india/runs/k4x1tunj
-
- ### b32_s3
- Batch size 32, with 0.2 degree NWP data. The learning rate was also kept a bit higher.
- https://wandb.ai/openclimatefix/india/runs/ktale7pa
-
- ### epochs
- We raised the early stopping epochs from 10 to 15. This should mean the model trains a bit more.
- https://wandb.ai/openclimatefix/india/runs/8hfc83uv
-
- ### small model
- We made the model about 50% of the size by reducing the channels in the NWP encoder from 256 to 64 and reducing the hidden features in the output network from 1024 to 256.
- https://wandb.ai/openclimatefix/india/runs/sk5ek3pk
-
- ### early stopping on MAE/val
- Changed from quantile_loss to MAE/val as the early stopping metric. This should mean the model does more training epochs, and stops on the metric we are actually interested in.
- https://wandb.ai/openclimatefix/india/runs/a5nkkzj6
-
- ### old
- Old experiment with 0.1 degree NWP data.
- https://wandb.ai/openclimatefix/india/runs/m46wdrr7.
- Note the validation batches are different from those in the experiments above.
-
- Interestingly, the GPU memory did not increase much between experiments 2 and 3.
- We need to check that batches of 32 were actually being passed through.
-
- ## Results
-
- Coarsening the data does seem to improve the results in the first 10 hours of the forecast.
- The DA forecast looks very similar. Note the 0 hour forecast has a large amount of variation.
-
- There are still spiky results in the individual runs.
-
- | Timestep | b8_s1 MAE % | b8_s2 MAE % | b32_s3 MAE % | epochs MAE % | small MAE % | mae/val MAE % | old MAE % |
- | --- | --- | --- | --- | --- | --- | --- | --- |
- | 0-0 minutes | 0.052 | 0.047 | 0.027 | 0.030 | 0.041 | 0.041 | 0.066 |
- | 15-15 minutes | 0.052 | 0.049 | 0.031 | 0.033 | 0.041 | 0.041 | 0.064 |
- | 30-45 minutes | 0.052 | 0.051 | 0.037 | 0.039 | 0.043 | 0.043 | 0.063 |
- | 45-60 minutes | 0.053 | 0.052 | 0.040 | 0.043 | 0.044 | 0.044 | 0.063 |
- | 60-120 minutes | 0.056 | 0.054 | 0.048 | 0.052 | 0.048 | 0.048 | 0.063 |
- | 120-240 minutes | 0.061 | 0.060 | 0.060 | 0.064 | 0.057 | 0.057 | 0.065 |
- | 240-360 minutes | 0.061 | 0.062 | 0.063 | 0.065 | 0.061 | 0.061 | 0.065 |
- | 360-480 minutes | 0.062 | 0.062 | 0.062 | 0.063 | 0.063 | 0.063 | 0.066 |
- | 480-720 minutes | 0.063 | 0.063 | 0.062 | 0.064 | 0.064 | 0.064 | 0.065 |
- | 720-1440 minutes | 0.065 | 0.066 | 0.065 | 0.067 | 0.066 | 0.066 | 0.066 |
- | 1440-2880 minutes | 0.069 | 0.070 | 0.071 | 0.071 | 0.071 | 0.071 | 0.071 |
-
- ![](mae_step.png "mae_steps")
-
- ![](mae_step_smooth.png "mae_steps")
-
- It's worth noting the model training MAE is around 3% and the validation MAE is about 7%, so there is good reason to believe that the model is overfit to the training set.
- It would be good to plot some of the training examples, to see if they are less spiky.
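The downsampling from 0.05 to 0.2 degrees is a factor of 4 in each dimension; a minimal block-mean sketch (the real pipeline may coarsen differently, e.g. with xarray's `coarsen`):

```python
import numpy as np


def coarsen(field: np.ndarray, factor: int = 4) -> np.ndarray:
    """Block-mean a 2D grid by `factor` in each dimension, e.g.
    0.05 degree -> 0.2 degree with factor=4. Assumes both grid
    dimensions are divisible by `factor`."""
    h, w = field.shape
    return field.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
```

Coarsening shrinks each sample by the square of the factor, which is what makes room for the roughly 5x larger training set described above.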
experiments/mae_analysis.py DELETED
@@ -1,152 +0,0 @@
- """
- Script to generate analysis of MAE values for multiple model forecasts.
-
- Does this for 48 hour horizon forecasts with 15 minute granularity.
- """
-
- import argparse
-
- import matplotlib
- import matplotlib.pyplot as plt
- import numpy as np
- import pandas as pd
- import wandb
-
- matplotlib.rcParams["axes.prop_cycle"] = matplotlib.cycler(
-     color=[
-         "#FFD053",  # yellow
-         "#7BCDF3",  # blue
-         "#63BCAF",  # teal
-         "#086788",  # dark blue
-         "#FF9736",  # dark orange
-         "#E4E4E4",  # grey
-         "#14120E",  # black
-         "#FFAC5F",  # orange
-         "#4C9A8E",  # dark teal
-     ]
- )
-
-
- def main(project: str, runs: list[str], run_names: list[str]) -> None:
-     """
-     Compare MAE values for multiple model forecasts for a 48 hour horizon with 15 minute granularity.
-
-     Args:
-         project: name of W&B project
-         runs: W&B ids of runs
-         run_names: user specified names for runs
-     """
-     api = wandb.Api()
-     dfs = []
-     epoch_num = []
-     for run in runs:
-         run = api.run(f"openclimatefix/{project}/{run}")
-
-         df = run.history(samples=run.lastHistoryStep + 1)
-         # Get the columns that are in the format 'MAE_horizon/step_<number>/val'
-         mae_cols = [col for col in df.columns if "MAE_horizon/step_" in col and "val" in col]
-         mae_cols.sort()
-         df = df[mae_cols]
-         # Drop all rows that are entirely NaN
-         df = df.dropna(how="all")
-         # Find the row (epoch) with the smallest mean MAE across all steps
-         min_row_mean = np.inf
-         min_row_idx = 0
-         for idx, (_, row) in enumerate(df.iterrows()):
-             if row.mean() < min_row_mean:
-                 min_row_mean = row.mean()
-                 min_row_idx = idx
-         df = df.iloc[min_row_idx]
-         # Get the timedelta (in minutes) for each step from the column name
-         column_timesteps = [int(col.split("_")[-1].split("/")[0]) * 15 for col in mae_cols]
-         dfs.append(df)
-         epoch_num.append(min_row_idx)
-     # Timestep groupings in minutes
-     groupings = [
-         [0, 0],
-         [15, 15],
-         [30, 45],
-         [45, 60],
-         [60, 120],
-         [120, 240],
-         [240, 360],
-         [360, 480],
-         [480, 720],
-         [720, 1440],
-         [1440, 2880],
-     ]
-
-     groups_df = []
-     grouping_starts = [grouping[0] for grouping in groupings]
-     header = "| Timestep |"
-     separator = "| --- |"
-     for run_name in run_names:
-         header += f" {run_name} MAE % |"
-         separator += " --- |"
-     print(header)
-     print(separator)
-     for grouping in groupings:
-         group_string = f"| {grouping[0]}-{grouping[1]} minutes |"
-         # Select indices from column_timesteps that are within the grouping, inclusive
-         group_idx = [
-             idx
-             for idx, timestep in enumerate(column_timesteps)
-             if grouping[0] <= timestep <= grouping[1]
-         ]
-         data_one_group = []
-         for df in dfs:
-             mean_row = df.iloc[group_idx].mean()
-             group_string += f" {mean_row:0.3f} |"
-             data_one_group.append(mean_row)
-         print(group_string)
-
-         groups_df.append(data_one_group)
-
-     groups_df = pd.DataFrame(groups_df, columns=run_names, index=grouping_starts)
-
-     for idx, df in enumerate(dfs):
-         print(f"{run_names[idx]}: {df.mean() * 100:0.3f}")
-
-     # Plot the error per timestep
-     plt.figure()
-     for idx, df in enumerate(dfs):
-         plt.plot(
-             column_timesteps, df, label=f"{run_names[idx]}, epoch: {epoch_num[idx]}", linestyle="-"
-         )
-     plt.legend()
-     plt.xlabel("Timestep (minutes)")
-     plt.ylabel("MAE %")
-     plt.title("MAE % for each timestep")
-     plt.savefig("mae_per_timestep.png")
-     plt.show()
-
-     # Plot the error per grouped timestep
-     plt.figure()
-     for idx, run_name in enumerate(run_names):
-         plt.plot(
-             groups_df[run_name],
-             label=f"{run_name}, epoch: {epoch_num[idx]}",
-             marker="o",
-             linestyle="-",
-         )
-     plt.legend()
-     plt.xlabel("Timestep (minutes)")
-     plt.ylabel("MAE %")
-     plt.title("MAE % for each grouped timestep")
-     plt.savefig("mae_per_grouped_timestep.png")
-     plt.show()
-
-
- if __name__ == "__main__":
-     parser = argparse.ArgumentParser()
-     parser.add_argument("--project", type=str, default="")
-     # Arguments that take a list of strings
-     parser.add_argument("--list_of_runs", nargs="+")
-     parser.add_argument("--run_names", nargs="+")
-     args = parser.parse_args()
-     main(args.project, args.list_of_runs, args.run_names)
experiments/uk/011 - Extending forecast to 36 hours (updated ECMWF data)/PVNEt_national_XG_comparison.png DELETED

Git LFS Details

  • SHA256: eab8cf00defbfb39a9d5b9cea319f1b78db8b05e2baec7ef80351dc37eb041c4
  • Pointer size: 131 Bytes
  • Size of remote file: 169 kB
experiments/uk/011 - Extending forecast to 36 hours (updated ECMWF data)/PVNet_day_ahead.md DELETED
@@ -1,22 +0,0 @@
- PVNet day ahead was retrained to produce a 36 hour forecast. It was given its [previous configuration](https://huggingface.co/openclimatefix/pvnet_uk_region/tree/main) and data, except for being given ECMWF NWP data with a longer forecast horizon (max 85 hours, with 37 hours given to the model). Longer-horizon UKV NWP data was not available at the time of training and will be a further addition in the future.
-
- **Results** \
- [The training run](https://wandb.ai/openclimatefix/pvnet_day_ahead_36_hours/runs/m4d3wlft/overview) had 3.15% normalised mean absolute error (NMAE) on validation data (100,000 samples from May 2022 to May 2023); [the previous training of PVNet day ahead](https://wandb.ai/openclimatefix/pvnet2.1/runs/2ghzwbxg/overview?) had similar results of 3.19% NMAE.
-
- ![](PVNets_comparison.png "PVNets comparison")
-
- When comparing the two versions of PVNet day ahead (the new version in green) by forecast accuracy at each step on the validation dataset samples, we see some small differences up to 33 hours, such as in the first few steps and between steps 5 and 10, which could be explained by differences in the samples seen and evaluated on between the two versions.
-
- However, the larger difference is an improvement toward the end of the forecast horizon, from 33 hours onwards. This is likely due to ECMWF data now being available for this period, where previously no NWP data was given past 33 hours, due to the forecast horizon of the previous NWP data and factoring in NWP initialization times and production delays.
-
- The UKV NWP data used in the model currently extends to 30 hours; we would expect a further reduction in error from 30+ hours when training with longer-horizon UKV data covering up to 36 hours.
-
- A very rough comparison is also plotted between these two PVNet model versions and the National XG model, which is currently used for day-ahead predictions in production.
-
- ![](PVNEt_national_XG_comparison.png "PVNets national XG comparison")
-
- This comparison is rough and should not be seen as fair, as the National XG numbers are just an estimate derived from backtest data on different time periods. However, it can show roughly what relative improvement could be achieved from replacing the National XG day ahead model with a PVNet day ahead model.
experiments/uk/011 - Extending forecast to 36 hours (updated ECMWF data)/PVNets_comparison.png DELETED

Git LFS Details

  • SHA256: e604d9b403293bbac688dc9c786cb4f0c70e1a9c6b78188a1e4f228ad0ae4b1b
  • Pointer size: 131 Bytes
  • Size of remote file: 160 kB