These locations are used for model training:
- (48.333056, 16.631944), # Vienna, Austria
- (64.1355, -21.8954), # Reykjavik, Iceland
- (35.1667, 33.3667), # Nicosia, Cyprus
- (59.3293, 18.0686), # Stockholm, Sweden
- (41.3851, 2.1734), # Barcelona, Spain
- (55.6761, 12.5683), # Copenhagen, Denmark
- (52.3676, 4.9041), # Amsterdam, Netherlands
- (48.8566, 2.3522), # Paris, France
- (37.9838, 23.7275), # Athens, Greece
- (53.3498, -6.2603), # Dublin, Ireland
- (50.0755, 14.4378), # Prague, Czech Republic
- (47.4979, 19.0402), # Budapest, Hungary
- (45.4408, 12.3155), # Venice, Italy
- (60.1695, 24.9354), # Helsinki, Finland
- (51.5074, -0.1278), # London, United Kingdom
- (40.4168, -3.7038), # Madrid, Spain
- (43.7102, 7.2620), # Nice, France
- (50.8503, 4.3517), # Brussels, Belgium
- (54.6872, 25.2797), # Vilnius, Lithuania
- (42.6977, 23.3219), # Sofia, Bulgaria
- (59.9139, 10.7522), # Oslo, Norway
- (46.2044, 6.1432), # Geneva, Switzerland
- (56.9496, 24.1052), # Riga, Latvia
- (44.4268, 26.1025), # Bucharest, Romania
- (48.1486, 17.1077), # Bratislava, Slovakia
- (44.7871, 20.4572), # Belgrade, Serbia
- (52.2297, 21.0122) # Warsaw, Poland

and these for model validation:
- (52.3676, 4.9041), # Amsterdam, Netherlands
- (54.6872, 25.2797), # Vilnius, Lithuania
- (59.9139, 10.7522), # Oslo, Norway
- (56.9496, 24.1052), # Riga, Latvia
- (48.1486, 17.1077), # Bratislava, Slovakia
- (44.7871, 20.4572) # Belgrade, Serbia

For each location, weather data from 2017-04-01 to 2024-04-01 were fetched from Open-Meteo. I then inspected and cleaned the data by handling NaN values and plotting these columns: T2M, RH2M, dew_point_2m, RR, pressure_msl, surface_pressure, global_tilted_irradiance.
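The NaN handling can be sketched roughly as follows. The source does not say which strategy was used, so time-based interpolation with a fill limit is an assumption, as are the helper name `clean_weather` and the `max_fill_hours` parameter. The sketch also assumes one location per DataFrame with a `DatetimeIndex`.

```python
import numpy as np
import pandas as pd

# Columns named in the text as inspected and cleaned.
COLUMNS = ["T2M", "RH2M", "dew_point_2m", "RR", "pressure_msl",
           "surface_pressure", "global_tilted_irradiance"]


def clean_weather(df: pd.DataFrame, max_fill_hours: int = 3) -> pd.DataFrame:
    """Hypothetical helper: interpolate NaNs, then drop rows still incomplete."""
    out = df.sort_index().copy()
    cols = [c for c in COLUMNS if c in out.columns]
    # Fill at most `max_fill_hours` consecutive NaNs per gap, and only
    # between valid observations (no extrapolation at the edges).
    out[cols] = out[cols].interpolate(
        method="time", limit=max_fill_hours, limit_area="inside"
    )
    # Rows that still contain NaN in any inspected column are removed.
    return out.dropna(subset=cols)
```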

After separating out the time index and applying feature engineering, the dataframe consists of these columns:

**fetched**
- 'T2M',
- 'RH2M',
- 'dew_point_2m',
- 'RR',
- 'pressure_msl',
- 'surface_pressure',
- 'cloud_cover',
- 'wind_speed_10m',
- 'wind_direction_10m',
- 'is_day',
- 'global_tilted_irradiance',
- 'lat',
- 'lon'

**time**
- 'hour',
- 'day_of_year',
- 'week_of_year',
- 'month',
- 'year',
- 'hour_sin',
- 'hour_cos',
- 'month_sin',
- 'month_cos',
- 'dayofyear_sin',
- 'dayofyear_cos'
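A minimal sketch of how the time columns above could be derived from a `DatetimeIndex`. The periods used for the cyclical encoding (24 hours, 12 months, 365.25 days) and the helper name `add_time_features` are assumptions; the source only lists the resulting column names.

```python
import numpy as np
import pandas as pd


def add_time_features(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: calendar columns plus sin/cos encodings."""
    out = df.copy()
    idx = out.index  # assumed to be a DatetimeIndex

    out["hour"] = idx.hour
    out["day_of_year"] = idx.dayofyear
    out["week_of_year"] = idx.isocalendar().week.astype("int64").to_numpy()
    out["month"] = idx.month
    out["year"] = idx.year

    # Cyclical encoding: map each periodic value onto the unit circle so
    # that, e.g., hour 23 and hour 0 end up close together.
    out["hour_sin"] = np.sin(2 * np.pi * out["hour"] / 24)
    out["hour_cos"] = np.cos(2 * np.pi * out["hour"] / 24)
    out["month_sin"] = np.sin(2 * np.pi * out["month"] / 12)
    out["month_cos"] = np.cos(2 * np.pi * out["month"] / 12)
    out["dayofyear_sin"] = np.sin(2 * np.pi * out["day_of_year"] / 365.25)
    out["dayofyear_cos"] = np.cos(2 * np.pi * out["day_of_year"] / 365.25)
    return out
```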

**interactions**
- 'temp_humidity_interaction',
- 'wind_rain_interaction',
- 'temp_rain_interaction',
- 'wind_temp_interaction'
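The source does not define how the interaction columns are computed. One common choice, shown here purely as an assumption, is the element-wise product of the two fetched columns each name suggests; `add_interactions` is a hypothetical helper name.

```python
import pandas as pd


def add_interactions(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: pairwise products (assumed definition)."""
    out = df.copy()
    out["temp_humidity_interaction"] = out["T2M"] * out["RH2M"]
    out["wind_rain_interaction"] = out["wind_speed_10m"] * out["RR"]
    out["temp_rain_interaction"] = out["T2M"] * out["RR"]
    out["wind_temp_interaction"] = out["wind_speed_10m"] * out["T2M"]
    return out
```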

**lags**
- 'T2M_lag_1hr',
- 'T2M_lag_3hr',
- 'T2M_lag_6hr',
- 'T2M_lag_12hr',
- 'T2M_lag_24hr',
- 'T2M_lag_48hr',
- 'RH2M_lag_1hr',
- 'RH2M_lag_3hr'
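The lag columns can be sketched with a per-location `shift`. This assumes hourly rows that are sorted by time within each location, so that shifting by `h` rows equals a lag of `h` hours, and it assumes locations are identified by the `lat` and `lon` columns. The helper name `add_lags` is hypothetical.

```python
import pandas as pd

T2M_LAGS_H = (1, 3, 6, 12, 24, 48)
RH2M_LAGS_H = (1, 3)


def add_lags(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: lagged T2M and RH2M, computed per location."""
    out = df.copy()
    # Group by location so a lag never reaches into another city's rows.
    by_loc = df.groupby(["lat", "lon"], sort=False)
    for h in T2M_LAGS_H:
        out[f"T2M_lag_{h}hr"] = by_loc["T2M"].shift(h)
    for h in RH2M_LAGS_H:
        out[f"RH2M_lag_{h}hr"] = by_loc["RH2M"].shift(h)
    return out
```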

**differences and events**
- 'T2M_diff',
- 'RH2M_diff',
- 'temp_drop_event',
- 'rain_event'
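A rough sketch of the difference and event columns. The source gives no definitions, so the one-step difference, the thresholds `drop_threshold` and `rain_threshold`, and the 0/1 encoding of the event flags are all assumptions, as is the helper name `add_diffs_and_events`.

```python
import pandas as pd


def add_diffs_and_events(df: pd.DataFrame,
                         drop_threshold: float = 2.0,
                         rain_threshold: float = 0.0) -> pd.DataFrame:
    """Hypothetical helper: hour-to-hour changes and binary event flags."""
    out = df.copy()
    by_loc = df.groupby(["lat", "lon"], sort=False)
    # Change relative to the previous row of the same location.
    out["T2M_diff"] = by_loc["T2M"].diff()
    out["RH2M_diff"] = by_loc["RH2M"].diff()
    # Assumed rule: flag a drop of at least `drop_threshold` degrees.
    out["temp_drop_event"] = (out["T2M_diff"] <= -drop_threshold).astype(int)
    # Assumed rule: flag any precipitation above `rain_threshold`.
    out["rain_event"] = (out["RR"] > rain_threshold).astype(int)
    return out
```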

**EWM**
- 'T2M_ewm_3h',
- 'T2M_ewm_6h',
- 'T2M_ewm_12h',
- 'T2M_ewm_24h',
- 'T2M_ewm_8760h'
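The exponentially weighted means could be computed along these lines. Interpreting the hour count in each column name as the `span` of `ewm`, and using `adjust=False`, are assumptions; the source may have used a different parameterisation such as a half-life. `add_ewm` is a hypothetical helper name, and hourly rows sorted by time per location are assumed.

```python
import pandas as pd

EWM_SPANS_H = (3, 6, 12, 24, 8760)


def add_ewm(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: per-location exponentially weighted T2M means."""
    out = df.copy()
    by_loc = df.groupby(["lat", "lon"], sort=False)["T2M"]
    for span in EWM_SPANS_H:
        # span -> smoothing factor alpha = 2 / (span + 1)
        out[f"T2M_ewm_{span}h"] = by_loc.transform(
            lambda s, span=span: s.ewm(span=span, adjust=False).mean()
        )
    return out
```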

**rolling window**
- '24h_rolling_mean',
- '24h_rolling_max',
- '24h_rolling_min',
- '24h_rolling_std',
- '365d_rolling_mean',
- '365d_rolling_max',
- '365d_rolling_min',
- '365d_rolling_std'
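A sketch of the rolling-window statistics. The source does not say which column they are computed on; `T2M` is assumed here because the lag and EWM features target it. The window sizes are expressed in rows (24 and 365 * 24), which assumes hourly data without gaps, and `min_periods=1` is likewise an assumption. `add_rolling` is a hypothetical helper name.

```python
import pandas as pd

WINDOWS = (("24h", 24), ("365d", 365 * 24))  # label, window length in rows
STATS = ("mean", "max", "min", "std")


def add_rolling(df: pd.DataFrame, column: str = "T2M") -> pd.DataFrame:
    """Hypothetical helper: per-location rolling statistics of one column."""
    out = df.copy()
    by_loc = df.groupby(["lat", "lon"], sort=False)[column]
    for label, window in WINDOWS:
        for stat in STATS:
            out[f"{label}_rolling_{stat}"] = by_loc.transform(
                lambda s, w=window, f=stat: s.rolling(w, min_periods=1).agg(f)
            )
    return out
```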

Afterwards, sequences for the model are created.
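Sequence creation is not specified further in the source. A minimal sliding-window sketch, assuming the rows of a single location are in time order, could look like this; the function name `make_sequences` and the parameters `seq_len` and `horizon` are hypothetical.

```python
import numpy as np


def make_sequences(features, targets, seq_len: int, horizon: int = 1):
    """Hypothetical helper: build (X, y) pairs from a sliding window.

    X[i] holds rows i .. i+seq_len-1 of `features`;
    y[i] is the target `horizon` steps after the end of that window.
    """
    features = np.asarray(features)
    targets = np.asarray(targets)
    n = len(features) - seq_len - horizon + 1
    if n <= 0:
        raise ValueError("not enough rows for the requested window")
    X = np.stack([features[i:i + seq_len] for i in range(n)])
    y = np.stack([targets[i + seq_len + horizon - 1] for i in range(n)])
    return X, y
```

With several locations in one dataframe, the function would be applied per location so that no window spans two cities.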