Analysis-Ready, Cloud Optimized ERA5

Recipes for reproducing Analysis-Ready & Cloud Optimized (ARCO) ERA5 datasets.

- Introduction
- Overview
- Analysis Ready Data
- Raw Cloud Optimized Data
- Project roadmap
- How to reproduce
- FAQs
- How to cite this work
- License

Introduction

Our goal is to make a global history of the climate highly accessible in the cloud. To that end, we present a curated copy of the ERA5 corpus in Google Cloud Public Datasets.

<details> <summary>What is ERA5?</summary>

ERA5 is the fifth generation of ECMWF's Atmospheric Reanalysis. It spans atmospheric, land, and ocean variables. ERA5 is an hourly dataset with global coverage at 30km resolution (~0.28° x 0.28°), ranging from 1979 to the present. The total ERA5 dataset is about 5 petabytes in size.

Check out ECMWF's documentation on ERA5 for more.

</details> <details> <summary>What is a reanalysis?</summary>

A reanalysis is the "most complete picture currently possible of past weather and climate." Reanalyses are created from assimilation of a wide range of data sources via numerical weather prediction (NWP) models.

Read ECMWF's introduction to reanalysis for more.

</details>

So far, we have ingested meteorologically valuable variables for the land and atmosphere. From this, we have produced a cloud-optimized version of ERA5, in which we have converted GRIB data to Zarr with no other modifications. In addition, we have created "analysis-ready" versions on regular lat-lon grids, oriented towards common research & ML workflows.

This two-pronged approach serves different user needs. Some researchers need full control over the interpolation of data for their analysis. Most will want a batteries-included dataset, where standard pre-processing and chunk optimization are already applied. In general, we ensure that every step in this pipeline is open and reproducible, to provide transparency in the provenance of all data.

Overview

| Location | Type | Description |
| --- | --- | --- |
| $BUCKET/ar/ | Analysis Ready | An ML-ready, unified (surface & atmospheric) version of the data in Zarr. |
| $BUCKET/co/ | Cloud Optimized | A port of gaussian-gridded ERA5 data to Zarr. |
| $BUCKET/raw/ | Raw Data | All raw GRIB & NetCDF data. |
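
The code examples in this README use `gs://gcp-public-data-arco-era5` as the bucket, so `$BUCKET` in the table above presumably stands for that path. As a minimal sketch under that assumption, the helper below builds a dataset URL from the three prefixes. `dataset_url` and the dictionary keys are hypothetical names introduced only for illustration.

```python
# Bucket name as it appears in the xarray examples in this README.
BUCKET = "gs://gcp-public-data-arco-era5"

# Prefixes from the overview table; the keys are hypothetical labels.
PREFIXES = {
    "analysis_ready": "ar",
    "cloud_optimized": "co",
    "raw": "raw",
}


def dataset_url(kind: str, name: str = "") -> str:
    """Join the bucket, the prefix for `kind`, and an optional dataset name."""
    prefix = PREFIXES[kind]
    return f"{BUCKET}/{prefix}/{name}"


print(dataset_url("analysis_ready", "full_37-1h-0p25deg-chunk-1.zarr-v3"))
```

The resulting string can be passed to `xarray.open_zarr` in the same way as the literal paths used in the examples.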

Analysis Ready Data

These datasets have been regridded to a uniform 0.25° equiangular horizontal resolution to facilitate downstream analyses, e.g., with WeatherBench2.
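
To make the grid concrete, here is a small sketch of the coordinates of a 0.25° equiangular grid. It assumes the grid includes both poles and does not repeat the 0°/360° meridian, which gives 721 latitudes by 1440 longitudes. `equiangular_grid` is a hypothetical helper, and the ordering of latitudes as stored in the datasets may differ.

```python
def equiangular_grid(step_deg: float = 0.25):
    """Return (lats, lons) for a regular grid with the given spacing in degrees."""
    n_lat = round(180 / step_deg) + 1  # assumption: both poles are grid points
    n_lon = round(360 / step_deg)      # 360° coincides with 0°, so it is not repeated
    lats = [-90 + i * step_deg for i in range(n_lat)]
    lons = [j * step_deg for j in range(n_lon)]
    return lats, lons


lats, lons = equiangular_grid()
print(len(lats), len(lons), len(lats) * len(lons))  # 721 1440 1038240
```

Under these assumptions, each single-level field holds roughly one million values per time step.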

0.25° Pressure and Surface Level Data

This dataset contains most pressure-level fields and all surface-level fields regridded to a uniform 0.25° resolution. It is a superset of the data used to train GraphCast and NeuralGCM.

import xarray

ds = xarray.open_zarr(
    'gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3',
    chunks=None,
    storage_options=dict(token='anon'),
)
ar_full_37_1h = ds.sel(time=slice(ds.attrs['valid_time_start'], ds.attrs['valid_time_stop']))
<details> <summary>Data summary table</summary>
| name | short name | units | docs |
| --- | --- | --- | --- |
| 100m_u_component_of_wind | u100 | m s**-1 | https://codes.ecmwf.int/grib/param-db/228246 |
| 100m_v_component_of_wind | v100 | m s**-1 | https://codes.ecmwf.int/grib/param-db/228247 |
| 10m_u_component_of_neutral_wind | u10n | m s**-1 | https://codes.ecmwf.int/grib/param-db/228131 |
| 10m_u_component_of_wind | u10 | m s**-1 | https://codes.ecmwf.int/grib/param-db/165 |
| 10m_v_component_of_neutral_wind | v10n | m s**-1 | https://codes.ecmwf.int/grib/param-db/228132 |
| 10m_v_component_of_wind | v10 | m s**-1 | https://codes.ecmwf.int/grib/param-db/166 |
| 10m_wind_gust_since_previous_post_processing | fg10 | m s**-1 | https://codes.ecmwf.int/grib/param-db/175049 |
| 2m_dewpoint_temperature | d2m | K | https://codes.ecmwf.int/grib/param-db/500018 |
| 2m_temperature | t2m | K | https://codes.ecmwf.int/grib/param-db/500013 |
| air_density_over_the_oceans | p140209 | kg m**-3 | https://codes.ecmwf.int/grib/param-db/140209 |
| angle_of_sub_gridscale_orography | anor | radians | https://codes.ecmwf.int/grib/param-db/162 |
| anisotropy_of_sub_gridscale_orography | isor | ~ | https://codes.ecmwf.int/grib/param-db/161 |
| benjamin_feir_index | bfi | dimensionless | https://codes.ecmwf.int/grib/param-db/140253 |
| boundary_layer_dissipation | bld | J m**-2 | https://codes.ecmwf.int/grib/param-db/145 |
| boundary_layer_height | blh | m | https://codes.ecmwf.int/grib/param-db/159 |
| charnock | chnk | ~ | https://codes.ecmwf.int/grib/param-db/148 |
| clear_sky_direct_solar_radiation_at_surface | cdir | J m**-2 | https://codes.ecmwf.int/grib/param-db/228022 |
| cloud_base_height | cbh | m | https://codes.ecmwf.int/grib/param-db/228023 |
| coefficient_of_drag_with_waves | cdww | dimensionless | https://codes.ecmwf.int/grib/param-db/140233 |
| convective_available_potential_energy | cape | J kg**-1 | https://codes.ecmwf.int/grib/param-db/59 |
| convective_inhibition | cin | J kg**-1 | https://codes.ecmwf.int/grib/param-db/228001 |
| convective_precipitation | cp | m | https://codes.ecmwf.int/grib/param-db/228143 |
| convective_rain_rate | crr | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/228218 |
| convective_snowfall | csf | m of water equivalent | https://codes.ecmwf.int/grib/param-db/239 |
| convective_snowfall_rate_water_equivalent | csfr | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/228220 |
| downward_uv_radiation_at_the_surface | uvb | J m**-2 | https://codes.ecmwf.int/grib/param-db/57 |
| duct_base_height | dctb | m | https://codes.ecmwf.int/grib/param-db/228017 |
| eastward_gravity_wave_surface_stress | lgws | N m**-2 s | https://codes.ecmwf.int/grib/param-db/195 |
| eastward_turbulent_surface_stress | ewss | N m**-2 s | https://codes.ecmwf.int/grib/param-db/180 |
| evaporation | e | m of water equivalent | https://codes.ecmwf.int/grib/param-db/182 |
| forecast_albedo | fal | (0 - 1) | https://codes.ecmwf.int/grib/param-db/243 |
| forecast_logarithm_of_surface_roughness_for_heat | flsr | ~ | https://codes.ecmwf.int/grib/param-db/245 |
| forecast_surface_roughness | fsr | m | https://codes.ecmwf.int/grib/param-db/244 |
| fraction_of_cloud_cover | cc | (0 - 1) | https://codes.ecmwf.int/grib/param-db/248 |
| free_convective_velocity_over_the_oceans | p140208 | m s**-1 | |
| friction_velocity | zust | m s**-1 | https://codes.ecmwf.int/grib/param-db/228003 |
| geopotential_at_surface | z | m2 s-2 | https://codes.ecmwf.int/grib/param-db/129 |
| gravity_wave_dissipation | gwd | J m**-2 | https://codes.ecmwf.int/grib/param-db/197 |
| high_cloud_cover | hcc | (0 - 1) | https://codes.ecmwf.int/grib/param-db/3075 |
| high_vegetation_cover | cvh | (0 - 1) | https://codes.ecmwf.int/grib/param-db/28 |
| ice_temperature_layer_1 | istl1 | K | https://codes.ecmwf.int/grib/param-db/35 |
| ice_temperature_layer_2 | istl2 | K | https://codes.ecmwf.int/grib/param-db/36 |
| ice_temperature_layer_3 | istl3 | K | https://codes.ecmwf.int/grib/param-db/37 |
| ice_temperature_layer_4 | istl4 | K | https://codes.ecmwf.int/grib/param-db/38 |
| instantaneous_10m_wind_gust | i10fg | m s**-1 | https://codes.ecmwf.int/grib/param-db/228029 |
| instantaneous_eastward_turbulent_surface_stress | iews | N m**-2 | https://codes.ecmwf.int/grib/param-db/229 |
| instantaneous_large_scale_surface_precipitation_fraction | ilspf | (0 - 1) | https://codes.ecmwf.int/grib/param-db/228217 |
| instantaneous_moisture_flux | ie | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/232 |
| instantaneous_northward_turbulent_surface_stress | inss | N m**-2 | https://codes.ecmwf.int/grib/param-db/230 |
| instantaneous_surface_sensible_heat_flux | ishf | W m**-2 | https://codes.ecmwf.int/grib/param-db/231 |
| k_index | kx | K | https://codes.ecmwf.int/grib/param-db/260121 |
| lake_bottom_temperature | lblt | K | https://codes.ecmwf.int/grib/param-db/228010 |
| lake_cover | cl | (0 - 1) | https://codes.ecmwf.int/grib/param-db/26 |
| lake_depth | dl | m | https://codes.ecmwf.int/grib/param-db/228007 |
| lake_ice_depth | licd | m | https://codes.ecmwf.int/grib/param-db/228014 |
| lake_ice_temperature | lict | K | https://codes.ecmwf.int/grib/param-db/228013 |
| lake_mix_layer_depth | lmld | m | https://codes.ecmwf.int/grib/param-db/228009 |
| lake_mix_layer_temperature | lmlt | K | https://codes.ecmwf.int/grib/param-db/228008 |
| lake_shape_factor | lshf | dimensionless | https://codes.ecmwf.int/grib/param-db/228012 |
| lake_total_layer_temperature | ltlt | K | https://codes.ecmwf.int/grib/param-db/228011 |
| land_sea_mask | lsm | (0 - 1) | https://codes.ecmwf.int/grib/param-db/172 |
| large_scale_precipitation | lsp | m | https://codes.ecmwf.int/grib/param-db/3062 |
| large_scale_precipitation_fraction | lspf | s | https://codes.ecmwf.int/grib/param-db/50 |
| large_scale_rain_rate | lsrr | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/228219 |
| large_scale_snowfall | lsf | m of water equivalent | https://codes.ecmwf.int/grib/param-db/240 |
| large_scale_snowfall_rate_water_equivalent | lssfr | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/228221 |
| leaf_area_index_high_vegetation | lai_hv | m2 m-2 | https://codes.ecmwf.int/grib/param-db/67 |
| leaf_area_index_low_vegetation | lai_lv | m2 m-2 | https://codes.ecmwf.int/grib/param-db/66 |
| low_cloud_cover | lcc | (0 - 1) | https://codes.ecmwf.int/grib/param-db/3073 |
| low_vegetation_cover | cvl | (0 - 1) | https://codes.ecmwf.int/grib/param-db/27 |
| maximum_2m_temperature_since_previous_post_processing | mx2t | K | https://codes.ecmwf.int/grib/param-db/201 |
| maximum_individual_wave_height | hmax | m | https://codes.ecmwf.int/grib/param-db/140218 |
| maximum_total_precipitation_rate_since_previous_post_processing | mxtpr | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/228226 |
| mean_boundary_layer_dissipation | mbld | W m**-2 | https://codes.ecmwf.int/grib/param-db/235032 |
| mean_convective_precipitation_rate | mcpr | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235030 |
| mean_convective_snowfall_rate | mcsr | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235056 |
| mean_direction_of_total_swell | mdts | degrees | https://codes.ecmwf.int/grib/param-db/140238 |
| mean_direction_of_wind_waves | mdww | degrees | https://codes.ecmwf.int/grib/param-db/500072 |
| mean_eastward_gravity_wave_surface_stress | megwss | N m**-2 | https://codes.ecmwf.int/grib/param-db/235045 |
| mean_eastward_turbulent_surface_stress | metss | N m**-2 | https://codes.ecmwf.int/grib/param-db/235041 |
| mean_evaporation_rate | mer | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235043 |
| mean_gravity_wave_dissipation | mgwd | W m**-2 | https://codes.ecmwf.int/grib/param-db/235047 |
| mean_large_scale_precipitation_fraction | mlspf | Proportion | https://codes.ecmwf.int/grib/param-db/235026 |
| mean_large_scale_precipitation_rate | mlspr | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235029 |
| mean_large_scale_snowfall_rate | mlssr | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235057 |
| mean_northward_gravity_wave_surface_stress | mngwss | N m**-2 | https://codes.ecmwf.int/grib/param-db/235046 |
| mean_northward_turbulent_surface_stress | mntss | N m**-2 | https://codes.ecmwf.int/grib/param-db/235042 |
| mean_period_of_total_swell | mpts | s | https://codes.ecmwf.int/grib/param-db/140239 |
| mean_period_of_wind_waves | mpww | s | https://codes.ecmwf.int/grib/param-db/500074 |
| mean_potential_evaporation_rate | mper | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235070 |
| mean_runoff_rate | mror | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235048 |
| mean_sea_level_pressure | msl | Pa | https://codes.ecmwf.int/grib/param-db/151 |
| mean_snow_evaporation_rate | mser | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235023 |
| mean_snowfall_rate | msr | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235031 |
| mean_snowmelt_rate | msmr | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235024 |
| mean_square_slope_of_waves | msqs | dimensionless | https://codes.ecmwf.int/grib/param-db/140244 |
| mean_sub_surface_runoff_rate | mssror | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235021 |
| mean_surface_direct_short_wave_radiation_flux | msdrswrf | W m**-2 | https://codes.ecmwf.int/grib/param-db/235058 |
| mean_surface_direct_short_wave_radiation_flux_clear_sky | msdrswrfcs | W m**-2 | https://codes.ecmwf.int/grib/param-db/235059 |
| mean_surface_downward_long_wave_radiation_flux | msdwlwrf | W m**-2 | https://codes.ecmwf.int/grib/param-db/235036 |
| mean_surface_downward_long_wave_radiation_flux_clear_sky | msdwlwrfcs | W m**-2 | https://codes.ecmwf.int/grib/param-db/235069 |
| mean_surface_downward_short_wave_radiation_flux | msdwswrf | W m**-2 | https://codes.ecmwf.int/grib/param-db/235035 |
| mean_surface_downward_short_wave_radiation_flux_clear_sky | msdwswrfcs | W m**-2 | https://codes.ecmwf.int/grib/param-db/235068 |
| mean_surface_downward_uv_radiation_flux | msdwuvrf | W m**-2 | https://codes.ecmwf.int/grib/param-db/235027 |
| mean_surface_latent_heat_flux | mslhf | W m**-2 | https://codes.ecmwf.int/grib/param-db/235034 |
| mean_surface_net_long_wave_radiation_flux | msnlwrf | W m**-2 | https://codes.ecmwf.int/grib/param-db/235038 |
| mean_surface_net_long_wave_radiation_flux_clear_sky | msnlwrfcs | W m**-2 | https://codes.ecmwf.int/grib/param-db/235052 |
| mean_surface_net_short_wave_radiation_flux | msnswrf | W m**-2 | https://codes.ecmwf.int/grib/param-db/235037 |
| mean_surface_net_short_wave_radiation_flux_clear_sky | msnswrfcs | W m**-2 | https://codes.ecmwf.int/grib/param-db/235051 |
| mean_surface_runoff_rate | msror | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235020 |
| mean_surface_sensible_heat_flux | msshf | W m**-2 | https://codes.ecmwf.int/grib/param-db/235033 |
| mean_top_downward_short_wave_radiation_flux | mtdwswrf | W m**-2 | https://codes.ecmwf.int/grib/param-db/235053 |
| mean_top_net_long_wave_radiation_flux | mtnlwrf | W m**-2 | https://codes.ecmwf.int/grib/param-db/235040 |
| mean_top_net_long_wave_radiation_flux_clear_sky | mtnlwrfcs | W m**-2 | https://codes.ecmwf.int/grib/param-db/235050 |
| mean_top_net_short_wave_radiation_flux | mtnswrf | W m**-2 | https://codes.ecmwf.int/grib/param-db/235039 |
| mean_top_net_short_wave_radiation_flux_clear_sky | mtnswrfcs | W m**-2 | https://codes.ecmwf.int/grib/param-db/235049 |
| mean_total_precipitation_rate | mtpr | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235055 |
| mean_vertical_gradient_of_refractivity_inside_trapping_layer | dndza | m**-1 | https://codes.ecmwf.int/grib/param-db/228016 |
| mean_vertically_integrated_moisture_divergence | mvimd | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/235054 |
| mean_wave_direction | mwd | Degree true | https://codes.ecmwf.int/grib/param-db/500185 |
| mean_wave_direction_of_first_swell_partition | p140122 | degrees | https://codes.ecmwf.int/grib/param-db/140122 |
| mean_wave_direction_of_second_swell_partition | p140125 | degrees | https://codes.ecmwf.int/grib/param-db/140125 |
| mean_wave_direction_of_third_swell_partition | p140128 | degrees | https://codes.ecmwf.int/grib/param-db/140128 |
| mean_wave_period | mwp | s | https://codes.ecmwf.int/grib/param-db/140232 |
| mean_wave_period_based_on_first_moment | mp1 | s | https://codes.ecmwf.int/grib/param-db/140220 |
| mean_wave_period_based_on_first_moment_for_swell | p1ps | s | https://codes.ecmwf.int/grib/param-db/140226 |
| mean_wave_period_based_on_first_moment_for_wind_waves | p1ww | s | https://codes.ecmwf.int/grib/param-db/140223 |
| mean_wave_period_based_on_second_moment_for_swell | p2ps | s | https://codes.ecmwf.int/grib/param-db/140227 |
| mean_wave_period_based_on_second_moment_for_wind_waves | p2ww | s | https://codes.ecmwf.int/grib/param-db/140224 |
| mean_wave_period_of_first_swell_partition | p140123 | s | https://codes.ecmwf.int/grib/param-db/140123 |
| mean_wave_period_of_second_swell_partition | p140126 | s | https://codes.ecmwf.int/grib/param-db/140126 |
| mean_wave_period_of_third_swell_partition | p140129 | s | https://codes.ecmwf.int/grib/param-db/140129 |
| mean_zero_crossing_wave_period | mp2 | s | https://codes.ecmwf.int/grib/param-db/140221 |
| medium_cloud_cover | mcc | (0 - 1) | https://codes.ecmwf.int/grib/param-db/3074 |
| minimum_2m_temperature_since_previous_post_processing | mn2t | K | https://codes.ecmwf.int/grib/param-db/202 |
| minimum_total_precipitation_rate_since_previous_post_processing | mntpr | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/228227 |
| minimum_vertical_gradient_of_refractivity_inside_trapping_layer | dndzn | m**-1 | https://codes.ecmwf.int/grib/param-db/228015 |
| model_bathymetry | wmb | m | https://codes.ecmwf.int/grib/param-db/140219 |
| near_ir_albedo_for_diffuse_radiation | alnid | (0 - 1) | https://codes.ecmwf.int/grib/param-db/18 |
| near_ir_albedo_for_direct_radiation | alnip | (0 - 1) | https://codes.ecmwf.int/grib/param-db/17 |
| normalized_energy_flux_into_ocean | phioc | dimensionless | https://codes.ecmwf.int/grib/param-db/140212 |
| normalized_energy_flux_into_waves | phiaw | dimensionless | https://codes.ecmwf.int/grib/param-db/140211 |
| normalized_stress_into_ocean | tauoc | dimensionless | https://codes.ecmwf.int/grib/param-db/140214 |
| northward_gravity_wave_surface_stress | mgws | N m**-2 s | https://codes.ecmwf.int/grib/param-db/196 |
| northward_turbulent_surface_stress | nsss | N m**-2 s | https://codes.ecmwf.int/grib/param-db/181 |
| ocean_surface_stress_equivalent_10m_neutral_wind_direction | dwi | degrees | https://codes.ecmwf.int/grib/param-db/140249 |
| ocean_surface_stress_equivalent_10m_neutral_wind_speed | wind | m s**-1 | https://codes.ecmwf.int/grib/param-db/140245 |
| ozone_mass_mixing_ratio | o3 | kg kg**-1 | https://codes.ecmwf.int/grib/param-db/500242 |
| peak_wave_period | pp1d | s | https://codes.ecmwf.int/grib/param-db/500190 |
| period_corresponding_to_maximum_individual_wave_height | tmax | s | https://codes.ecmwf.int/grib/param-db/140217 |
| potential_evaporation | pev | m | https://codes.ecmwf.int/grib/param-db/228251 |
| potential_vorticity | pv | K m2 kg-1 s**-1 | https://codes.ecmwf.int/grib/param-db/60 |
| precipitation_type | ptype | code table (4.201) | https://codes.ecmwf.int/grib/param-db/260015 |
| runoff | ro | m | https://codes.ecmwf.int/grib/param-db/228205 |
| sea_ice_cover | siconc | (0 - 1) | https://codes.ecmwf.int/grib/param-db/262001 |
| sea_surface_temperature | sst | K | https://codes.ecmwf.int/grib/param-db/151159 |
| significant_height_of_combined_wind_waves_and_swell | swh | m | https://codes.ecmwf.int/grib/param-db/500071 |
| significant_height_of_total_swell | shts | m | https://codes.ecmwf.int/grib/param-db/140237 |
| significant_height_of_wind_waves | shww | m | https://codes.ecmwf.int/grib/param-db/500073 |
| significant_wave_height_of_first_swell_partition | p140121 | m | https://codes.ecmwf.int/grib/param-db/140121 |
| significant_wave_height_of_second_swell_partition | p140124 | m | https://codes.ecmwf.int/grib/param-db/140124 |
| significant_wave_height_of_third_swell_partition | p140127 | m | https://codes.ecmwf.int/grib/param-db/140127 |
| skin_reservoir_content | src | m of water equivalent | https://codes.ecmwf.int/grib/param-db/198 |
| skin_temperature | skt | K | https://codes.ecmwf.int/grib/param-db/235 |
| slope_of_sub_gridscale_orography | slor | ~ | https://codes.ecmwf.int/grib/param-db/163 |
| snow_albedo | asn | (0 - 1) | https://codes.ecmwf.int/grib/param-db/228032 |
| snow_density | rsn | kg m**-3 | https://codes.ecmwf.int/grib/param-db/33 |
| snow_depth | sd | m of water equivalent | https://codes.ecmwf.int/grib/param-db/228141 |
| snow_evaporation | es | m of water equivalent | https://codes.ecmwf.int/grib/param-db/44 |
| snowfall | sf | m of water equivalent | https://codes.ecmwf.int/grib/param-db/228144 |
| snowmelt | smlt | m of water equivalent | https://codes.ecmwf.int/grib/param-db/45 |
| soil_temperature_level_1 | stl1 | K | https://codes.ecmwf.int/grib/param-db/139 |
| soil_temperature_level_2 | stl2 | K | https://codes.ecmwf.int/grib/param-db/170 |
| soil_temperature_level_3 | stl3 | K | https://codes.ecmwf.int/grib/param-db/183 |
| soil_temperature_level_4 | stl4 | K | https://codes.ecmwf.int/grib/param-db/236 |
| soil_type | slt | ~ | https://codes.ecmwf.int/grib/param-db/43 |
| specific_cloud_ice_water_content | ciwc | kg kg**-1 | https://codes.ecmwf.int/grib/param-db/247 |
| specific_cloud_liquid_water_content | clwc | kg kg**-1 | https://codes.ecmwf.int/grib/param-db/246 |
| specific_humidity | q | kg kg**-1 | https://codes.ecmwf.int/grib/param-db/133 |
| standard_deviation_of_filtered_subgrid_orography | sdfor | m | https://codes.ecmwf.int/grib/param-db/74 |
| standard_deviation_of_orography | sdor | m | https://codes.ecmwf.int/grib/param-db/160 |
| sub_surface_runoff | ssro | m | https://codes.ecmwf.int/grib/param-db/9 |
| surface_latent_heat_flux | slhf | J m**-2 | https://codes.ecmwf.int/grib/param-db/147 |
| surface_net_solar_radiation | ssr | J m**-2 | https://codes.ecmwf.int/grib/param-db/180176 |
| surface_net_solar_radiation_clear_sky | ssrc | J m**-2 | https://codes.ecmwf.int/grib/param-db/210 |
| surface_net_thermal_radiation | str | J m**-2 | https://codes.ecmwf.int/grib/param-db/180177 |
| surface_net_thermal_radiation_clear_sky | strc | J m**-2 | https://codes.ecmwf.int/grib/param-db/211 |
| surface_pressure | sp | Pa | https://codes.ecmwf.int/grib/param-db/500026 |
| surface_runoff | sro | m | https://codes.ecmwf.int/grib/param-db/174008 |
| surface_sensible_heat_flux | sshf | J m**-2 | https://codes.ecmwf.int/grib/param-db/146 |
| surface_solar_radiation_downward_clear_sky | ssrdc | J m**-2 | https://codes.ecmwf.int/grib/param-db/228129 |
| surface_solar_radiation_downwards | ssrd | J m**-2 | https://codes.ecmwf.int/grib/param-db/169 |
| surface_thermal_radiation_downward_clear_sky | strdc | J m**-2 | https://codes.ecmwf.int/grib/param-db/228130 |
| surface_thermal_radiation_downwards | strd | J m**-2 | https://codes.ecmwf.int/grib/param-db/175 |
| temperature | t | K | https://codes.ecmwf.int/grib/param-db/500014 |
| temperature_of_snow_layer | tsn | K | https://codes.ecmwf.int/grib/param-db/238 |
| toa_incident_solar_radiation | tisr | J m**-2 | https://codes.ecmwf.int/grib/param-db/212 |
| top_net_solar_radiation | tsr | J m**-2 | https://codes.ecmwf.int/grib/param-db/180178 |
| top_net_solar_radiation_clear_sky | tsrc | J m**-2 | https://codes.ecmwf.int/grib/param-db/208 |
| top_net_thermal_radiation | ttr | J m**-2 | https://codes.ecmwf.int/grib/param-db/180179 |
| top_net_thermal_radiation_clear_sky | ttrc | J m**-2 | https://codes.ecmwf.int/grib/param-db/209 |
| total_cloud_cover | tcc | (0 - 1) | https://codes.ecmwf.int/grib/param-db/228164 |
| total_column_cloud_ice_water | tciw | kg m**-2 | https://codes.ecmwf.int/grib/param-db/79 |
| total_column_cloud_liquid_water | tclw | kg m**-2 | https://codes.ecmwf.int/grib/param-db/78 |
| total_column_ozone | tco3 | kg m**-2 | https://codes.ecmwf.int/grib/param-db/206 |
| total_column_rain_water | tcrw | kg m**-2 | https://codes.ecmwf.int/grib/param-db/228089 |
| total_column_snow_water | tcsw | kg m**-2 | https://codes.ecmwf.int/grib/param-db/228090 |
| total_column_supercooled_liquid_water | tcslw | kg m**-2 | https://codes.ecmwf.int/grib/param-db/228088 |
| total_column_water | tcw | kg m**-2 | https://codes.ecmwf.int/grib/param-db/136 |
| total_column_water_vapour | tcwv | kg m**-2 | https://codes.ecmwf.int/grib/param-db/137 |
| total_precipitation | tp | m | https://codes.ecmwf.int/grib/param-db/228228 |
| total_sky_direct_solar_radiation_at_surface | fdir | J m**-2 | https://codes.ecmwf.int/grib/param-db/228021 |
| total_totals_index | totalx | K | https://codes.ecmwf.int/grib/param-db/260123 |
| trapping_layer_base_height | tplb | m | https://codes.ecmwf.int/grib/param-db/228018 |
| trapping_layer_top_height | tplt | m | https://codes.ecmwf.int/grib/param-db/228019 |
| type_of_high_vegetation | tvh | ~ | https://codes.ecmwf.int/grib/param-db/30 |
| type_of_low_vegetation | tvl | ~ | https://codes.ecmwf.int/grib/param-db/29 |
| u_component_of_wind | u | m s**-1 | https://codes.ecmwf.int/grib/param-db/500028 |
| u_component_stokes_drift | ust | m s**-1 | https://codes.ecmwf.int/grib/param-db/140215 |
| uv_visible_albedo_for_diffuse_radiation | aluvd | (0 - 1) | https://codes.ecmwf.int/grib/param-db/16 |
| uv_visible_albedo_for_direct_radiation | aluvp | (0 - 1) | https://codes.ecmwf.int/grib/param-db/15 |
| v_component_of_wind | v | m s**-1 | https://codes.ecmwf.int/grib/param-db/500030 |
| v_component_stokes_drift | vst | m s**-1 | https://codes.ecmwf.int/grib/param-db/140216 |
| vertical_integral_of_divergence_of_cloud_frozen_water_flux | p80.162 | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/162057 |
| vertical_integral_of_divergence_of_cloud_liquid_water_flux | p79.162 | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/162056 |
| vertical_integral_of_divergence_of_geopotential_flux | p85.162 | W m**-2 | https://codes.ecmwf.int/grib/param-db/162085 |
| vertical_integral_of_divergence_of_kinetic_energy_flux | p82.162 | W m**-2 | https://codes.ecmwf.int/grib/param-db/162082 |
| vertical_integral_of_divergence_of_mass_flux | p81.162 | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/162081 |
| vertical_integral_of_divergence_of_moisture_flux | p84.162 | kg m**-2 s**-1 | |
| vertical_integral_of_divergence_of_ozone_flux | p87.162 | kg m**-2 s**-1 | https://codes.ecmwf.int/grib/param-db/162087 |
| vertical_integral_of_divergence_of_thermal_energy_flux | p83.162 | W m**-2 | https://codes.ecmwf.int/grib/param-db/162083 |
| vertical_integral_of_divergence_of_total_energy_flux | p86.162 | W m**-2 | https://codes.ecmwf.int/grib/param-db/162086 |
| vertical_integral_of_eastward_cloud_frozen_water_flux | p90.162 | kg m**-1 s**-1 | |
| vertical_integral_of_eastward_cloud_liquid_water_flux | p88.162 | kg m**-1 s**-1 | |
| vertical_integral_of_eastward_geopotential_flux | p73.162 | W m**-1 | https://codes.ecmwf.int/grib/param-db/162073 |
| vertical_integral_of_eastward_heat_flux | p69.162 | W m**-1 | https://codes.ecmwf.int/grib/param-db/162069 |
| vertical_integral_of_eastward_kinetic_energy_flux | p67.162 | W m**-1 | https://codes.ecmwf.int/grib/param-db/162067 |
| vertical_integral_of_eastward_mass_flux | p65.162 | kg m**-1 s**-1 | https://codes.ecmwf.int/grib/param-db/162065 |
| vertical_integral_of_eastward_ozone_flux | p77.162 | kg m**-1 s**-1 | https://codes.ecmwf.int/grib/param-db/162077 |
| vertical_integral_of_eastward_total_energy_flux | p75.162 | W m**-1 | https://codes.ecmwf.int/grib/param-db/162075 |
| vertical_integral_of_eastward_water_vapour_flux | p71.162 | kg m**-1 s**-1 | https://codes.ecmwf.int/grib/param-db/162071 |
| vertical_integral_of_energy_conversion | p64.162 | W m**-2 | https://codes.ecmwf.int/grib/param-db/162064 |
| vertical_integral_of_kinetic_energy | p59.162 | J m**-2 | |
| vertical_integral_of_mass_of_atmosphere | p53.162 | kg m**-2 | |
| vertical_integral_of_mass_tendency | p92.162 | kg m**-2 s**-1 | |
| vertical_integral_of_northward_cloud_frozen_water_flux | p91.162 | kg m**-1 s**-1 | |
| vertical_integral_of_northward_cloud_liquid_water_flux | p89.162 | kg m**-1 s**-1 | |
| vertical_integral_of_northward_geopotential_flux | p74.162 | W m**-1 | https://codes.ecmwf.int/grib/param-db/162074 |
| vertical_integral_of_northward_heat_flux | p70.162 | W m**-1 | https://codes.ecmwf.int/grib/param-db/162070 |
| vertical_integral_of_northward_kinetic_energy_flux | p68.162 | W m**-1 | https://codes.ecmwf.int/grib/param-db/162068 |
| vertical_integral_of_northward_mass_flux | p66.162 | kg m**-1 s**-1 | https://codes.ecmwf.int/grib/param-db/162066 |
| vertical_integral_of_northward_ozone_flux | p78.162 | kg m**-1 s**-1 | https://codes.ecmwf.int/grib/param-db/162078 |
| vertical_integral_of_northward_total_energy_flux | p76.162 | W m**-1 | https://codes.ecmwf.int/grib/param-db/162076 |
| vertical_integral_of_northward_water_vapour_flux | p72.162 | kg m**-1 s**-1 | https://codes.ecmwf.int/grib/param-db/162072 |
| vertical_integral_of_potential_and_internal_energy | p61.162 | J m**-2 | |
| vertical_integral_of_potential_internal_and_latent_energy | p62.162 | J m**-2 | https://codes.ecmwf.int/grib/param-db/162062 |
| vertical_integral_of_temperature | p54.162 | K kg m**-2 | https://codes.ecmwf.int/grib/param-db/162054 |
| vertical_integral_of_thermal_energy | p60.162 | J m**-2 | |
| vertical_integral_of_total_energy | p63.162 | J m**-2 | |
| vertical_velocity | w | Pa s**-1 | https://codes.ecmwf.int/grib/param-db/500032 |
| vertically_integrated_moisture_divergence | vimd | kg m**-2 | https://codes.ecmwf.int/grib/param-db/213 |
| volumetric_soil_water_layer_1 | swvl1 | m3 m-3 | https://codes.ecmwf.int/grib/param-db/39 |
| volumetric_soil_water_layer_2 | swvl2 | m3 m-3 | https://codes.ecmwf.int/grib/param-db/40 |
| volumetric_soil_water_layer_3 | swvl3 | m3 m-3 | https://codes.ecmwf.int/grib/param-db/41 |
| volumetric_soil_water_layer_4 | swvl4 | m3 m-3 | https://codes.ecmwf.int/grib/param-db/42 |
| wave_spectral_directional_width | wdw | radians | https://codes.ecmwf.int/grib/param-db/140222 |
| wave_spectral_directional_width_for_swell | dwps | radians | https://codes.ecmwf.int/grib/param-db/140228 |
| wave_spectral_directional_width_for_wind_waves | dwww | radians | https://codes.ecmwf.int/grib/param-db/140225 |
| wave_spectral_kurtosis | wsk | dimensionless | https://codes.ecmwf.int/grib/param-db/140252 |
| wave_spectral_peakedness | wsp | dimensionless | https://codes.ecmwf.int/grib/param-db/140254 |
| wave_spectral_skewness | wss | dimensionless | https://codes.ecmwf.int/grib/param-db/140207 |
| zero_degree_level | deg0l | m | https://codes.ecmwf.int/grib/param-db/228024 |
</details>
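
Each loading snippet in this README trims the dataset to the range between its `valid_time_start` and `valid_time_stop` attributes. xarray's label-based `slice` includes both endpoints. The plain-Python sketch below mimics that selection on a hypothetical six-hour time axis; the timestamps and attribute values are made up for illustration, and the real attribute format may differ.

```python
from datetime import datetime, timedelta

# Hypothetical hourly time axis and attribute values, for illustration only.
times = [datetime(2020, 1, 1) + timedelta(hours=h) for h in range(6)]
attrs = {
    "valid_time_start": "2020-01-01T01:00",
    "valid_time_stop": "2020-01-01T03:00",
}

start = datetime.fromisoformat(attrs["valid_time_start"])
stop = datetime.fromisoformat(attrs["valid_time_stop"])

# Label-based slices in xarray include both endpoints, so keep start <= t <= stop.
valid_times = [t for t in times if start <= t <= stop]

print([t.hour for t in valid_times])  # [1, 2, 3]
```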

0.25° Model Level Data

This dataset contains 3D fields at 0.25° resolution with ERA5's native vertical coordinates (hybrid pressure/sigma coordinates).

import xarray

ds = xarray.open_zarr(
    'gs://gcp-public-data-arco-era5/ar/model-level-1h-0p25deg.zarr-v1',
    chunks=None,
    storage_options=dict(token='anon'),
)
ar_native_vertical_grid_data = ds.sel(time=slice(ds.attrs['valid_time_start'], ds.attrs['valid_time_stop']))

It can be combined with surface-level variables from the 0.25° pressure- and surface-level dataset:

ds = xarray.open_zarr(
    'gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3',
    chunks=None,
    storage_options=dict(token='anon'),
)
ar_full_37_1h = ds.sel(time=slice(ds.attrs['valid_time_start'], ds.attrs['valid_time_stop']))

ar_model_level_and_surface_data = xarray.merge([
    ar_native_vertical_grid_data, ar_full_37_1h.drop_dims('level')
])
<details> <summary>Data summary table</summary>
| name | short name | units | docs | config |
| --- | --- | --- | --- | --- |
| vorticity (relative) | vo | s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=138 | era5_ml_dve.cfg |
| divergence | d | s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=155 | era5_ml_dve.cfg |
| geopotential | z | m^2 s^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=129 | era5_sfc.cfg |
| temperature | t | K | https://apps.ecmwf.int/codes/grib/param-db?id=130 | era5_ml_tw.cfg |
| vertical velocity | w | Pa s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=135 | era5_ml_tw.cfg |
| specific humidity | q | kg kg^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=133 | era5_ml_o3q.cfg |
| ozone mass mixing ratio | o3 | kg kg^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=203 | era5_ml_o3q.cfg |
| specific cloud liquid water content | clwc | kg kg^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=246 | era5_ml_o3q.cfg |
| specific cloud ice water content | ciwc | kg kg^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=247 | era5_ml_o3q.cfg |
| fraction of cloud cover | cc | (0 - 1) | https://apps.ecmwf.int/codes/grib/param-db?id=248 | era5_ml_o3q.cfg |
| specific rain water content | crwc | kg kg^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=75 | era5_ml_qrqs.cfg |
| specific snow water content | cswc | kg kg^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=76 | era5_ml_qrqs.cfg |
| u component of wind | u | m s**-1 | https://codes.ecmwf.int/grib/param-db/500028 | era5_pl_hourly.cfg |
| v component of wind | v | m s**-1 | https://codes.ecmwf.int/grib/param-db/500030 | era5_pl_hourly.cfg |
</details>
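
In ECMWF's hybrid vertical coordinate, the pressure at each half level is `a + b * p_s`, where `p_s` is surface pressure and `a`, `b` are per-level coefficients; the full-level pressure is the mean of the two adjacent half levels. The sketch below illustrates this with made-up coefficients for a three-layer column. They are NOT ECMWF's published values, the function names are hypothetical, and nothing here assumes the coefficients are stored in the dataset.

```python
def half_level_pressures(a, b, surface_pressure):
    """Pressure (Pa) at each half level: p = a + b * p_s."""
    return [a_k + b_k * surface_pressure for a_k, b_k in zip(a, b)]


def full_level_pressures(half):
    """Pressure (Pa) at each full level: mean of the adjacent half levels."""
    return [0.5 * (upper + lower) for upper, lower in zip(half[:-1], half[1:])]


# Made-up coefficients for a toy 3-layer column (not ECMWF's values).
a = [0.0, 5000.0, 2000.0, 0.0]  # Pa
b = [0.0, 0.25, 0.5, 1.0]       # dimensionless
p_s = 100000.0                  # surface pressure in Pa

half = half_level_pressures(a, b, p_s)
full = full_level_pressures(half)
print(half)  # [0.0, 30000.0, 52000.0, 100000.0]
print(full)  # [15000.0, 41000.0, 76000.0]
```

Near the top of the column `b` is zero and levels follow constant pressure; near the surface `a` is zero and levels follow the terrain through `p_s`.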

Raw Cloud Optimized Data

These datasets contain the raw data used to produce the Analysis Ready data. Whenever possible, parameters are represented at their native grid resolution. See this ECMWF documentation for more.

Please check out our walkthrough notebook for a demo of these cloud-optimized datasets.

Model Level Wind

This dataset contains model-level wind fields on ERA5's native grid, as spherical harmonic coefficients.

import xarray

ds = xarray.open_zarr(
    'gs://gcp-public-data-arco-era5/co/model-level-wind.zarr-v2',
    chunks=None,
    storage_options=dict(token='anon'),
)
model_level_wind = ds.sel(time=slice(ds.attrs['valid_time_start'], ds.attrs['valid_time_stop']))
<details> <summary>Data summary table</summary>
| name | short name | units | docs | config |
| --- | --- | --- | --- | --- |
| vorticity (relative) | vo | s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=138 | era5_ml_dve.cfg |
| divergence | d | s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=155 | era5_ml_dve.cfg |
| temperature | t | K | https://apps.ecmwf.int/codes/grib/param-db?id=130 | era5_ml_tw.cfg |
| vertical velocity | w | Pa s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=135 | era5_ml_tw.cfg |
</details>

Model Level Moisture

This dataset contains model-level moisture fields on ERA5's native reduced Gaussian grid.

import xarray

ds = xarray.open_zarr(
    'gs://gcp-public-data-arco-era5/co/model-level-moisture.zarr-v2/',
    chunks=None,
    storage_options=dict(token='anon'),
)
model_level_moisture = ds.sel(time=slice(ds.attrs['valid_time_start'], ds.attrs['valid_time_stop']))
<details> <summary>Data summary table</summary>
| name | short name | units | docs | config |
| --- | --- | --- | --- | --- |
| specific humidity | q | kg kg^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=133 | era5_ml_o3q.cfg |
| ozone mass mixing ratio | o3 | kg kg^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=203 | era5_ml_o3q.cfg |
| specific cloud liquid water content | clwc | kg kg^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=246 | era5_ml_o3q.cfg |
| specific cloud ice water content | ciwc | kg kg^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=247 | era5_ml_o3q.cfg |
| fraction of cloud cover | cc | (0 - 1) | https://apps.ecmwf.int/codes/grib/param-db?id=248 | era5_ml_o3q.cfg |
| specific rain water content | crwc | kg kg^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=75 | era5_ml_qrqs.cfg |
| specific snow water content | cswc | kg kg^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=76 | era5_ml_qrqs.cfg |
</details>

Single Level Surface

This dataset contains single-level reanalysis fields on ERA5's native grid, as spherical harmonic coefficients.

import xarray

ds = xarray.open_zarr(
    'gs://gcp-public-data-arco-era5/co/single-level-surface.zarr-v2/',
    chunks=None,
    storage_options=dict(token='anon'),
)
single_level_surface = ds.sel(time=slice(ds.attrs['valid_time_start'], ds.attrs['valid_time_stop']))
<details> <summary>Data summary table</summary>
| name | short name | units | docs | config |
| --- | --- | --- | --- | --- |
| logarithm of surface pressure | lnsp | Numeric | https://apps.ecmwf.int/codes/grib/param-db?id=152 | era5_ml_lnsp.cfg |
| surface geopotential | zs | m^2 s^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=162051 | era5_ml_zs.cfg |
</details>
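
Because `lnsp` is the natural logarithm of surface pressure, surface pressure in Pa can be recovered with an exponential once the field is available as grid-point values. This dataset stores spherical harmonic coefficients, so a transform to grid space is needed first and is not shown here. A minimal sketch, with a hypothetical helper name:

```python
import math


def surface_pressure_from_lnsp(lnsp: float) -> float:
    """Convert a grid-point value of ln(surface pressure) to pressure in Pa."""
    return math.exp(lnsp)


# Round trip with a typical sea-level pressure, purely as an illustration.
lnsp_value = math.log(101325.0)
print(round(surface_pressure_from_lnsp(lnsp_value), 3))
```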

Single Level Reanalysis

This dataset contains single-level reanalysis fields on ERA5's native reduced Gaussian grid.

import xarray

ds = xarray.open_zarr(
    'gs://gcp-public-data-arco-era5/co/single-level-reanalysis.zarr-v2',
    chunks=None,
    storage_options=dict(token='anon'),
)
single_level_reanalysis = ds.sel(time=slice(ds.attrs['valid_time_start'], ds.attrs['valid_time_stop']))
<details> <summary>Data summary table</summary>
| name | short name | units | docs | config |
|---|---|---|---|---|
| convective available potential energy | cape | J kg^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=59 | era5_sfc_cape.cfg |
| total column cloud ice water | tciw | kg m^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=79 | era5_sfc_cape.cfg |
| vertical integral of divergence of cloud frozen water flux | wiiwd | kg m^-2 s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=162080 | era5_sfc_cape.cfg |
| 100 metre U wind component | 100u | m s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=228246 | era5_sfc_cape.cfg |
| 100 metre V wind component | 100v | m s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=228247 | era5_sfc_cape.cfg |
| sea ice area fraction | ci | (0 - 1) | https://apps.ecmwf.int/codes/grib/param-db?id=31 | era5_sfc_cisst.cfg |
| sea surface temperature | sst | K | https://apps.ecmwf.int/codes/grib/param-db?id=34 | era5_sfc_cisst.cfg |
| skin temperature | skt | K | https://apps.ecmwf.int/codes/grib/param-db?id=235 | era5_sfc_cisst.cfg |
| soil temperature level 1 | stl1 | K | https://apps.ecmwf.int/codes/grib/param-db?id=139 | era5_sfc_soil.cfg |
| soil temperature level 2 | stl2 | K | https://apps.ecmwf.int/codes/grib/param-db?id=170 | era5_sfc_soil.cfg |
| soil temperature level 3 | stl3 | K | https://apps.ecmwf.int/codes/grib/param-db?id=183 | era5_sfc_soil.cfg |
| soil temperature level 4 | stl4 | K | https://apps.ecmwf.int/codes/grib/param-db?id=236 | era5_sfc_soil.cfg |
| temperature of snow layer | tsn | K | https://apps.ecmwf.int/codes/grib/param-db?id=238 | era5_sfc_soil.cfg |
| volumetric soil water layer 1 | swvl1 | m^3 m^-3 | https://apps.ecmwf.int/codes/grib/param-db?id=39 | era5_sfc_soil.cfg |
| volumetric soil water layer 2 | swvl2 | m^3 m^-3 | https://apps.ecmwf.int/codes/grib/param-db?id=40 | era5_sfc_soil.cfg |
| volumetric soil water layer 3 | swvl3 | m^3 m^-3 | https://apps.ecmwf.int/codes/grib/param-db?id=41 | era5_sfc_soil.cfg |
| volumetric soil water layer 4 | swvl4 | m^3 m^-3 | https://apps.ecmwf.int/codes/grib/param-db?id=42 | era5_sfc_soil.cfg |
| ice temperature layer 1 | istl1 | K | https://apps.ecmwf.int/codes/grib/param-db?id=35 | era5_sfc_soil.cfg |
| ice temperature layer 2 | istl2 | K | https://apps.ecmwf.int/codes/grib/param-db?id=36 | era5_sfc_soil.cfg |
| ice temperature layer 3 | istl3 | K | https://apps.ecmwf.int/codes/grib/param-db?id=37 | era5_sfc_soil.cfg |
| ice temperature layer 4 | istl4 | K | https://apps.ecmwf.int/codes/grib/param-db?id=38 | era5_sfc_soil.cfg |
| total column cloud liquid water | tclw | kg m^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=78 | era5_sfc_tcol.cfg |
| total column rain water | tcrw | kg m^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=228089 | era5_sfc_tcol.cfg |
| total column snow water | tcsw | kg m^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=228090 | era5_sfc_tcol.cfg |
| total column water | tcw | kg m^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=136 | era5_sfc_tcol.cfg |
| total column vertically-integrated water vapour | tcwv | kg m^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=137 | era5_sfc_tcol.cfg |
| Geopotential | z | m^2 s^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=129 | era5_sfc.cfg |
| Surface pressure | sp | Pa | https://apps.ecmwf.int/codes/grib/param-db?id=134 | era5_sfc.cfg |
| Total column vertically-integrated water vapour | tcwv | kg m^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=137 | era5_sfc.cfg |
| Mean sea level pressure | msl | Pa | https://apps.ecmwf.int/codes/grib/param-db?id=151 | era5_sfc.cfg |
| Total cloud cover | tcc | (0 - 1) | https://apps.ecmwf.int/codes/grib/param-db?id=164 | era5_sfc.cfg |
| 10 metre U wind component | 10u | m s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=165 | era5_sfc.cfg |
| 10 metre V wind component | 10v | m s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=166 | era5_sfc.cfg |
| 2 metre temperature | 2t | K | https://apps.ecmwf.int/codes/grib/param-db?id=167 | era5_sfc.cfg |
| 2 metre dewpoint temperature | 2d | K | https://apps.ecmwf.int/codes/grib/param-db?id=168 | era5_sfc.cfg |
| Low cloud cover | lcc | (0 - 1) | https://apps.ecmwf.int/codes/grib/param-db?id=186 | era5_sfc.cfg |
| Medium cloud cover | mcc | (0 - 1) | https://apps.ecmwf.int/codes/grib/param-db?id=187 | era5_sfc.cfg |
| High cloud cover | hcc | (0 - 1) | https://apps.ecmwf.int/codes/grib/param-db?id=188 | era5_sfc.cfg |
| 100 metre U wind component | 100u | m s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=228246 | era5_sfc.cfg |
| 100 metre V wind component | 100v | m s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=228247 | era5_sfc.cfg |
</details>

Single Level Forecast

This dataset contains single-level forecast fields on ERA5's native reduced Gaussian grid.

import xarray

ds = xarray.open_zarr(
    'gs://gcp-public-data-arco-era5/co/single-level-forecast.zarr-v2/', 
    chunks=None,
    storage_options=dict(token='anon'),
)
single_level_forecasts = ds.sel(time=slice(ds.attrs['valid_time_start'], ds.attrs['valid_time_stop']))
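The three single-level snippets in this section differ only in the store name, so they can be folded into one helper. The sketch below is a convenience wrapper, not part of this repository: `store_url` and `open_valid` are hypothetical names, and `xarray` is imported lazily so that the URL helper can be used on its own.

```python
# Hypothetical helper; the bucket path and store names are copied from the
# snippets in this section, everything else is an illustrative assumption.
CO_ROOT = "gs://gcp-public-data-arco-era5/co"
SINGLE_LEVEL_STORES = (
    "single-level-surface",
    "single-level-reanalysis",
    "single-level-forecast",
)


def store_url(name: str) -> str:
    """Build the Zarr URL for one of the single-level stores listed above."""
    if name not in SINGLE_LEVEL_STORES:
        raise ValueError(f"unknown store {name!r}; expected one of {SINGLE_LEVEL_STORES}")
    return f"{CO_ROOT}/{name}.zarr-v2"


def open_valid(name: str):
    """Open a store anonymously and keep only the time range marked as valid."""
    import xarray  # imported here so store_url() works without xarray installed

    ds = xarray.open_zarr(
        store_url(name),
        chunks=None,
        storage_options=dict(token="anon"),
    )
    # Same selection as the snippets above: trim to the valid time window
    # recorded in the dataset attributes.
    return ds.sel(time=slice(ds.attrs["valid_time_start"], ds.attrs["valid_time_stop"]))
```

Calling `open_valid("single-level-forecast")` should then behave like the snippet above, assuming network access to the public bucket.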
<details> <summary>Data summary table</summary>
| name | short name | units | docs | config |
|---|---|---|---|---|
| snow density | rsn | kg m^-3 | https://apps.ecmwf.int/codes/grib/param-db?id=33 | era5_sfc_pcp.cfg |
| snow evaporation | es | m of water equivalent | https://apps.ecmwf.int/codes/grib/param-db?id=44 | era5_sfc_pcp.cfg |
| snow melt | smlt | m of water equivalent | https://apps.ecmwf.int/codes/grib/param-db?id=45 | era5_sfc_pcp.cfg |
| large-scale precipitation fraction | lspf | s | https://apps.ecmwf.int/codes/grib/param-db?id=50 | era5_sfc_pcp.cfg |
| snow depth | sd | m of water equivalent | https://apps.ecmwf.int/codes/grib/param-db?id=141 | era5_sfc_pcp.cfg |
| large-scale precipitation | lsp | m | https://apps.ecmwf.int/codes/grib/param-db?id=142 | era5_sfc_pcp.cfg |
| convective precipitation | cp | m | https://apps.ecmwf.int/codes/grib/param-db?id=143 | era5_sfc_pcp.cfg |
| snowfall | sf | m of water equivalent | https://apps.ecmwf.int/codes/grib/param-db?id=144 | era5_sfc_pcp.cfg |
| convective rain rate | crr | kg m^-2 s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=228218 | era5_sfc_pcp.cfg |
| large scale rain rate | lsrr | kg m^-2 s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=228219 | era5_sfc_pcp.cfg |
| convective snowfall rate water equivalent | csfr | kg m^-2 s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=228220 | era5_sfc_pcp.cfg |
| large scale snowfall rate water equivalent | lssfr | kg m^-2 s^-1 | https://apps.ecmwf.int/codes/grib/param-db?id=228221 | era5_sfc_pcp.cfg |
| total precipitation | tp | m | https://apps.ecmwf.int/codes/grib/param-db?id=228 | era5_sfc_pcp.cfg |
| convective snowfall | csf | m of water equivalent | https://apps.ecmwf.int/codes/grib/param-db?id=239 | era5_sfc_pcp.cfg |
| large-scale snowfall | lsf | m of water equivalent | https://apps.ecmwf.int/codes/grib/param-db?id=240 | era5_sfc_pcp.cfg |
| precipitation type | ptype | code table (4.201) | https://apps.ecmwf.int/codes/grib/param-db?id=260015 | era5_sfc_pcp.cfg |
| surface solar radiation downwards | ssrd | J m^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=169 | era5_sfc_rad.cfg |
| top net thermal radiation | ttr | J m^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=179 | era5_sfc_rad.cfg |
| gravity wave dissipation | gwd | J m^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=197 | era5_sfc_rad.cfg |
| surface thermal radiation downwards | strd | J m^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=175 | era5_sfc_rad.cfg |
| surface net thermal radiation | str | J m^-2 | https://apps.ecmwf.int/codes/grib/param-db?id=177 | era5_sfc_rad.cfg |
</details>
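
Several forecast fields in the table above are accumulations rather than rates: the radiation fields (for example ssrd, strd, str, ttr) are in J m^-2 and precipitation totals such as tp are depths in m. A common first step is to convert them to mean rates. The sketch below assumes each hourly value is accumulated over the 3600 seconds ending at its time stamp, which is how ECMWF documents ERA5 hourly accumulations; please verify this against the data you load. The function names are illustrative and not part of this repository.

```python
# Illustrative unit conversions for accumulated ERA5 forecast fields.
# Assumption: one-hour accumulation window (3600 s) per value.
SECONDS_PER_HOUR = 3600.0


def accumulation_to_mean_flux(joules_per_m2: float,
                              accumulation_seconds: float = SECONDS_PER_HOUR) -> float:
    """Convert accumulated energy (J m^-2) to a mean flux (W m^-2)."""
    if accumulation_seconds <= 0:
        raise ValueError("accumulation_seconds must be positive")
    return joules_per_m2 / accumulation_seconds


def metres_to_millimetres(depth_m: float) -> float:
    """Convert a water depth in metres (e.g. tp) to millimetres."""
    return depth_m * 1000.0


# Made-up sample values, only to show the calls.
mean_ssrd_w_m2 = accumulation_to_mean_flux(1_800_000.0)
hourly_precip_mm = metres_to_millimetres(0.0025)
```

The same functions work elementwise on NumPy or xarray arrays, since they only use division and multiplication.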

Project roadmap

Updated on 2024-06-25

  1. Phase 0: Ingest raw ERA5
  2. Phase 1: Cloud-Optimize to Zarr, without data modifications
    1. Use Pangeo-Forge to convert the data from grib to Zarr.
    2. Create example notebooks for common workflows, including regridding and variable derivation.
  3. Phase 2: Produce an Analysis-Ready corpus
    1. Update GCP CPDs documentation.
    2. Create walkthrough notebooks.
  4. Phase 3: Automatic dataset updates, data is back-fillable.
  5. WIP Phase 4: Mirror ERA5 data in Google BigQuery.
  6. Phase 5: Derive a high-resolution version of ERA5
    1. Regrid datasets to lat/long grids.
    2. Convert model levels to pressure levels (at high resolution).
    3. Compute derived variables.
    4. Expand on example notebooks.

How to reproduce

All phases of this dataset can be reproduced with scripts found here. To run them, please clone the repo and install the project.

git clone https://github.com/google-research/arco-era5.git

Or, via SSH:

git clone git@github.com:google-research/arco-era5.git

Then, install with pip:

cd arco-era5
pip install -e .

Acquire & preprocess raw data

Please consult the instructions described in raw/.

Cloud-Optimization

All our tools make use of Apache Beam, and thus are portable to any cloud (or Runner). We use GCP's Dataflow to produce this dataset. If you would like to reproduce the project this way, we recommend the following:

  1. Ensure you have access to a GCP project with GCS read & write access, as well as full Dataflow permissions (see these "Before you begin" instructions).
  2. Export the following variables:
    export PROJECT=<your-gcp-project>
    export REGION=us-central1
    export BUCKET=<your-beam-runner-bucket>
    

From here, we provide examples of how to run the recipes at the top of each script.

pydoc src/single-levels-to-zarr.py
pydoc src/ar-to-zarr.py

You can also discover available command line options by invoking the script with -h/--help:

python src/model-levels-to-zarr.py --help

Automating dataset Updates in zarr and BigQuery

This feature works in four parts.

  1. Acquire raw data from CDS, facilitated by the weather-dl tool.
  2. Split the raw data using weather-sp.
  3. Ingest the split data into a Zarr file.
  4. [WIP] Ingest AR data into BigQuery with the assistance of weather-mv.

How to run

  1. Set up a Cloud project with sufficient permissions to use cloud storage (such as GCS) and a Beam runner (such as Dataflow).

    Note: Other cloud systems should work too, such as S3 and Elastic Map Reduce. However, these are untested. If you experience an error here, please let us know by filing an issue.

  2. Acquire one or more licenses from Copernicus.

  3. Add all Copernicus licenses to Secret Manager, each with a value like this: {"api_url": "URL", "api_key": "KEY"}

    NOTE: every API key must have its own unique secret.

  4. Update all of these variables in the Dockerfile.

    • PROJECT

    • REGION

    • BUCKET

    • MANIFEST_LOCATION

    • API_KEY_*

    • WEATHER_TOOLS_SDK_CONTAINER_IMAGE

    • ARCO_ERA5_SDK_CONTAINER_IMAGE

    • BQ_TABLES_LIST

    • REGION_LIST

    • If you have multiple API keys, name each variable API_KEY_*, where * is a number, e.g. 1, 2.
    • The value of API_KEY_* is the resource name of the Secret Manager secret, for example: projects/PROJECT_NAME/secrets/SECRET_KEY_NAME/versions/1
    • BQ_TABLES_LIST is the list of BigQuery tables into which data is ingested, for example: '["PROJECT.DATASET.TABLE1", "PROJECT.DATASET.TABLE2", ..., "PROJECT.DATASET.TABLE6"]'.
    • REGION_LIST is the list of GCP regions in which the ingestion jobs run, for example: '["us-east1", "us-west4",..., "us-west2"]'.
    • BQ_TABLES_LIST and REGION_LIST must each contain 6 entries, because the current pipeline processes 6 Zarr files. BigQuery ingestion follows the order of ZARR_FILES_LIST in raw-to-zarr-to-bq.py, so list the table names in BQ_TABLES_LIST in that same order.
    • WEATHER_TOOLS_SDK_CONTAINER_IMAGE is built from this Dockerfile and stored in a Docker registry.
    • ARCO_ERA5_SDK_CONTAINER_IMAGE is built from this Dockerfile and stored in a registry.
  5. Create the Docker image.

export PROJECT_ID=<your-project-here>
export REPO=<repo>  # e.g. arco-era5-raw-to-zarr-to-bq

gcloud builds submit . --tag "gcr.io/$PROJECT_ID/$REPO:latest" 
  6. Create a VM using the Docker image created above.
export ZONE=<zone>  # e.g. us-central1-a
export SERVICE_ACCOUNT=<service account> # Let's keep this as Compute Engine Default Service Account
export IMAGE_PATH=<container-image-path> # The above created image-path

gcloud compute instances create-with-container arco-era5-raw-to-zarr-to-bq \
--project=$PROJECT_ID \
--zone=$ZONE \
--machine-type=n2-standard-4 \
--network-interface=network-tier=PREMIUM,subnet=default \
--maintenance-policy=MIGRATE \
--provisioning-model=STANDARD \
--service-account=$SERVICE_ACCOUNT \
--scopes=https://www.googleapis.com/auth/cloud-platform \
--image=projects/cos-cloud/global/images/cos-stable-109-17800-0-45 \
--boot-disk-size=200GB \
--boot-disk-type=pd-balanced \
--boot-disk-device-name=arco-era5-raw-to-zarr-to-bq \
--container-image=$IMAGE_PATH \
--container-restart-policy=on-failure \
--container-tty \
--no-shielded-secure-boot \
--shielded-vtpm \
--shielded-integrity-monitoring \
--labels=goog-ec-src=vm_add-gcloud,container-vm=cos-stable-109-17800-0-45 \
--metadata-from-file=startup-script=start-up.sh
  7. Once the VM is created, the script runs on the 7th day of every month, which is the default set in the cron file. You can view the logs after connecting to the VM through SSH.

Logs are written to /var/log/cron.log. We suggest waiting 5-10 minutes after VM creation before connecting over SSH.
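
Before building the image, it can help to sanity-check the values described in steps 3 and 4. The sketch below is a hypothetical pre-flight check, not a script shipped with this repository: the function names are invented for illustration, and the rules it enforces (six entries per list, numeric `API_KEY_*` suffixes, one secret per key, the `api_url`/`api_key` payload) are taken from the text above.

```python
import json
import re

EXPECTED_ENTRIES = 6  # one per Zarr file processed by the pipeline, per the text above

# Shape of a Secret Manager resource name as shown in step 4.
SECRET_RESOURCE = re.compile(r"^projects/[^/]+/secrets/[^/]+/versions/[^/]+$")


def parse_list(raw: str, name: str) -> list:
    """Parse BQ_TABLES_LIST or REGION_LIST and check it has six string entries."""
    values = json.loads(raw)
    if not isinstance(values, list) or not all(isinstance(v, str) for v in values):
        raise ValueError(f"{name} must be a JSON list of strings")
    if len(values) != EXPECTED_ENTRIES:
        raise ValueError(f"{name} must have {EXPECTED_ENTRIES} entries, got {len(values)}")
    return values


def check_secret_payload(raw: str) -> dict:
    """Check that a secret value contains both api_url and api_key."""
    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError("secret value must be a JSON object")
    missing = {"api_url", "api_key"} - payload.keys()
    if missing:
        raise ValueError(f"secret value is missing {sorted(missing)}")
    return payload


def check_api_key_vars(env: dict) -> dict:
    """Return the API_KEY_<n> variables after validating their values."""
    keys = {k: v for k, v in env.items() if re.fullmatch(r"API_KEY_\d+", k)}
    for key, value in keys.items():
        if not SECRET_RESOURCE.match(value):
            raise ValueError(f"{key} is not a secret resource name: {value!r}")
    if len(set(keys.values())) != len(keys):
        raise ValueError("every API key must point at its own secret")
    return keys
```

In practice `env` would be `os.environ` or the values parsed from the Dockerfile; how those are loaded is left open here.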

Making the dataset "High Resolution" & beyond...

This phase of the project is under active development! If you would like to lend a hand in any way, please check out our contributing guide.

FAQs

How did you pick these variables?

This dataset originated in Loon, Alphabet’s project to deliver internet service using stratospheric balloons, and is now curated by Google Research & Google Cloud Platform. Loon’s Planning, Simulation and Control team needed accurate data on how the stratospheric winds have behaved in the past to evaluate the effectiveness of different balloon steering algorithms over a range of weather. This led us to download the model-level data. But Loon also needed information about the atmospheric radiation to model balloon gas temperatures, so we downloaded that. And then we downloaded the most commonly used meteorological variables to support different product planning needs (RF propagation models, etc)...

Eventually, we found ourselves with a comprehensive history of weather for the world.

Where are the U/V components of wind? Where is geopotential height? Why isn’t X variable in this dataset?

We intentionally did not include many variables that can be derived from other variables. For example, U/V components of wind can be computed from divergence and vorticity; geopotential is a vertical integral of temperature.

In the second phase of our roadmap (towards "Analysis Ready" data), we aim to compute all of these variables ourselves. If you’d like to make use of these parameters sooner, please check out our example notebooks where we demo common calculations. If you notice non-derived missing data, such as surface variables, please let us know of your needs by filing an issue, and we will be happy to incorporate them into our roadmap.
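
To give a rough sense of what "a vertical integral of temperature" means, the sketch below integrates the hydrostatic relation dΦ = -R_d T d(ln p) upward from the surface with the trapezoidal rule. It is a simplified illustration and not the method used in the example notebooks or by ECMWF: it uses dry-air temperature rather than virtual temperature, ignores the half-level structure of the model grid, and assumes the first level sits at the surface. The function name and sample values are made up.

```python
import math

R_DRY_AIR = 287.05  # gas constant for dry air, J kg^-1 K^-1
STANDARD_GRAVITY = 9.80665  # m s^-2, used to turn geopotential into height


def geopotential_profile(pressures_pa, temperatures_k, surface_geopotential=0.0):
    """Geopotential (m^2 s^-2) at each level, integrating upward from the surface.

    Levels must be ordered from the surface (highest pressure) to the top
    (lowest pressure). The first level is treated as the surface.
    """
    if len(pressures_pa) != len(temperatures_k):
        raise ValueError("pressures and temperatures must have the same length")
    phi = [surface_geopotential]
    for i in range(1, len(pressures_pa)):
        p_below, p_above = pressures_pa[i - 1], pressures_pa[i]
        if not 0 < p_above < p_below:
            raise ValueError("pressures must be positive and strictly decreasing")
        # Trapezoidal rule in ln(p): layer-mean temperature times the log-pressure thickness.
        layer_mean_t = 0.5 * (temperatures_k[i - 1] + temperatures_k[i])
        phi.append(phi[-1] + R_DRY_AIR * layer_mean_t * math.log(p_below / p_above))
    return phi


# Made-up three-level column, only to show the call.
profile = geopotential_profile([100000.0, 85000.0, 50000.0], [288.0, 280.0, 252.0])
heights_m = [phi / STANDARD_GRAVITY for phi in profile]
```

Dividing by standard gravity, as in the last line, gives geopotential height in metres.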

Do you have plans to get all of ERA5?

We aim to support hosting data that serves general meteorological use cases, rather than aim for total completeness. Wave variables are missing from this corpus, and are a priority on our roadmap. If there is a variable or dataset that you think should be included here, please file a GitHub issue.

For a complete ERA5 mirror, we recommend consulting with the Pangeo Forge project (especially staged-recipes#92).

Why are there two model-level datasets and not one?

It is certainly possible for all model-level data to be represented on one grid, and thus in one dataset. However, we opted to preserve the native representation for variables in ECMWF's models. A handful of core model variables (wind, temperature and surface pressure) are represented as spherical harmonic coefficients, while everything else is stored on a Gaussian grid. This avoids introducing numerical error by interpolating these variables to physical space. For a more in-depth review of this topic, please consult these references:

Please note: in a future release, we intend to create a dataset version where all model levels share one grid and one Zarr store.

Why doesn’t this project make use of Pangeo Forge Cloud?

We are big fans of the Pangeo Forge project, and of Pangeo in general. While this project does make use of their Recipes, we have a few reasons to not use their cloud. First, we would prefer to use internal rather than community resources for computations of this scale. In addition, there are several technical reasons why Pangeo Forge as it is today would not be able to handle this case (0, 1, 2, 3). To work around this, we opted to combine familiar-to-us infrastructure with Pangeo-Forge's core and to use the right tool for the right job.

Why use this dataset? What uses are there for the data?

ERA5 can be used in many applications. It can be used to train ML models that predict the impact of weather on different phenomena. ERA5 data could also be used to train and evaluate ML models that forecast the weather. The data could be used to compute climatologies, or the average weather for a region over a given period of time. ERA5 data can be used to visualize and study historical weather events, such as Hurricane Sandy.
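
As a small illustration of the climatology use case, the sketch below averages a time series by calendar month using only the standard library. It is a toy example with made-up values, and `monthly_climatology` is a hypothetical name, not part of this project.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def monthly_climatology(samples):
    """Average (datetime, value) pairs by calendar month.

    Returns a dict mapping month number (1-12) to the mean of all values
    observed in that month, across every year in the input.
    """
    by_month = defaultdict(list)
    for when, value in samples:
        by_month[when.month].append(value)
    return {month: fmean(values) for month, values in sorted(by_month.items())}


# Made-up 2 metre temperatures (K) for two Januaries and one July.
samples = [
    (datetime(2000, 1, 15), 280.0),
    (datetime(2001, 1, 15), 282.0),
    (datetime(2000, 7, 15), 300.0),
]
climatology = monthly_climatology(samples)
```

With xarray, the equivalent grouped mean is usually written as `ds.groupby('time.month').mean('time')`.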

Where should I be cautious? What are the limitations of the dataset?

Figure 1: ERA5 land fraction at coastal locations. Panels: Mumbai, India; San Francisco, USA; Tokyo, Japan; Singapore.
Figure 2: Topography around Mount Everest. Panels: ERA5 Topography; GMTED2010 Topography.

It is important to remember that a reanalysis is an estimate of what the weather was; it is not guaranteed to be error-free. There are several areas where the novice reanalysis user should be careful.

First, the user should be careful using reanalysis data at locations near coastlines. The first figure shows the fraction of land (1 for land, 0 for ocean) of ERA5 grid points at different coastal locations. This is important because the land-surface model used in ERA5 tries to blend in the influence of water with the influence of land based on this fraction. The most visible effect of this blending is that as the fraction of land decreases, the daily variation in temperature will also decrease. Looking at the first figure, there are sharp changes in the fraction of land between neighboring grid cells so there could be differences in daily temperature range that might not be reflected in actual weather observations.

The user should also be careful when using reanalysis data in areas with large variations in topography. The second figure is a plot of ERA5 topography around Mount Everest compared with GMTED2010 topography. The ERA5 topography is completely missing the high peaks of the Everest region and missing most of the structure of the mountain valleys. Topography strongly influences temperature and precipitation rate, so it is possible that ERA5’s temperature is too warm and ERA5’s precipitation patterns could be wrong as well.

ERA5’s precipitation variables aren’t directly constrained by any observations, so we strongly encourage the user to check ERA5 against observed precipitation (for example, Wu et al., 2022). A study comparing reanalyses (not including ERA5) against gridded precipitation observations showed striking differences between reanalyses and observations (Alexander et al., 2020, Environ. Res. Lett. 15, 055002).

Can I use the data for {research,commercial} purposes?

Yes, you can use our ERA5 data according to the terms of the Copernicus license.

Researchers, see the next section for how to cite this work.

Commercial users, please be sure to provide acknowledgement to the Copernicus Climate Change Service according to the Copernicus Licence terms.

How to cite this work

Please cite our presentation at the 22nd Conference on Artificial Intelligence for Environmental Science describing ARCO-ERA5.

Carver, Robert W, and Merose, Alex. (2023):
ARCO-ERA5: An Analysis-Ready Cloud-Optimized Reanalysis Dataset.
22nd Conf. on AI for Env. Science, Denver, CO, Amer. Meteo. Soc, 4A.1,
https://ams.confex.com/ams/103ANNUAL/meetingapp.cgi/Paper/415842

In addition, please cite the ERA5 dataset accordingly:

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., 
Muñoz‐Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., 
Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., 
Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., De Chiara, G., 
Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., 
Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., 
Hogan, R.J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., 
Lopez, P., Lupu, C., Radnoti, G., de Rosnay, P., Rozum, I., Vamborg, F.,
Villaume, S., Thépaut, J-N. (2017): Complete ERA5: Fifth generation of 
ECMWF atmospheric reanalyses of the global climate. Copernicus Climate 
Change Service (C3S) Data Store (CDS). (Accessed on DD-MM-YYYY)

Hersbach et al, (2017) was downloaded from the Copernicus Climate Change 
Service (C3S) Climate Data Store. We thank C3S for allowing us to 
redistribute the data.

The results contain modified Copernicus Climate Change Service 
information 2022. Neither the European Commission nor ECMWF is 
responsible for any use that may be made of the Copernicus information 
or data it contains.

License

This is not an official Google product.

Copyright 2022 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.