# NOAA GEFS analysis


_View the latest documentation at https://dynamical.org/catalog/noaa-gefs-analysis/ ._

| | |
|---|---|
| Spatial domain | Global |
| Spatial resolution | 0.25 degrees (~20km) |
| Time domain | 2000-01-01 00:00:00 UTC to Present |
| Time resolution | 3.0 hours |

### Data Access
```
https://data.dynamical.org/noaa/gefs/analysis/latest.zarr?email=optional@email.com
```

*\* The email is optional. Providing it as a query parameter helps us understand usage and impact, which keeps dynamical.org supported for the long term. Follow catalog updates [here](https://dynamical.org/updates).*

The Global Ensemble Forecast System (GEFS) is a National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Prediction (NCEP) weather forecast model.

This analysis dataset is an archive of the model's best estimate of past weather. It is created by concatenating the first few hours of each historical forecast to provide a dataset with dimensions time, latitude, and longitude.

This dataset is designed to be used in conjunction with the [GEFS forecast 35 day](https://dynamical.org/catalog/noaa-gefs-forecast-35-day) dataset.

Storage for this dataset is generously provided by [Source Cooperative](https://source.coop/), a [Radiant Earth](https://radiant.earth/) initiative.

## Dimensions

| | min | max | units |
|---|---|---|---|
| **latitude** | -90 | 90 | degrees_north |
| **longitude** | -180 | 179.75 | degrees_east |
| **time** | 2000-01-01T00:00:00 | Present | seconds since 1970-01-01 |

## Variables

| | units | dimensions |
|---|---|---|
| **categorical_freezing_rain_surface** <br>*Categorical freezing rain* | 0=no; 1=yes | time × latitude × longitude |
| **categorical_ice_pellets_surface** <br>*Categorical ice pellets* | 0=no; 1=yes | time × latitude × longitude |
| **categorical_rain_surface** <br>*Categorical rain* | 0=no; 1=yes | time × latitude × longitude |
| **categorical_snow_surface** <br>*Categorical snow* | 0=no; 1=yes | time × latitude × longitude |
| **downward_long_wave_radiation_flux_surface** <br>*Surface downward long-wave radiation flux* <br>*Average value in the last 6 hour period (00, 06, 12, 18 UTC) or 3 hour period (03, 09, 15, 21 UTC).* | W/(m^2) | time × latitude × longitude |
| **downward_short_wave_radiation_flux_surface** <br>*Surface downward short-wave radiation flux* <br>*Average value in the last 6 hour period (00, 06, 12, 18 UTC) or 3 hour period (03, 09, 15, 21 UTC).* | W/(m^2) | time × latitude × longitude |
| **geopotential_height_cloud_ceiling** <br>*Geopotential height* | gpm | time × latitude × longitude |
| **maximum_temperature_2m** <br>*Maximum temperature* | C | time × latitude × longitude |
| **minimum_temperature_2m** <br>*Minimum temperature* | C | time × latitude × longitude |
| **percent_frozen_precipitation_surface** <br>*Percent frozen precipitation* <br>*Contains the value -50 when there is no precipitation.* | % | time × latitude × longitude |
| **precipitable_water_atmosphere** <br>*Precipitable water* | kg/(m^2) | time × latitude × longitude |
| **precipitation_surface** <br>*Total Precipitation* <br>*Average precipitation rate since the previous forecast step.* | mm/s | time × latitude × longitude |
| **pressure_reduced_to_mean_sea_level** <br>*Pressure reduced to MSL* | Pa | time × latitude × longitude |
| **pressure_surface** <br>*Surface pressure* | Pa | time × latitude × longitude |
| **relative_humidity_2m** <br>*2 metre relative humidity* | % | time × latitude × longitude |
| **temperature_2m** <br>*2 metre temperature* | C | time × latitude × longitude |
| **total_cloud_cover_atmosphere** <br>*Total Cloud Cover* <br>*Average value in the last 6 hour period (00, 06, 12, 18 UTC) or 3 hour period (03, 09, 15, 21 UTC).* | % | time × latitude × longitude |
| **wind_u_100m** <br>*100 metre U wind component* | m/s | time × latitude × longitude |
| **wind_u_10m** <br>*10 metre U wind component* | m/s | time × latitude × longitude |
| **wind_v_100m** <br>*100 metre V wind component* | m/s | time × latitude × longitude |
| **wind_v_10m** <br>*10 metre V wind component* | m/s | time × latitude × longitude |

*Don't see what you're looking for? Let us know at [feedback@dynamical.org](mailto:feedback@dynamical.org).*

## Examples

* [Open notebook in GitHub](https://github.com/dynamical-org/notebooks/blob/main/noaa-gefs-analysis.ipynb)
* [Open notebook in Colab](https://colab.research.google.com/github/dynamical-org/notebooks/blob/main/noaa-gefs-analysis.ipynb)

```python
import xarray as xr  # xarray>=2025.1.2 and zarr>=3.0.4 for zarr v3 support

ds = xr.open_zarr("https://data.dynamical.org/noaa/gefs/analysis/latest.zarr?email=optional@email.com")
ds['temperature_2m'].sel(time="2025-01-01T00", latitude=0, longitude=0).compute()
```
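
Two notes in the variables table affect how values should be read: `precipitation_surface` is an average rate in mm/s, and `percent_frozen_precipitation_surface` holds -50 where there is no precipitation. The sketch below shows one way to handle both with NumPy. The helper names are hypothetical, and the 3-hour window is an assumption based on the dataset's time resolution.

```python
import numpy as np

SECONDS_PER_STEP = 3 * 3600  # assumption: each value covers one 3-hour step


def precipitation_rate_to_depth(rate_mm_per_s, step_seconds=SECONDS_PER_STEP):
    """Convert an average precipitation rate (mm/s) to a depth (mm) over one step."""
    return np.asarray(rate_mm_per_s, dtype=float) * step_seconds


def mask_no_precipitation(percent_frozen, sentinel=-50.0):
    """Replace the documented -50 'no precipitation' value with NaN."""
    values = np.asarray(percent_frozen, dtype=float)
    return np.where(values == sentinel, np.nan, values)


depth_mm = precipitation_rate_to_depth([0.0, 0.001])
frozen = mask_no_precipitation([-50.0, 0.0, 100.0])
```

Both helpers accept anything `np.asarray` can convert, so they can also be applied to the `.values` of a selection from `ds`.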

## Details

### Sources
To provide the longest possible historical record, this dataset is constructed from three distinct GEFS forecast archives.

- From 2000-01-01 to 2019-12-31 we use the [GEFS reforecast](https://registry.opendata.aws/noaa-gefs-reforecast/).
- From 2020-01-01 to 2020-09-23 we use [GEFS forecast archive](https://registry.opendata.aws/noaa-gefs/) data, which has a lower spatial and temporal resolution.
- From 2020-09-23 to Present we use [GEFS operational forecast archives](https://registry.opendata.aws/noaa-gefs/).

### Variable availability
Data is available for all variables at all times with the following exceptions.

- Unavailable before 2020-01-01:
  `relative_humidity_2m`,
  `percent_frozen_precipitation_surface`,
  `categorical_freezing_rain_surface`,
  `categorical_ice_pellets_surface`,
  `categorical_rain_surface`,
  `categorical_snow_surface`
- Unavailable 2020-01-01T00 to 2020-09-22T21:
  `geopotential_height_cloud_ceiling`
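
The exceptions above can be encoded as a small lookup to check before requesting data. This is a minimal sketch: `is_available` is a hypothetical helper, not part of any dynamical.org library, and it assumes the `geopotential_height_cloud_ceiling` gap includes both listed endpoints.

```python
from datetime import datetime

# Variables listed above as unavailable before 2020-01-01.
UNAVAILABLE_BEFORE_2020 = {
    "relative_humidity_2m",
    "percent_frozen_precipitation_surface",
    "categorical_freezing_rain_surface",
    "categorical_ice_pellets_surface",
    "categorical_rain_surface",
    "categorical_snow_surface",
}

# Assumption: the documented gap is inclusive at both ends.
CEILING_GAP = (datetime(2020, 1, 1, 0), datetime(2020, 9, 22, 21))


def is_available(variable, time):
    """Return True if `variable` is documented as available at `time` (UTC)."""
    if variable in UNAVAILABLE_BEFORE_2020:
        return time >= datetime(2020, 1, 1)
    if variable == "geopotential_height_cloud_ceiling":
        return not (CEILING_GAP[0] <= time <= CEILING_GAP[1])
    return True
```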

### Construction
To create a single time dimension, we concatenate the first few hours of each forecast. From 2000-01-01 to 2019-12-31, reforecasts are available once per day and this dataset uses the first 21 or 24 hours of each forecast. From 2020-01-01 to present, forecasts are available every 6 hours and this dataset uses the first 3 or 6 hours of each forecast. Variables with an instantaneous `step_type` use the shortest possible lead times (e.g. 0 and 3 hours), while accumulated variables must use one additional forecast step (e.g. 3 and 6 hours) because they do not have an hour 0 forecast value.
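
To make the lead time rule concrete, the sketch below maps a time in this dataset to the forecast initialization time and lead time that would supply it. It illustrates the rule described above and is not the reformatting code itself. `source_forecast` is a hypothetical helper, and it assumes forecasts are initialized on multiples of the initialization frequency counted from 00 UTC.

```python
from datetime import datetime, timedelta


def source_forecast(valid_time, init_frequency_hours, accumulated):
    """Return the (init_time, lead_time) assumed to supply `valid_time`.

    init_frequency_hours: 24 for the once-daily reforecasts, 6 afterwards.
    accumulated: True for variables without an hour 0 forecast value.
    """
    hours_into_cycle = valid_time.hour % init_frequency_hours
    init_time = valid_time - timedelta(hours=hours_into_cycle)
    lead_time = timedelta(hours=hours_into_cycle)
    if accumulated and hours_into_cycle == 0:
        # No hour 0 value exists, so use the last step of the previous forecast.
        init_time -= timedelta(hours=init_frequency_hours)
        lead_time = timedelta(hours=init_frequency_hours)
    return init_time, lead_time


# Instantaneous variable at 06 UTC: the 06 UTC forecast at lead 0.
instant = source_forecast(datetime(2024, 1, 1, 6), 6, accumulated=False)
# Accumulated variable at 06 UTC: the 00 UTC forecast at lead 6 hours.
accum = source_forecast(datetime(2024, 1, 1, 6), 6, accumulated=True)
```

With a 24-hour frequency the same function gives lead times of 0 to 21 hours for instantaneous variables and 3 to 24 hours for accumulated ones, matching the "first 21 or 24 hours" described above.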

### Interpolation
For most of the archive's time range the source data is available at 0.25-degree resolution with a 3-hourly time step, and we perform no interpolation. There are two exceptions:

1. From 2020-01-01 to 2020-09-23 the source data has a 1.0-degree spatial resolution and a 6-hourly time step.
2. From 2020-09-23 to present the 100m wind components have a 0.5-degree spatial resolution in the source data.

To provide a consistent archive in these two cases, we first perform bilinear interpolation in space to 0.25-degree resolution, followed by linear interpolation in time to a 3-hourly time step if necessary. The original, uninterpolated data can be obtained by selecting latitudes and longitudes evenly divisible by 1 and, in case 1, time steps whose hour is divisible by 6.
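
One way to pick out the uninterpolated points is to build boolean masks over the coordinate arrays. The sketch below uses synthetic coordinates that mirror the dimensions table; `on_source_grid` and `on_source_hours` are hypothetical helper names, and the default spacing of 1 degree follows the rule stated above.

```python
import numpy as np


def on_source_grid(coords, spacing=1.0):
    """True where a coordinate (degrees) is evenly divisible by `spacing`."""
    remainder = np.mod(np.asarray(coords, dtype=float), spacing)
    # Accept remainders near 0 or near `spacing` to tolerate float error.
    return np.isclose(remainder, 0.0) | np.isclose(remainder, spacing)


def on_source_hours(times, hours=6):
    """True where the UTC hour of a datetime64 value is divisible by `hours`."""
    hours_since_epoch = np.asarray(times).astype("datetime64[h]").astype("int64")
    return (hours_since_epoch % 24) % hours == 0


# Synthetic coordinates matching the 0.25-degree, 3-hourly layout.
latitude = np.arange(-90, 90.25, 0.25)
longitude = np.arange(-180, 180, 0.25)
time = np.arange(
    np.datetime64("2020-01-01T00"), np.datetime64("2020-01-02T00"), np.timedelta64(3, "h")
)

lat_mask = on_source_grid(latitude)
lon_mask = on_source_grid(longitude)
time_mask = on_source_hours(time)
```

With the real dataset, the same masks could be computed from `ds.latitude.values`, `ds.longitude.values`, and `ds.time.values`, then applied with integer positions, for example `ds.isel(latitude=np.flatnonzero(lat_mask))`.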

### Compression
The data values in this dataset have been rounded in their binary floating point representation to improve compression. See [Klöwer et al. 2021](https://www.nature.com/articles/s43588-021-00156-2) for more information on this approach. The exact number of rounded bits can be found in our [reformatting code](https://github.com/dynamical-org/reformatters/blob/main/src/reformatters/noaa/gefs/common_gefs_template_config.py).
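
For intuition, the sketch below rounds `float32` values to a fixed number of mantissa bits, using round to nearest with ties to even. It illustrates the general technique and is not the code used to produce this dataset. `bitround` is a hypothetical helper, and `keepbits=7` is an arbitrary illustrative value rather than the setting used for any variable here.

```python
import numpy as np


def bitround(values, keepbits):
    """Round float32 values to `keepbits` mantissa bits (nearest, ties to even)."""
    if not 0 < keepbits < 23:
        raise ValueError("keepbits must be between 1 and 22 for float32")
    bits = np.asarray(values, dtype=np.float32).view(np.uint32)
    drop = 23 - keepbits  # float32 stores 23 explicit mantissa bits
    keep_mask = np.uint32((0xFFFFFFFF >> drop) << drop)
    just_under_half = np.uint32((1 << (drop - 1)) - 1)
    # Adding the lowest kept bit plus just under half a unit rounds ties to even.
    lowest_kept_bit = (bits >> np.uint32(drop)) & np.uint32(1)
    rounded = (bits + lowest_kept_bit + just_under_half) & keep_mask
    return rounded.view(np.float32)


rounded = bitround([1.0, 3.14159265, -50.0], keepbits=7)
```

The trailing mantissa bits of every rounded value are zero, which is what lets a general-purpose compressor shrink the data further.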
