'CDC' (now a part of PSD) netCDF Conventions: Gridded Data
The following format description indicates the minimum requirements for creating netCDF files using the CDC netCDF standard format.
The conventions in this document are compatible with, but more restrictive than, the standards developed jointly by the institutions participating in the now-defunct NOAA Cooperative Ocean-Atmosphere Research Data Service (COARDS). This basic format may be enhanced with additional dimensions, variables, and attributes as long as the standard elements are included.
- FILE NAME
- All netCDF files must have '.nc' as the final suffix of the file name.
- DIMENSIONS/COORDINATE VARIABLES
- One or more dimensions can be used with each data variable.
There are four standard dimensions: time, level, lat, lon. In
data variable definitions, these dimensions must be used in this order
(if present) as they appear in CDL. In Fortran, this order is
reversed in function calls.
When "extra" dimensions are used, such as with model runs, they should appear to the left of the standard dimensions in a variable definition (in CDL order). The dimension names should begin with a letter and be composed of letters, digits, and underscores.
time is defined as the Unlimited (or record) dimension, except in those cases where "extra" dimensions are used.
Coordinate variables that correspond to the dimensions must have the same names as the dimensions. Coordinate values of a coordinate variable must be either monotonically increasing or monotonically decreasing. However, the coordinate values need not be evenly spaced. Missing values are not allowed in coordinate variables.
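The coordinate-value rules above can be checked mechanically. The following is a minimal sketch in plain Python, not part of the CDC tooling. The function name `is_valid_coordinate` is hypothetical, "monotonic" is read here as strictly monotonic, and NaN is used as a stand-in for a missing value; both readings are assumptions.

```python
import math


def is_valid_coordinate(values):
    """Return True if values are strictly monotonic and contain no NaN.

    Assumptions: strict monotonicity is required, and a missing value
    shows up as NaN once the coordinate has been read into Python.
    """
    vals = list(values)
    # Missing values are not allowed in coordinate variables.
    if any(isinstance(v, float) and math.isnan(v) for v in vals):
        return False
    pairs = list(zip(vals, vals[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    # Uneven spacing is fine; only the direction has to be consistent.
    return increasing or decreasing


print(is_valid_coordinate([90.0, 87.5, 85.0]))   # decreasing latitudes
print(is_valid_coordinate([1000, 925, 850, 700]))  # unevenly spaced levels
print(is_valid_coordinate([0.0, 5.0, 5.0]))      # repeated value
```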
- DATA VARIABLES
- One or more per file.
Variable names should begin with a letter and be composed of letters, digits, and underscores. The data type should be byte, short, long, float, or double.
PSD has standard variable abbreviations for most climate-related variables that should be used where possible.
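The naming rule for variables (and for dimensions) and the list of allowed data types can be expressed as a short check. This is an illustrative sketch only; `is_valid_name` and `is_valid_type` are hypothetical helper names, and the rule is read as ASCII letters, digits, and underscores.

```python
import re

# A letter first, then any mix of letters, digits, and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Data types listed in the convention for data variables.
ALLOWED_TYPES = {"byte", "short", "long", "float", "double"}


def is_valid_name(name):
    """Return True if name follows the variable/dimension naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


def is_valid_type(type_name):
    """Return True if type_name is one of the listed data types."""
    return type_name in ALLOWED_TYPES


print(is_valid_name("air_2m"))   # starts with a letter
print(is_valid_name("2m_air"))   # starts with a digit
print(is_valid_type("short"))
```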
- The type for attributes is character except as noted.
Data variable attributes:
- Dictionary attributes (dataset, var_desc, level_desc, statistic,
parent_stat) -- for use by CRDtools dictionary browse application.
Select from list of valid
attributes available, or use "Other", "-". If these are not
specified, the default is "Other".
- valid_range -- expected "reasonable" range for variable. Same
type as unpacked values.
- actual_range -- actual data range for variable. Same type
as unpacked values.
- least_significant_digit -- power of ten of the smallest decimal
place in unpacked data that is a reliable value. Type is short.
- precision -- number of places to right of decimal point that are
significant, based on packing used. Type is short.
- units -- units the variable is recorded in. Where
possible, the units should follow the Unidata udunits standard.
- missing_value -- the value that signifies grid points
for which there is no data available. This value should
be outside of the valid_range of the data and should not
equal the netCDF standard initial data value for the data
type (nor should it equal _FillValue if used).
missing_value has the (possibly packed) data value type.
- long_name -- a long descriptive name.
This could be used for labelling plots, for example.
If a variable has no long_name attribute, the variable
name will be used as a default.
- add_offset -- If present for a variable, this number is to be added
to the data after it is read by the application that
accesses the data. add_offset has the unpacked value
data type. Where the data is not packed, add_offset = 0.
If add_offset is omitted, the default is 0.
- scale_factor -- If present for a variable, the data are to be
multiplied by this factor after the data are read by
the application that accesses the data. scale_factor
has the unpacked value data type. Where the data
is not packed, scale_factor = 1.
If scale_factor is omitted, the default is 1.
The attributes scale_factor and add_offset can be used together to provide simple data compression to store low-resolution floating-point data as small integers in a netCDF file.
The unpacking algorithm is:
unpacked value = add_offset + ((packed value) * scale_factor)
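The unpacking formula above can be sketched in a few lines of Python. The defaults (scale_factor = 1, add_offset = 0) come from the attribute descriptions. The `pack` function is a hypothetical inverse added for illustration, and comparing against missing_value before scaling is an assumption based on missing_value being stored in packed form. The packing parameters in the example are invented.

```python
def unpack(packed, scale_factor=1.0, add_offset=0.0, missing_value=None):
    """Apply unpacked = add_offset + packed * scale_factor.

    Assumption: the missing-value flag is compared against the stored
    (packed) number, before any scaling is applied.
    """
    if missing_value is not None and packed == missing_value:
        return None
    return add_offset + packed * scale_factor


def pack(value, scale_factor=1.0, add_offset=0.0):
    """Hypothetical inverse of unpack(), rounding to the nearest integer."""
    return int(round((value - add_offset) / scale_factor))


# Invented packing parameters for a temperature stored as short integers.
SCALE, OFFSET, MISSING = 0.01, 273.15, 32766

print(unpack(100, SCALE, OFFSET))
print(unpack(MISSING, SCALE, OFFSET, MISSING))
print(pack(274.15, SCALE, OFFSET))
```

Because the packed values are integers, the resolution of the unpacked data is limited to one scale_factor step.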
name -- "time"
type -- double
- long_name -- "Time"
- units -- a character string formatted as recommended in the Unidata
udunits package. The string contains multiple parts:
- a time unit -- The valid units for time are listed in the Unidata udunits standard. The most commonly used of these strings (and their abbreviations) include day (d), hour (hr, h), minute (min), second (sec, s), and year (yr). A year is defined as being exactly 3.1536e7 seconds, or 365 days (i.e., no leap years). Plural forms are also acceptable.
- the string "since"
- a base date in the form "year-month-day"
- an optional base time in the form "hours:minutes:seconds"
- an optional base time zone offset from GMT
"hours since 1900-01-01 06:00:00 -6:00"
indicates the number of hours since January 1st, 1900 at 6:00 in the morning in the Mountain Daylight Time zone.
NOTE: The normally used CDC base date/time is: "0001-01-01 00:00:00".
Time coordinate variables representing climatological time (an axis of 12 months, 4 seasons, etc. that is located in no particular year) should be encoded like other time axes but with the added restriction that they be encoded to begin in the year 0000.
NOTE: There are udunits functions that can interpret and manipulate the time units string.
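As an illustration of how the parts of the units string fit together, here is a minimal sketch using only the Python standard library. It is not the udunits implementation. The names `parse_time_units` and `decode_time` are hypothetical, and the regular expression is one assumed reading of the format described above. Python's `datetime` uses the proleptic Gregorian calendar and cannot represent year 0000, so this sketch is only suitable for modern base dates, not for climatological axes or the "0001-01-01" base.

```python
import re
from datetime import datetime, timedelta, timezone

# Seconds per unit; the year length (365 days) is taken from the text above.
_UNIT_SECONDS = {
    "day": 86400.0, "d": 86400.0,
    "hour": 3600.0, "hr": 3600.0, "h": 3600.0,
    "minute": 60.0, "min": 60.0,
    "second": 1.0, "sec": 1.0, "s": 1.0,
    "year": 3.1536e7, "yr": 3.1536e7,
}

_UNITS_RE = re.compile(
    r"^\s*(?P<unit>[A-Za-z]+)\s+since\s+"
    r"(?P<y>\d{1,4})-(?P<m>\d{1,2})-(?P<d>\d{1,2})"
    r"(?:\s+(?P<H>\d{1,2}):(?P<M>\d{1,2})(?::(?P<S>\d{1,2}(?:\.\d+)?))?)?"
    r"(?:\s+(?P<sign>[+-]?)(?P<oh>\d{1,2}):(?P<om>\d{2}))?\s*$"
)


def parse_time_units(units):
    """Return (seconds_per_unit, timezone-aware base datetime)."""
    m = _UNITS_RE.match(units)
    if m is None:
        raise ValueError(f"unrecognized time units: {units!r}")
    key = m.group("unit").lower()
    if key not in _UNIT_SECONDS and key.endswith("s"):
        key = key[:-1]  # accept plural forms such as "hours"
    if key not in _UNIT_SECONDS:
        raise ValueError(f"unknown time unit: {m.group('unit')!r}")
    offset = timedelta(0)  # no zone offset given: treat the base as GMT
    if m.group("oh") is not None:
        offset = timedelta(hours=int(m.group("oh")), minutes=int(m.group("om")))
        if m.group("sign") == "-":
            offset = -offset
    base = datetime(
        int(m.group("y")), int(m.group("m")), int(m.group("d")),
        int(m.group("H") or 0), int(m.group("M") or 0),
        tzinfo=timezone(offset),
    ) + timedelta(seconds=float(m.group("S") or 0))
    return _UNIT_SECONDS[key], base


def decode_time(value, units):
    """Convert one time coordinate value to an aware datetime."""
    seconds_per_unit, base = parse_time_units(units)
    return base + timedelta(seconds=value * seconds_per_unit)


units = "hours since 1900-01-01 06:00:00 -6:00"
print(decode_time(24, units).astimezone(timezone.utc))
```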
- actual_range -- start and end times in the same time units
and base as in the units attribute. Type is double.
- delta_t -- The amount of time between time coordinate values,
in the format "yyyy-mm-dd hh:mm:ss". Smaller (unused)
time elements are zero-filled (e.g., if the delta_t
is one month, "0000-01-00 00:00:00" signifies one month
between time values).
If there is no regular time increment, all the elements should be zero-filled. If delta_t is omitted, no regular time increment is implied.
- avg_period -- Required only for time-averaged data. The
period of time over which the data was averaged,
in the format "yyyy-mm-dd hh:mm:ss". Smaller (unused) time
elements are zero-filled (e.g., if the averaging period is
one month, "0000-01-00 00:00:00" signifies that each value
is an average of one month's values).
- prev_avg_period -- Required only for time-averaged data.
The average period represented in the source
variable before taking the average. Format
is "yyyy-mm-dd hh:mm:ss". Smaller (unused) time
elements are zero-filled.
- ltm_range -- Required only for time-averaged data. The
begin and end values of the time period used to create
the averaged data using the same time units and base
as in the units attribute. Type is double.
- subset_begin, subset_end -- Required only for
time-averaged data not averaged over a full time
unit. The portion of a time unit actually used.
Format is "yyyy-mm-dd hh:mm:ss". Smaller (unused)
time elements are zero-filled.
For example, if a long term mean was created using only the months March through June, subset_begin would be "0000-03-00 00:00:00" and subset_end would be "0000-06-00 00:00:00".
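The delta_t, avg_period, prev_avg_period, subset_begin, and subset_end attributes share the "yyyy-mm-dd hh:mm:ss" layout, but values such as "0000-01-00 00:00:00" are not valid calendar dates, so an ordinary date parser will reject them. The sketch below splits the string into integer fields instead. `Interval`, `parse_interval`, and `has_regular_increment` are hypothetical names, not part of any CDC library.

```python
import re
from typing import NamedTuple


class Interval(NamedTuple):
    """Zero-filled time fields of a delta_t-style attribute."""
    years: int
    months: int
    days: int
    hours: int
    minutes: int
    seconds: int


_INTERVAL_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})")


def parse_interval(text):
    """Parse a "yyyy-mm-dd hh:mm:ss" attribute into an Interval."""
    m = _INTERVAL_RE.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"not in yyyy-mm-dd hh:mm:ss form: {text!r}")
    return Interval(*(int(g) for g in m.groups()))


def has_regular_increment(interval):
    """All-zero fields mean there is no regular time increment."""
    return any(interval)


print(parse_interval("0000-01-00 00:00:00"))                          # one month
print(has_regular_increment(parse_interval("0000-00-00 00:00:00")))  # all zeros
print(parse_interval("0000-03-00 00:00:00").months)                   # subset_begin for March
```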
name -- "level"
type -- float
- long_name -- Standard choices are: "Level" for pressure
levels, "Sigma" for sigma levels, "Isentropic" for theta levels and
"Depth" for depth below a datum (usually sea level). If the level is
not one of these, an arbitrary name may be chosen.
- units -- corresponds to long_name choices above.
Standard choices are: "millibar", "sigma_level", "degree_K", and
"meter". If the level does not use one of these, the units should
- actual_range -- level range in same units as the units
attribute. Type is float.
- positive -- indicates the direction of positive for the
level axis. Valid values are "up" and "down". Normally, pressure
levels, sigma levels, and depths are "down", theta levels and heights
are "up". The positive attribute is not required for pressure
levels, where it defaults to "down".
name -- "lat"
type -- float
- long_name -- "Latitude"
- units -- "degrees_north"
- actual_range --
latitude range in degrees. The range values are used to indicate
order of storage (e.g., 90,-90 would indicate the latitudes started
with 90 and ended with -90). Type is float.
name -- "lon"
type -- float
- long_name -- "Longitude"
- units -- "degrees_east"
- actual_range -- longitude range in degrees. The range
values are used to indicate order of storage (e.g., 0,360 would
indicate the longitudes started with 0 and ended with 360). Type is float.
Longitudes may be represented modulo 360, meaning that -180 and 180 are both valid representations of the International Dateline and 0 and 360 are both valid representations of the Prime Meridian. Note, however, that the sequence of numerical longitude values stored in the netCDF file must be monotonic (in a non-modulo sense).
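To illustrate the non-modulo monotonicity requirement: a grid that crosses the Prime Meridian and is written as 350, 355, 0, 5 is not monotonic, even though each value is a valid longitude. One possible way to repair such a sequence is to add 360 after each wrap, as sketched below. `unwrap_longitudes` is a hypothetical helper, and it assumes the values are stored west to east.

```python
def unwrap_longitudes(lons):
    """Return longitudes shifted by multiples of 360 so they increase.

    Assumption: values are stored west to east, so any decrease marks a
    wrap past 360 (or past 180 for a -180..180 representation).
    """
    result = []
    shift = 0.0
    previous = None
    for lon in lons:
        value = lon + shift
        if previous is not None and value <= previous:
            # The sequence wrapped around; move this and later values up.
            shift += 360.0
            value += 360.0
        result.append(value)
        previous = value
    return result


print(unwrap_longitudes([350, 355, 0, 5]))
print(unwrap_longitudes([170, 175, -180, -175]))
```

Subtracting 360 from the leading values instead (giving -10, -5, 0, 5) would be an equally valid monotonic representation.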
- title -- data set title (not specific to any one variable).
- history -- brief description on multiple lines of the procedures
used to generate the file. The description should include:
- an attribution for CDC and the initials (in reverse order if privacy is desired) of the individual that generated the file.
- the source dataset and name(s) or range of the input file(s)
- the date the file was generated
- the name of the software package or a description of the custom source code used to create the file.
Here's an example history attribute:
Created by NOAA-CIRES Climate Diagnostics Center Data Management Group (SAC, email@example.com) from the NCEP Reanalysis data set on 1997/07/03 by ltmmaker using /Datasets/ncep.reanalysis/surface_gauss/air.sfc.73.nc thru /Datasets/ncep.reanalysis/surface_gauss/air.sfc.96.nc
When a file is updated, the original history attribute contents should be retained and prefixed with information about the changes, including:
- an attribution for CDC and the initials (in reverse order if privacy is desired) of the individual that updated the file, if the update procedure isn't part of an automated process
- update date
- the name of the software package or a description of the custom source code used to update the file
- source(s) of new data
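The update rule (keep the original history text and prefix it with the new information) might be sketched as follows. The function name `prepend_history`, the wording of the generated entry, and the example values (date, file name) are all hypothetical; the convention only lists which items the entry should contain.

```python
def prepend_history(existing, who, date, software, sources):
    """Return a history string with a new update entry placed first.

    The entry wording is illustrative; only the listed items
    (attribution, date, software, data sources) come from the convention.
    """
    entry = f"Updated by {who} on {date} by {software} using {', '.join(sources)}"
    if not existing:
        return entry
    # The original contents are retained unchanged after the new entry.
    return entry + "\n" + existing


original = "Created by NOAA-CIRES Climate Diagnostics Center on 1997/07/03 by ltmmaker"
updated = prepend_history(
    original,
    who="NOAA-CIRES CDC (SAC)",
    date="1998/01/15",              # invented date
    software="ltmmaker",
    sources=["air.sfc.97.nc"],      # invented file name
)
print(updated)
```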