ObsPack Data Product
Each ObsPack data product includes:
- a unique product name
- prepared data sets
- a product summary
- an e-mail distribution list of all data providers
- a complete set of configuration files used by NOAA to construct the data product
Each ObsPack data product has a unique product name using the following structure.
obspack_<trace gas identifier>_<preparation lab number>_<product name>_<product version number>_<preparation date>
The version numbering scheme is major.minor[.minor] where a major release is indicated by the first number in the sequence and minor revisions are indicated by the second and third (optional) numbers in the sequence. Below are a few examples.
obspack_co2_1_PROTOTYPE_v1.0.0_2012-11-06 (first major release of PROTOTYPE data product)
obspack_co2_1_PROTOTYPE_v0.9.3_2012-07-26 (minor revision to PROTOTYPE beta-version)
obspack_co2_1_PROTOTYPE_v0.9.2_2012-07-26 (minor revision to PROTOTYPE beta-version)
obspack_co2_1_PROTOTYPE_v0.9.1_2012-07-24 (minor revision to PROTOTYPE beta-version)
obspack_co2_1_PROTOTYPE_v0.9_2012-07-23 (first major release of PROTOTYPE beta-version)
obspack_co2_1_GLOBALVIEW-CO2_v1.0_2012-07-25 (first major release of GLOBALVIEW-CO2 using ObsPack framework)
Please note: The latest minor revision of a major release includes all changes made in any intermediate minor revisions. A considerable number of minor revisions can be expected while the ObsPack framework is being developed. Once the framework has been thoroughly vetted, the number of minor revisions should be greatly reduced.
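The naming structure and version scheme described above can be illustrated with a minimal Python sketch. It is not part of the ObsPack software: the function name `parse_product_name`, the dictionary keys, and the exact regular expression are assumptions made for illustration, and the pattern assumes that the trace gas identifier contains no underscore.

```python
import re
from datetime import date

# Pattern follows the documented structure; the character classes are assumptions.
_NAME_RE = re.compile(
    r"^obspack_(?P<gas>[^_]+)_(?P<lab>\d+)_(?P<product>.+)"
    r"_v(?P<version>\d+\.\d+(?:\.\d+)?)_(?P<prepared>\d{4}-\d{2}-\d{2})$"
)


def parse_product_name(name):
    """Split an ObsPack product name into its documented components."""
    m = _NAME_RE.match(name)
    if m is None:
        raise ValueError(f"not an ObsPack product name: {name!r}")
    return {
        "trace_gas": m["gas"],
        "lab_number": int(m["lab"]),
        "product_name": m["product"],
        # major.minor[.minor] becomes a tuple so versions compare numerically
        "version": tuple(int(part) for part in m["version"].split(".")),
        "preparation_date": date.fromisoformat(m["prepared"]),
    }


# Product names taken from the examples above.
names = [
    "obspack_co2_1_PROTOTYPE_v0.9.1_2012-07-24",
    "obspack_co2_1_PROTOTYPE_v1.0.0_2012-11-06",
    "obspack_co2_1_PROTOTYPE_v0.9_2012-07-23",
]
latest = max(names, key=lambda n: parse_product_name(n)["version"])
print(latest)
```

Comparing versions as integer tuples rather than strings keeps a two-part version such as 0.9 ordered before 0.9.1 and before 1.0.0.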
Prepared Data Sets
An ObsPack data set is 1) a collection of measurements for a single trace gas species, 2) derived from a single laboratory-project, and 3) prepared according to a set of instructions. A set of instructions, specific to each data set, configures the ObsPack software to subset data, average data, or pass data through without alteration. Applying multiple instruction sets to a given measurement record creates multiple unique data sets. For example, the NOAA quasi-continuous CO2 measurement record from the 396 magl (meters above ground level) intake height at the Wisconsin tall tower site (LEF) could be split into two data sets: one consisting of average values of afternoon measurements only, and a second consisting of average values of nighttime measurements only. How data are prepared depends on the intended use of the data product.
Data sets are presented as individual files. File names are unique and include the trace gas species identifier, 3-letter site or project code, measurement project, laboratory identification number, a data selection tag, and a file type extension, e.g., "nc" (netCDF4) or "txt" (ASCII text). The file name structure is as follows.
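As a rough illustration of the subset-and-average idea, the following Python sketch computes one average per day from measurements that fall inside a local-time window. It is not the ObsPack implementation: the function name `daily_window_average`, the 12:00 to 16:00 definition of "afternoon", and the sample values are all assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev


def daily_window_average(samples, start_hour, end_hour):
    """Average (local_time, value) samples per day within [start_hour, end_hour)."""
    by_day = defaultdict(list)
    for when, value in samples:
        if start_hour <= when.hour < end_hour:
            by_day[when.date()].append(value)
    result = {}
    for day, values in sorted(by_day.items()):
        # A standard deviation is only defined for more than one measurement.
        spread = stdev(values) if len(values) > 1 else None
        result[day] = (mean(values), spread, len(values))
    return result


# Hypothetical measurements (local time, mole fraction); not real data.
samples = [
    (datetime(2012, 7, 1, 2, 0), 400.0),
    (datetime(2012, 7, 1, 13, 0), 390.0),
    (datetime(2012, 7, 1, 14, 0), 392.0),
    (datetime(2012, 7, 2, 15, 0), 391.5),
]
afternoon = daily_window_average(samples, 12, 16)
for day, (avg, spread, n) in afternoon.items():
    print(day, avg, spread, n)
```

Each result carries the mean, a spread estimate, and the number of contributing measurements. A nighttime window that crosses midnight would need extra handling that this sketch omits.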
<trace gas identifier>_<site code>_<project>_<lab number>_<selection tag>.<filetype extension>
Below are a few examples.
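The file name structure can also be shown programmatically. The Python sketch below splits a name into its documented fields. The function name `parse_dataset_filename` and the sample file name are hypothetical; the sample is assembled from values mentioned elsewhere on this page and is not taken from a real product. The sketch assumes that only the selection tag may contain additional underscores.

```python
def parse_dataset_filename(filename):
    """Split an ObsPack data set file name into its documented fields."""
    stem, _, extension = filename.rpartition(".")
    # Split on the first four underscores; the remainder is the selection tag.
    parts = stem.split("_", 4)
    if not stem or len(parts) != 5:
        raise ValueError(f"unexpected file name structure: {filename!r}")
    gas, site, project, lab, tag = parts
    return {
        "trace_gas": gas,
        "site_code": site,
        "project": project,
        "lab_number": int(lab),
        "selection_tag": tag,
        "filetype": extension,
    }


# Hypothetical file name, constructed only for illustration.
fields = parse_dataset_filename("co2_lef_tower-insitu_1_afternoon.nc")
print(fields)
```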
Each data set includes comprehensive metadata describing the sampling location, sampling strategy, preparation strategy, and contact information for the contributing laboratory and data providers. Each data set also includes URL links to images of a world map highlighting the site location, the contributing lab's logo (where available), and the country flag. These metadata provide users with all the information required to give proper attribution when displaying data from an ObsPack product. Figure 1 is constructed entirely from data and metadata extracted from a single data set.
Figure 1: Data and metadata for this graph come from a single ObsPack data file.
Inside the Data File
Each data file includes a single prepared data set and associated metadata. Each data item in a data set includes the sample collection time, position, reported mole fraction or isotope ratio, estimated uncertainty (when available), the number (n) of individual measurements contributing to the reported value, and a unique ID that distinguishes the item from all other data items in the product. Metadata are presented as global attributes that describe general features of the data set and variable attributes that describe characteristics of the variables associated with each data item. Tables 1 and 2 describe global and variable attributes included in a typical ObsPack netCDF data file.
Table 1: Global attributes in a typical ObsPack netCDF data file.
|site_code||3-letter site code as defined by GAWSIS. (e.g., LEF)|
|site_name||Standard site name (e.g., Park Falls, Wisconsin)|
|site_country,site_country_flag||Country in which site is located and link to image of flag|
|site_longitude||Longitude (decimal degree) at representative site location|
|site_latitude||Latitude (decimal degree) at representative site location|
|site_elevation||Ground or surface elevation at representative site location|
|site_elevation_unit||site_elevation is reported in meters above sea level (masl)|
|site_map||URL link to world map highlighting site location (file type is png)|
|site_utc2lst||Hour conversion from UTC to LST|
|site_url||URL link to site web page (optional)|
|site_comments||Additional relevant site information (optional)|
|dataset_num||Integer that uniquely identifies the data set in the ObsPack data product|
|dataset_name||Character string that uniquely identifies the data set in the ObsPack data product. Data set names are discussed under Prepared Data Sets.|
|dataset_globalview_prefix||Character string of equivalent GLOBALVIEW file name prefix (see GLOBALVIEW for details).|
|dataset_parameter||Identifies trace gas species included in data set (e.g., co2, c13co2)|
|dataset_process||String description of ObsPack data preparation (e.g., PassThru, TimeStepAverage)|
|dataset_project||Typically identifies sampling platform and strategy (e.g., surface-flask, tower-insitu, aircraft-pfp)|
|dataset_db||Boolean T/F. Indicates source data are from NOAA operational database (internal use only).|
|dataset_archive_dir||Source data archive directory (internal use only).|
|dataset_archive_file||Source data file or file filter (internal use only).|
|dataset_intake_ht||This attribute is set when it is necessary to subset source data by sample intake height (internal use only).|
|dataset_intake_ht_unit||dataset_intake_ht is reported in meters above ground level (magl) (internal use only).|
|dataset_time_window_utc||Attribute set when necessary to subset source data by sample collection time (UTC) (internal use only).|
|dataset_time_window_lst||Attribute set when necessary to subset source data by sample collection time (LST) (internal use only).|
|dataset_parse_function||Python module used to read source data (internal use only).|
|dataset_data_frequency||Measurement frequency of source data.|
|dataset_data_frequency_unit||Indicates the time unit of the dataset_data_frequency attribute.|
|dataset_platform||Fixed or Mobile.|
|dataset_start_date||Date of first item in data set (ISO 8601 format).|
|dataset_stop_date||Date of last item in data set (ISO 8601 format).|
|dataset_selection||Brief description of how data have been selected by data contributor or prepared by NOAA.|
|dataset_selection_tag||Short descriptor to help convey how data have been selected by data contributor or prepared by NOAA. The selection tag is included in the data set name.|
|dataset_calibration_scale||Measurements are relative to reported calibration scale.|
|dataset_fair_use||This is the ObsPack fair use statement agreed upon by data providers.|
|dataset_reference_number||Number indicating how many references to published literature to expect in this file.|
|dataset_reference_#_name||Reference provided by data contributor. # is a number from 1 to dataset_reference_number.|
|lab_num||Laboratory identification number. See Lab Table.|
|lab_abbr||Laboratory abbreviation or acronym (e.g., CONTRAIL, UHEI-IUP)|
|lab_ongoing_atmospheric_air_comparison||If "yes", lab participates in at least one ongoing direct atmospheric air comparison experiment.|
|lab_comparison_activity||Brief description of measurement comparison activities (optional).|
|program_abbr [ _name, _address, _country, _country_flag, _url, _logo ]||Providers may make a distinction between the measurement lab and an over-arching research program (e.g., NACP, ICOS).|
|provider_number||Number of providers listed in the file.|
|obspack_contact_name [ _lab, _email ]||Contact information of ObsPack preparer.|
|obspack_data_time_step||Time interval at which ObsPack data are presented (e.g., day, hour).|
|obspack_name||Unique ObsPack identification string. Structure is obspack_<parameter>_<preparation/distribution lab number>_<product name>_<version number>_<preparation date> (e.g., obspack_co2_1_PROTOTYPE_v0.9.1_2012-07-20).|
|obspack_description||Brief description of data product contents.|
|obspack_version||ObsPack software version number.|
|obspack_creation_date||Date when the ObsPack data product was prepared.|
|obspack_citation||Required ObsPack citation. This citation is in addition to the requirements of the ObsPack Fair Use statements.|
|obspack_fair_use||These cooperative data products are made freely available to the scientific community and are intended to stimulate and support carbon cycle modeling studies. We rely on the ethics and integrity of the user to assure that each contributing national and university laboratory receives fair credit for their work. Fair credit will depend on the nature of the work and the requirements of the institutions involved. Your use of an ObsPack data product implies an agreement to contact each contributing laboratory to discuss the nature of the work and the appropriate level of acknowledgement. If an ObsPack data product is essential to the work, or if an important result or conclusion depends on an ObsPack product, co-authorship may be appropriate. This should be discussed with each data provider at an early stage in the work. Contacting the data providers is not optional; if you use an ObsPack data product, you must contact the data providers. To help you meet your obligation, each data product includes an e-mail distribution list of all data providers. ObsPack data products must be obtained directly from the ObsPack Data Portal at www.esrl.noaa.gov/gmd/ccgg/obspack/ and may not be re-distributed.
Beginning November 2013, all new ObsPack data products will have a unique Digital Object Identifier (DOI) registered with the International DOI Foundation. In addition to the conditions of fair use as stated above, users must also include the ObsPack product citation in any publication or presentation using the product. The required citation is included in every data product and in the automated e-mail sent to the user during product download.
Beginning November 2013, there are no longer any exceptions to this policy; it applies to all ObsPack products including GLOBALVIEW.
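The numbered reference attributes in the table above can be gathered with a short loop. The Python sketch below uses a plain dictionary as a stand-in for the global attributes of a data file; how the attributes are actually read from the file is left open. The function name `collect_references` and the placeholder attribute values are assumptions.

```python
def collect_references(global_attrs):
    """Return the dataset_reference_#_name values, for # from 1 to the stated count."""
    count = int(global_attrs.get("dataset_reference_number", 0))
    return [
        global_attrs[f"dataset_reference_{i}_name"]
        for i in range(1, count + 1)
    ]


# Placeholder attribute values; a real file supplies its own.
attrs = {
    "site_code": "LEF",
    "dataset_reference_number": 2,
    "dataset_reference_1_name": "placeholder reference A",
    "dataset_reference_2_name": "placeholder reference B",
}
references = collect_references(attrs)
print(references)
```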
Table 2: Variable attributes in a typical ObsPack netCDF data file.
|obs_num||Unique observation number in a single data set. Ranges from 1 to UNLIMITED (netCDF).|
|obs_id||Unique identification string that distinguishes the data item from all other data items in the ObsPack data product. It includes dataset_name and obs_num.|
|obspack_num||Unique observation index number across all data sets in the ObsPack distribution. Ranges from 1 to max_obspack_num.|
|obspack_id||Unique identification string that distinguishes the data item from all other data items in any ObsPack data product. It includes obspack_name, dataset_name, and obspack_num delimited by a tilde (~).|
|time||Air sample collection time (UTC). POSIX time (number of seconds since January 1, 1970 in UTC).|
|time_decimal||Air sample collection time (UTC) in decimal year notation (e.g., 2012.4523312).|
|time_components||Air sample collection time (UTC) represented as a 6-element array [year, month, day, hour, minute, second]. Calendar time components as integers.|
|solartime_components||Air sample collection time (solar time) represented as a 6-element array [year, month, day, hour, minute, second]. UTC time is converted to local solar time based on longitude and day-of-year. Solar time components as integers.|
|value||Reported mole fraction or isotope ratio. Units depend on trace gas species.|
|value_unc||Standard deviation of the reported mean value when nvalue is greater than 1. Units depend on trace gas species.|
|nvalue||Number of individual measurements used to compute reported value.|
|latitude||Latitude at which air sample was collected (units: decimal degrees).|
|longitude||Longitude at which air sample was collected (units: decimal degrees).|
|altitude||Altitude (surface elevation plus sample intake height) at which air sample was collected. Units are meters above sea level (masl).|
|elevation||Surface or ground elevation at which air sample was collected. Units are meters above sea level (masl).|
|intake_height||Height above ground at which air sample was collected. Units are meters above ground level (magl).|
|obs_flag||Representation flag indicating whether the reported value has large spatial scale representation (1) or is locally influenced (0). This attribute is derived from the data provider's source data. The implementation of this flag is still being developed. Suggestions welcome.|
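Some of the variable definitions above can be related to one another with a few lines of Python. The sketch below splits an obspack_id on its tilde delimiters and derives the 6-element time array and a decimal year from POSIX time. The function names are hypothetical, the sample obspack_id is constructed for illustration only, and the decimal-year formula (fraction of the calendar year elapsed) is an assumption that may differ from the one used to prepare ObsPack products.

```python
from datetime import datetime, timezone


def split_obspack_id(obspack_id):
    """Split an obspack_id into (obspack_name, dataset_name, obspack_num)."""
    obspack_name, dataset_name, obspack_num = obspack_id.split("~")
    return obspack_name, dataset_name, int(obspack_num)


def time_components(posix_seconds):
    """Return [year, month, day, hour, minute, second] in UTC."""
    t = datetime.fromtimestamp(posix_seconds, tz=timezone.utc)
    return [t.year, t.month, t.day, t.hour, t.minute, t.second]


def time_decimal(posix_seconds):
    """Return the UTC time as a decimal year (assumed formula, see lead-in)."""
    t = datetime.fromtimestamp(posix_seconds, tz=timezone.utc)
    year_start = datetime(t.year, 1, 1, tzinfo=timezone.utc)
    year_end = datetime(t.year + 1, 1, 1, tzinfo=timezone.utc)
    return t.year + (t - year_start) / (year_end - year_start)


# Hypothetical identifier; the data set name and number are made up.
sample_id = "obspack_co2_1_PROTOTYPE_v1.0.0_2012-11-06~co2_lef_tower-insitu_1_afternoon~42"
print(split_obspack_id(sample_id))

sample_time = 1343131200  # POSIX seconds
print(time_components(sample_time), time_decimal(sample_time))
```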
Product Summary
The ObsPack product summary (<product name>_dataset_summary.txt) briefly summarizes the contents of the data product including 1) the ObsPack Fair Use Statement, 2) a brief description of the data product and its intended use, 3) the total number of data sets (max_dataset_num) and the total number of observations (max_obs_num) included in the package, and 4) a list of all data sets in the data product. Listed with each data set are the contributing laboratory abbreviation; the start and end dates of the included data; an indication of lab participation in ongoing direct atmospheric air comparison experiments; and a short phrase indicating the data selection strategy used by the data provider.
Summary files for currently available data products can be found by clicking on the information icon located next to the list of available product versions.
Data Provider E-mail Distribution List
Use of an ObsPack data product implies agreement to contact each contributing laboratory to discuss the nature of the work and the appropriate level of acknowledgement, which may include co-authorship (see the ObsPack Fair Use Statement). To help users meet this obligation, each data product includes an e-mail distribution list of all data providers. The text file <product name>_data_provider_email_list.txt provides the e-mail list in two formats to facilitate use. The list includes e-mail addresses for those data providers who have contributed to the particular data product.