getmet

getmet is a wrapper around data.avg and data.export that is intended to simplify access to MET data through the DB system. It sets up an environment compatible with the DB system and calls the two handler programs.

The arguments consist of a combination of the possible arguments to those two programs.

See the wx record definition for the meaning of each variable.

Simple Usage

getmet [--interval=<time>] [--mode=excel|xl|csv|idl|r|R]
       station start end

start and end

The time specifiers for the data to be retrieved. Start is inclusive while end is exclusive, so all data contained within the half-open interval [start,end) will be returned. Any convertible time format is accepted.
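
The half-open convention can be illustrated with a short Python sketch. The sample timestamps and the in_range helper are hypothetical; they only show which records fall inside [start,end) and are not part of getmet.

```python
from datetime import datetime

def in_range(t, start, end):
    """Return True when t lies in the half-open interval [start, end)."""
    return start <= t < end

start = datetime(2011, 1, 1)
end = datetime(2011, 1, 2)

# Hypothetical record times used only for illustration.
samples = [
    datetime(2011, 1, 1, 0, 0),    # equal to start: included
    datetime(2011, 1, 1, 23, 59),  # inside the interval: included
    datetime(2011, 1, 2, 0, 0),    # equal to end: excluded
]
selected = [t for t in samples if in_range(t, start, end)]
```

One consequence of this convention is that back-to-back requests such as [Jan 1, Jan 2) and [Jan 2, Jan 3) never return the same record twice.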

station

The station identifier code. For example 'brw'. Case insensitive.

--interval=<time>

The averaging interval. This may be <= 0 to perform no averaging, an exact amount of time given either in seconds or as an integer and a multiplier (one of “d” for days, “h” for hours, “m” for minutes or “s” for seconds, e.g. “15m” for 15 minutes), or a special time range. Special time ranges imply alignment to the relevant boundary. The following are valid:

  • year - Yearly averages
  • quarter - Standard quarter averages
  • month or m - Monthly averages
  • week - Weekly averages
  • day or d - Daily averages
  • hour or h - Hourly averages
  • minute - One minute averages
  • forever, infinity, inf, or always - Entire input data range (not aligned)

If a special time range is not given and alignment is enabled (that is, the noalign flag is not used), the alignment depends on the interval: if it is divisible by one day, bins are aligned to the day boundary; if it is divisible by one hour, to the hour boundary; if it is divisible by one minute, to the minute boundary; otherwise it is not aligned. Special time ranges are always aligned (except “forever”).
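
As a rough illustration of how an interval argument could be classified, the following Python sketch follows the rules described above. The parse_interval function and its return values are hypothetical; the real parsing is done by the handler programs and may differ in detail.

```python
import re

# Seconds per multiplier letter, as listed in the description above.
MULTIPLIERS = {"d": 86400, "h": 3600, "m": 60, "s": 1}

# Names of the special time ranges listed above.
SPECIAL = {
    "year", "quarter", "month", "m", "week", "day", "d",
    "hour", "h", "minute", "forever", "infinity", "inf", "always",
}

def parse_interval(text):
    """Classify an interval string (hypothetical helper).

    Returns ("special", name), ("seconds", n), or ("none", 0)
    when the value is <= 0 and no averaging is requested.
    """
    if text in SPECIAL:
        return ("special", text)
    match = re.fullmatch(r"(-?\d+)([dhms]?)", text)
    if match is None:
        raise ValueError("unrecognized interval: %r" % text)
    # A bare integer is taken as a number of seconds.
    seconds = int(match.group(1)) * MULTIPLIERS[match.group(2) or "s"]
    if seconds <= 0:
        return ("none", 0)
    return ("seconds", seconds)
```

Note that a bare “m” is a special range (monthly) while “15m” is an integer with the minutes multiplier, which is why the sketch checks the special names first.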

--mode=excel|xl|csv|idl|r|R

Select mode. Setting a mode sets the defaults for all parameters relevant to that mode.

csv

Export data in CSV format with year and DOY timestamps.

excel, xl

Export data in CSV format for importing into Excel.

idl

Export data in space separated and converted format for reading with IDL. Also generates a header on standard error with the names and index (zero-based) of each column.

r and R

Export data in CSV format for importing into R.

Full argument listing

getmet [--interval=<time>] [--stddev=on|off | --nostddev]
       [--ignorecut=pattern1,pattern2,...]
       [--count=on|off] [--source=archive] [--fill]
       [--picklast=pattern1,pattern2,...]
       [--noalign] [--decimal-format=FORMAT,MVC]
       [--include=RANGES] [--mode=excel|xl|csv|idl|r|R|archive|cpd2] 
       [--join=csv|space] [--station=on|off] 
       [--date-epoch=on|off] [--date-excel=on|off] [--date-iso=on|off] 
       [--date-fyear=on|off] [--date-julian=on|off]
       [--date-yeardoy=on|off] [--date-doy=on|off]
       [--mvc-type=blank|mvc|na] [--mvc-flag=follow|end|off]
       [--flags-type=breakdown|0x|default] [--header-mark=...]
       [--header-names=on|off|-1] [--header-desc=on|off]
       [--header-namesstderr=on|off] [--header-wavelength=on|off]
       [--header-cpd2global=on|off] [--header-mvcs=on|off]
       [--format-default=off|<format>] [--format-<name>=<format>]
       station start end [source]

Arguments

start and end

The time specifiers for the data to be retrieved. Start is inclusive while end is exclusive, so all data contained within the half-open interval [start,end) will be returned. Any convertible time format is accepted.

station

The station identifier code. For example 'brw'. Case insensitive.

--interval=<time>

The averaging interval. This may be <= 0 to perform no averaging, an exact amount of time given either in seconds or as an integer and a multiplier (one of “d” for days, “h” for hours, “m” for minutes or “s” for seconds, e.g. “15m” for 15 minutes), or a special time range. Special time ranges imply alignment to the relevant boundary. The following are valid:

  • year - Yearly averages
  • quarter - Standard quarter averages
  • month or m - Monthly averages
  • week - Weekly averages
  • day or d - Daily averages
  • hour or h - Hourly averages
  • minute - One minute averages
  • forever, infinity, inf, or always - Entire input data range (not aligned)

If a special time range is not given and alignment is enabled (that is, the noalign flag is not used), the alignment depends on the interval: if it is divisible by one day, bins are aligned to the day boundary; if it is divisible by one hour, to the hour boundary; if it is divisible by one minute, to the minute boundary; otherwise it is not aligned. Special time ranges are always aligned (except “forever”).
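
The alignment rule for plain intervals can be sketched in Python as follows. The function names are hypothetical, and flooring the start time to the chosen boundary (with day boundaries taken in UTC) is an assumption about what alignment means here, not a statement about the actual implementation.

```python
def alignment_unit(interval_seconds, noalign=False):
    """Return the boundary size in seconds, or None when not aligned."""
    if noalign:
        return None
    # Try the largest boundary first: day, then hour, then minute.
    for unit in (86400, 3600, 60):
        if interval_seconds % unit == 0:
            return unit
    return None

def aligned_start(epoch_seconds, interval_seconds, noalign=False):
    """Floor a start time (epoch seconds) to the selected boundary.

    Assumption: alignment moves the first bin back to the boundary.
    """
    unit = alignment_unit(interval_seconds, noalign)
    if unit is None:
        return epoch_seconds
    return epoch_seconds - (epoch_seconds % unit)
```

For example, a 900 second interval is divisible by one minute but not by one hour, so under these assumptions a start time of 00:12:34 would be moved back to 00:12:00.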

--stddev=on|off, --nostddev

Enable or disable (default enabled) standard deviation calculation for each field. For each field, a separate field is generated that contains its standard deviation for that record.

--count=on|off

Enable or disable (default disabled) generating a count field for each variable (after cut splitting). The count field is the variable name with an “N” added to the end.

--source=archive source

Select the source archive to request data from, defaulting to clean. Has no effect when working on standard input.

--fill

Enable filling of missing time ranges. Time ranges that are missing entirely from the source archive are filled with MVCs. Alignment and normal binning are preserved; that is, a gap of two hours when generating one hour averages results in two records of all MVCs. When records are not being combined, filling also causes missing record types for a given bin to be output.
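
The effect of filling can be sketched in Python with a toy set of hourly bins. The fill_missing function, the dictionary layout, and the "MVC" placeholder string are all hypothetical and stand in for whatever the real records contain.

```python
def fill_missing(records, start, end, interval, mvc="MVC"):
    """Return one (bin_start, value) pair per bin in [start, end).

    records maps bin start times (in seconds) to averaged values;
    bins absent from records are emitted with the MVC placeholder.
    """
    filled = []
    t = start
    while t < end:
        filled.append((t, records.get(t, mvc)))
        t += interval
    return filled

# Hourly averages with a two hour gap (hours 1 and 2 are missing).
hourly = {0: 1.5, 10800: 2.5}
filled = fill_missing(hourly, start=0, end=14400, interval=3600)
```

In this example the two hour gap produces exactly two MVC records, matching the behaviour described above.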

--picklast=pattern1,pattern2,...

Set patterns to only pick the latest value instead of averaging. If a variable matches the regular expression defined by the pattern encased as “^pattern$”, then only the last value of it is output instead of any averaging being done.
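
A minimal Python sketch of the anchored matching described above is shown below. The variable names (WS, WD, WS2) and both helper functions are hypothetical, and Python's regular expression dialect may differ slightly from the one getmet uses.

```python
import re

def uses_picklast(variable, patterns):
    """True when the variable matches any pattern wrapped as ^pattern$."""
    return any(re.search("^" + p + "$", variable) for p in patterns)

def summarize(variable, values, patterns):
    """Return the last value for picklast variables, else the mean."""
    if uses_picklast(variable, patterns):
        return values[-1]
    return sum(values) / len(values)
```

Because the pattern is anchored at both ends, a pattern such as “WS” selects only a variable named exactly WS and not one named WS2.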

--noalign

Disable bin alignment as described above.

--decimal-format=FORMAT,MVC

Set the format and/or MVC of all decimal numbers that are averaged. For example, “--decimal-format=%0+10.3e,+9.999e-99” would cause all decimal floating point numbers to be output in scientific notation instead of their existing decimal format.
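
To get a feel for what the example format produces, the same printf-style code can be tried in Python. This is only an approximation: the format_decimal helper is hypothetical, and the formatting codes used by getmet may not behave identically to Python's % operator.

```python
def format_decimal(value, spec="%0+10.3e,+9.999e-99"):
    """Apply a FORMAT,MVC pair to a value (None means missing)."""
    fmt, mvc = spec.split(",", 1)
    if value is None:
        return mvc
    return fmt % value

present = format_decimal(1013.25)   # formatted in scientific notation
negative = format_decimal(-5.0)
missing = format_decimal(None)      # falls back to the MVC text
```

With this format every value, including the MVC, occupies ten characters, which keeps space separated output in fixed-width columns.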

--include=RANGES

Specify the range(s), separated by “;” or “,”, for the times that should be included within each average bin. Each range consists of either a single time specifier or a pair of them separated by “-”. A single time specifies the exact time after the bin start to include; for a pair, any times between the start (inclusive) and end (exclusive) are included.

Each time specifier can be formed like “[[hours:]minutes:]seconds”. That is, it may contain optional hour and minute specifiers in addition to the number of seconds after the bin start. It may also be an integer and a multiplier (one of “d” for days, “h” for hours, or “m” for minutes), for example “30m” for 30 minutes.

For example, to only include minutes 10-15 and 25-35 in an hour average: --include=10:00-15:00,25:00-35:00 or --include=10m-15m,25m-35m
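
The range syntax can be sketched as a small Python parser. The function names are hypothetical, whole-second values are assumed, and a single time is represented here as a pair whose end is None.

```python
import re

def parse_offset(text):
    """Seconds after bin start for '[[hours:]minutes:]seconds' or '<int><d|h|m>'."""
    match = re.fullmatch(r"(\d+)([dhm])", text)
    if match:
        scale = {"d": 86400, "h": 3600, "m": 60}[match.group(2)]
        return int(match.group(1)) * scale
    parts = [int(p) for p in text.split(":")]
    if not 1 <= len(parts) <= 3:
        raise ValueError("bad time specifier: %r" % text)
    seconds = 0
    for part in parts:
        seconds = seconds * 60 + part
    return seconds

def parse_include(spec):
    """Split RANGES on ';' or ',' into (start, end) pairs of seconds."""
    ranges = []
    for item in re.split(r"[;,]", spec):
        if "-" in item:
            lo, hi = item.split("-", 1)
            ranges.append((parse_offset(lo), parse_offset(hi)))
        else:
            ranges.append((parse_offset(item), None))  # single exact time
    return ranges

def included(offset, ranges):
    """True when offset matches a single time or lies in [start, end)."""
    for lo, hi in ranges:
        if hi is None and offset == lo:
            return True
        if hi is not None and lo <= offset < hi:
            return True
    return False
```

Under these assumptions both spellings of the example above parse to the same two ranges, 600 to 900 seconds and 1500 to 2100 seconds after the bin start.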

--mode=excel|xl|csv|idl|r|R|archive|cpd2

Select mode. Setting a mode sets the defaults for all parameters relevant to that mode.

csv

Export data in CSV format with year and DOY timestamps.

excel, xl

Export data in CSV format for importing into Excel.

idl

Export data in space separated and converted format for reading with IDL. Also generates a header on standard error with the names and index (zero-based) of each column.

r and R

Export data in CSV format for importing into R.

archive

Archive export mode.

cpd2

Reformatted CPD2 style data. This mode does not accept any other format switches.

--join=csv|space

Select joining mode. Joining determines how values are combined into record lines and how they are separated within those lines.

csv

Comma separated value join mode. If values contain spaces or any other condition that would require quoting, that value is quoted as per Text::CSV_XS.

space

Space separated values. No handling for values with embedded spaces; do not use on string data.
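
The difference between the two join modes can be sketched in Python. The quoting rule below (quote on spaces, commas, quotes or line breaks) is a simplified approximation of the behaviour described above, not a reimplementation of Text::CSV_XS, and the sample values are hypothetical.

```python
def csv_field(value):
    """Quote a value when it contains a space, comma, quote or line break."""
    if any(ch in value for ch in ' ,"\r\n'):
        return '"' + value.replace('"', '""') + '"'
    return value

def join_record(values, mode):
    """Join string values into one record line using the given mode."""
    if mode == "csv":
        return ",".join(csv_field(v) for v in values)
    if mode == "space":
        # No quoting at all: embedded spaces become ambiguous.
        return " ".join(values)
    raise ValueError("unknown join mode: %r" % mode)

record = ["brw", "Barrow AK", "1.5"]
csv_line = join_record(record, "csv")
space_line = join_record(record, "space")
```

The space joined line splits back into four tokens rather than three, which is why space mode should not be used on string data.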

--station=on|off

Enable or disable the leading column with the station code.

--date-epoch=on|off

Enable or disable outputting epoch time (seconds since Jan 1 1970).

--date-excel=on|off

Enable or disable outputting Excel formatted time (yyyy-mm-dd hh:mm:ss).

--date-iso=on|off

Enable or disable outputting ISO formatted time (yyyy-mm-ddThh:mm:ssZ).
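
The three time formats described so far can be reproduced in Python for a single timestamp. The date_columns helper is hypothetical, and treating all times as UTC is an assumption suggested by the trailing “Z” of the ISO format.

```python
from datetime import datetime, timezone

def date_columns(epoch_seconds):
    """Render one time in the epoch, Excel and ISO styles (UTC assumed)."""
    t = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return {
        "epoch": epoch_seconds,
        "excel": t.strftime("%Y-%m-%d %H:%M:%S"),
        "iso": t.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }

columns = date_columns(1293840000)  # 2011-01-01 00:00:00 UTC
```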

--date-fyear=on|off

Enable or disable outputting fractional year.

--date-julian=on|off

Enable or disable outputting fractional Julian day.

--date-yeardoy=on|off

Enable or disable outputting a year and fractional day of year.

--date-doy=on|off

Enable or disable outputting a fractional day of year.

--mvc-type=blank|mvc|na

Select format for undefined values.

  • blank - Leave that position as a zero length blank.
  • mvc - Use the missing value code for that column.
  • na - Print the string 'NA' for all missing values.
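
The three choices can be summarized with a short Python sketch. The render_missing helper and the sample MVC text “9999” are hypothetical.

```python
def render_missing(value, mvc_type, column_mvc):
    """Return the output text for a value, where None means undefined."""
    if value is not None:
        return str(value)
    choices = {
        "blank": "",        # zero length field
        "mvc": column_mvc,  # the missing value code of this column
        "na": "NA",         # literal string NA
    }
    return choices[mvc_type]
```

For instance, an undefined value in a column whose MVC is “9999” would be written as an empty field, “9999”, or “NA” depending on the selected type.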

--mvc-flag=follow|end|off

Set the mode for MVC flags for each column. The flag is either 0 (valid value) or 1 (invalid).

  • follow - The column with the MVC flag immediately follows the value column.
  • end - All MVC flags are placed at the end of the value columns in the same order as the value columns.
  • off - Disable MVC flags.
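
The resulting column order for each setting can be sketched in Python. The layout_row helper and the sample row are hypothetical; each value is given as a pair of output text and a missing indicator.

```python
def layout_row(values, mode):
    """Arrange value and MVC flag columns for one record.

    values is a list of (text, is_missing) pairs; flags are "1" for
    missing (invalid) values and "0" for valid ones.
    """
    texts = [text for text, _ in values]
    flags = ["1" if missing else "0" for _, missing in values]
    if mode == "off":
        return texts
    if mode == "end":
        return texts + flags
    if mode == "follow":
        row = []
        for text, flag in zip(texts, flags):
            row.extend([text, flag])
        return row
    raise ValueError("unknown mvc-flag mode: %r" % mode)

sample = [("1.5", False), ("999", True)]
```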

--flags-type=breakdown|0x|default

breakdown

Enable breakdowns of bitwise flags. When enabled, instead of a single column with the bit set of flags, N columns with values of either 0 or 1 are generated for each bitwise flag. N is the number of bits that the bit set can contain (e.g. 16 for a four digit hexadecimal number).

0x

Insert '0x' before any hexadecimal numbers.

default

Do not reformat flags.
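
The breakdown and 0x settings can be illustrated in Python. Both helpers are hypothetical, and the column order chosen here (least significant bit first) is an assumption; the actual order used by the export is not stated above.

```python
def breakdown(hex_text):
    """Expand a hexadecimal flags field into one 0/1 value per bit.

    Assumption: columns run from the least significant bit upwards.
    """
    nbits = 4 * len(hex_text)  # four bits per hexadecimal digit
    value = int(hex_text, 16)
    return [(value >> bit) & 1 for bit in range(nbits)]

def with_prefix(hex_text):
    """Render a flags field as in the 0x setting."""
    return "0x" + hex_text

bits = breakdown("0005")
```

A four digit field always expands to 16 columns, regardless of how many bits are actually set.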

--header-mark=...

Set the header marker. This string is prepended to all header lines.

--header-names=on|off|-1

Enable or disable outputting a header line before the data begins with the names of each variable. The names do not contain spaces. If set to -1 then the names are output immediately before the actual data instead of at the start of the header lines.

--header-desc=on|off

Enable or disable outputting a header line with the field description of each variable before data. The field description may contain spaces.

--header-namesstderr=on|off

Enable or disable printing a header line to standard error with the names and column numbers of the data. The line is comma separated values of the form “NAME:INDEX”, where the name is the same as for --header-names above and the index is the zero-based index of that column.
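
A script consuming this line might turn it into a lookup table, as in the following Python sketch. The parse_name_index helper and the sample column names are hypothetical, and the sketch assumes the line carries no header marker prefix.

```python
def parse_name_index(line):
    """Map column names to zero-based indices from a NAME:INDEX line."""
    columns = {}
    for item in line.strip().split(","):
        # Split on the last colon so the index is always the final part.
        name, index = item.rsplit(":", 1)
        columns[name] = int(index)
    return columns

# Hypothetical header line; real column names will differ.
columns = parse_name_index("Year:0,DOY:1,T:2\n")
```

The resulting dictionary lets later code pick a column by name, for example fields[columns["T"]], instead of hard-coding positions.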

--header-wavelength=on|off

Enable or disable printing a header line with the wavelength information for each variable.

--header-cpd2global=on|off

Enable or disable printing header lines for all CPD2 global headers.

--header-mvcs=on|off

Enable or disable printing a header line with the MVC for each variable.

--format-default=off|<format>

Set the default format for all value columns or disable default formatting. When a default format is set, all columns use that format unless overridden with --format-<name> (see below). The format is a CPD2 extended printf code.

--format-<name>=<format>

Override the format for a given variable (specified by <name>). The format is a CPD2 extended printf code.

Example:

--format-BaG_A11=%07.2f
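
How a per-variable override interacts with the default format can be sketched in Python. The format_value helper is hypothetical, and Python's % operator only approximates the CPD2 extended printf codes.

```python
def format_value(name, value, default_format, overrides):
    """Format a value with its override if present, else the default.

    default_format may be None, standing in for --format-default=off.
    """
    fmt = overrides.get(name, default_format)
    if fmt is None:
        return str(value)
    return fmt % value

overrides = {"BaG_A11": "%07.2f"}
overridden = format_value("BaG_A11", 3.14159, "%.1f", overrides)
defaulted = format_value("other", 3.14159, "%.1f", overrides)
unformatted = format_value("other", 3.14159, None, overrides)
```

Here “%07.2f” pads the value with leading zeros to seven characters with two decimal places, while variables without an override fall back to the default.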

Example Usage

Unaveraged BRW data

getmet brw 2011-01-01 2011-01-02 > output.csv

Unaveraged BRW data with Excel time stamps

getmet --mode=xl brw 2011-01-01 2011-01-02 > output.csv

Hourly averaged BRW data with fractional years

getmet --interval=1h --date-fyear=on brw 2011-01-01 2011-01-02 > output.csv
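
When getmet is driven from a script, the command line can be assembled programmatically. The following Python sketch only builds the argument list; the build_getmet_command helper is hypothetical, and it assumes every option is passed in the --name=value form shown in the argument listing.

```python
def build_getmet_command(station, start, end, **options):
    """Return an argument list for getmet (hypothetical helper).

    Keyword names use underscores in place of dashes, so
    date_fyear="on" becomes --date-fyear=on.
    """
    args = ["getmet"]
    for name, value in options.items():
        args.append("--%s=%s" % (name.replace("_", "-"), value))
    # Positional arguments come last: station, start, end.
    return args + [station, start, end]

command = build_getmet_command(
    "brw", "2011-01-01", "2011-01-02", interval="1h", date_fyear="on"
)
```

The resulting list could then be passed to subprocess.run with standard output redirected to a file, which avoids shell quoting problems with options that contain commas or semicolons.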