data.avg
data.avg can average, fill and convert data to be rectangular (a single record type). It always converts data to be rectangular, but if the averaging interval is less than or equal to the interval at which the data was recorded, the output is just the input (possibly filled). The result is written to standard output.
The normal mode of operation is to average on a one-hour interval with no filling, with variables split by cut size and a standard deviation generated for each. Where possible, averaging bins are aligned on natural time boundaries (e.g. one-hour bins start on the hour).
It can either request data from data.get itself or operate on its standard input (as part of a pipe).
Command Line Usage
data.avg [--interval=seconds|month] [--stddev=on/off | --nostddev]
[--cut=on/off | --nocut] [--cutflags]
[--ignorecut=pattern1,pattern2,...]
[--count=on/off] [--source=archive] [--fill]
[--noalign]
[--contam] [--contam-match=pattern1,pattern2,...]
[--cpx] [--nocache]
[station rec start end]
Arguments
If the station, record, start and end are omitted, data.avg works on standard input instead of requesting data. Specifying a toggle argument without an “on/off” value enables that toggle. That is, “--count” turns on count generation.
start and end
The time specifiers for the data to be retrieved. Start is inclusive while end is exclusive, so all data contained within the half open interval [start,end) will be returned. Any convertible time format is accepted.
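The half-open interval can be illustrated with a minimal Python sketch. This is not data.avg code; the function name `in_range` and the use of plain epoch seconds are assumptions made only for illustration.

```python
def in_range(t, start, end):
    """Return True if time t lies in the half-open interval [start, end)."""
    return start <= t < end

# A record stamped exactly at `start` is included; one stamped at `end` is not.
print(in_range(3600, 3600, 7200))  # True
print(in_range(7200, 3600, 7200))  # False
```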
station
The station identifier code. For example 'brw'. Case insensitive.
records
The cpd2 record type to be retrieved. For example: 'S11a'. Case sensitive. Multiple record types may be separated by “,”, “;” or “:”. Note that this is a single argument and that spaces are not allowed.
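As a rough illustration of the separator rule, the following Python sketch splits a records argument on any of the three separators. The helper name `split_records` is hypothetical and does not reflect how data.avg parses its arguments internally.

```python
import re

def split_records(arg):
    """Split a records argument on ',', ';' or ':' (no spaces allowed)."""
    return re.split(r"[,;:]", arg)

print(split_records("S11a,A11a"))  # ['S11a', 'A11a']
```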
--interval=seconds|month
Averaging interval in seconds, defaulting to 3600. If the interval is divisible by 60 seconds, the first bin has the minute field rounded down. If it is divisible by 3600, the hour field is also rounded down, and likewise the day field if it is divisible by a whole day. May also be “month” to average on month intervals.
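One possible reading of the alignment rule is sketched below in Python. It is an interpretation, not the actual implementation: the function name `align_first_bin`, the use of UTC datetimes, and the exact truncation at each step (seconds dropped for minute-divisible intervals, minutes for hour-divisible ones, hours for day-divisible ones) are all assumptions.

```python
from datetime import datetime, timezone

def align_first_bin(first, interval):
    """Round the start of the first bin down according to the interval (seconds)."""
    aligned = first
    if interval % 60 == 0:
        # Minute-divisible interval: start on a whole minute.
        aligned = aligned.replace(second=0, microsecond=0)
    if interval % 3600 == 0:
        # Hour-divisible interval: start on the hour.
        aligned = aligned.replace(minute=0)
    if interval % 86400 == 0:
        # Day-divisible interval: start at midnight.
        aligned = aligned.replace(hour=0)
    return aligned

t = datetime(2008, 10, 1, 12, 34, 56, tzinfo=timezone.utc)
print(align_first_bin(t, 3600))   # 2008-10-01 12:00:00+00:00
print(align_first_bin(t, 86400))  # 2008-10-01 00:00:00+00:00
```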
--stddev=on/off | --nostddev
Enable or disable (default enabled) standard deviation calculation for each field. An additional field is generated for each field, containing its standard deviation for that record.
--cut=on/off | --nocut
Enable or disable (default enabled) cut size splitting. If cut size splitting is enabled, all fields will be separated into 0 (default/coarse/10um) and 1 (flagged/fine/1um) variants based on the cut size flag for that record.
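The effect of cut size splitting can be sketched in Python as follows. This is only an illustration: the function name `split_by_cut`, the tuple layout of the samples, and the variable name `BsG` are hypothetical, and the naming that data.avg gives to the resulting variants is not shown here.

```python
from collections import defaultdict

def split_by_cut(samples):
    """Average values separately for each (variable, cut flag) pair.

    samples: iterable of (variable, cut_flag, value), where cut_flag is
    0 (default/coarse/10um) or 1 (flagged/fine/1um).
    """
    groups = defaultdict(list)
    for variable, cut_flag, value in samples:
        groups[(variable, cut_flag)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}

# Hypothetical samples of one variable, alternating between the two cuts.
samples = [("BsG", 0, 10.0), ("BsG", 1, 4.0), ("BsG", 0, 12.0), ("BsG", 1, 6.0)]
print(split_by_cut(samples))  # {('BsG', 0): 11.0, ('BsG', 1): 5.0}
```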
--cutflags
Split flags based on the cut size (default off).
--ignorecut=pattern1,pattern2,...
Set patterns for which the cut split is ignored. If a variable matches the regular expression formed by encasing a pattern as “^pattern$”, cut size splitting is not done on that variable.
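The anchoring described above means a pattern must match the whole variable name, not just part of it. A minimal Python sketch of that behaviour follows; the function name `cut_split_ignored` and the variable names are hypothetical, and Python's `re` module stands in for whatever regular expression engine data.avg actually uses.

```python
import re

def cut_split_ignored(variable, patterns):
    """Return True if any pattern, encased as ^pattern$, matches the variable."""
    return any(re.search("^" + p + "$", variable) for p in patterns)

patterns = ["F1_.*", "T_S11"]  # hypothetical --ignorecut patterns
print(cut_split_ignored("F1_S11", patterns))   # True: whole name matches F1_.*
print(cut_split_ignored("XF1_S11", patterns))  # False: anchoring rejects a partial match
```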
--count=on/off
Enable or disable (default disabled) generating a count field for each variable (after cut splitting). The count field is the variable name with an “N” added to the end.
--source=archive
Select the source archive to request data from, defaulting to clean. Has no effect when working on standard input.
--fill
Enable filling of missing time ranges. This enables filling with MVCs for time ranges that are missing entirely from the source archive. Alignment and normal binning are preserved. That is, a gap of two hours when generating one-hour averages results in two records of all MVCs.
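The two-hour example can be sketched in Python as below. This is an illustration under stated assumptions: the function name `fill_gap` is hypothetical, times are plain epoch seconds already aligned to bin boundaries, and `None` is a placeholder for the MVC value, whose actual representation is not specified here.

```python
MVC = None  # placeholder; the real MVC representation is not specified here

def fill_gap(gap_start, gap_end, interval, n_fields):
    """Return one all-MVC record per averaging bin inside [gap_start, gap_end)."""
    records = []
    t = gap_start
    while t < gap_end:
        records.append((t, [MVC] * n_fields))
        t += interval
    return records

# A two-hour gap with one-hour bins yields two filled records.
filled = fill_gap(0, 7200, 3600, n_fields=3)
print(len(filled))  # 2
```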
--noalign
Disable bin alignment as described above.
--contam
Enable averaging of contaminated data.
--contam-match=pattern1,pattern2,...
List of Perl regular expressions that define all variables that are affected by contamination. Any variable that matches one of these patterns will not be averaged when it is contaminated (assuming --contam has not been set). The default is to match all scattering, backscattering, absorption and CN counts.
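The interaction between --contam and --contam-match might be summarised by the following Python sketch. It is a simplification, not the real logic: the function name `is_averaged`, the example patterns and the variable names are hypothetical, the patterns are assumed to be unanchored, and Python's `re` module is used in place of Perl regular expressions.

```python
import re

def is_averaged(variable, contaminated, patterns, contam=False):
    """Decide whether a sample of `variable` contributes to the average."""
    if contam or not contaminated:
        # --contam set, or the data is clean: always averaged.
        return True
    # Contaminated data is dropped only for variables matching a pattern.
    return not any(re.search(p, variable) for p in patterns)

patterns = ["^Bs", "^Ba"]  # hypothetical --contam-match patterns
print(is_averaged("BsG_S11", True, patterns))               # False
print(is_averaged("T_S11", True, patterns))                 # True
print(is_averaged("BsG_S11", True, patterns, contam=True))  # True
```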
--cpx
Pass --cpx to data.get.
--nocache
Pass --nocache to data.get.
Example Usage
Single record one hour average
data.avg sgp S11a 2008:10 2008:11
Multiple records of raw data in one day average
data.avg --interval=86400 --source=raw bnd S11a,A11a 2003W02 2003W03
Filling data from a pipe to a file
data.get sgp A11a 2008 2009 raw | data.avg --fill > 2008_hourly