By Michael E. Barth, Patricia A. Miller, and Alexander E. MacDonald
Meteorological observations are essential to successful weather prediction because they provide a direct indication of current atmospheric conditions. Forecasters view these observations to detect and follow weather disturbances, and to provide critical detail to the public about the formation and movement of major meteorological phenomena such as precipitation events, severe storms, and flight-level turbulence. Observations also form the "initial" conditions for data assimilation systems, which produce the objective numerical weather prediction outputs heavily used in all areas of weather forecasting. However, meteorological observations have not always been readily accessible outside the world's major meteorological centers. The Meteorological Assimilation Data Ingest System (MADIS) was established so that FSL's observations and observation-handling technology could be shared with the greater meteorological community. Through MADIS, value-added data are available from FSL's Central Facility, and observations (Figure 1) are provided to support data assimilation, numerical weather prediction, and other meteorological applications and uses.
Figure 1. MADIS observations currently available in North America.
MADIS subscribers have access to a reliable and easy-to-use database containing real-time and archived datasets available either via FTP or using Unidata's Local Data Manager (LDM) software. Quality Control (QC) of MADIS observations is also provided, since considerable evidence exists that the retention of erroneous data, or the rejection of too many good data, can substantially distort forecast products. Observations in the FSL database are stored with a series of flags indicating the quality of the observation from a variety of perspectives (e.g., temporal consistency and spatial consistency), or more precisely, a series of flags indicating the results of various QC checks. MADIS users can then inspect the flags and decide whether or not to utilize the observation.
MADIS also includes an Application Program Interface (API) that provides users with easy access to the data and QC information. The API allows each user to specify station and observation types, as well as QC choices, and domain and time boundaries. Many of the implementation details that arise in data ingest programs are handled automatically. Users of the MADIS API, for example, can choose to have wind data automatically rotated to a specified grid projection, and/or choose to have mandatory and significant levels from radiosonde data interleaved, sorted by descending pressure, and corrected for hydrostatic consistency.
The API is designed so that the underlying format of the database is completely invisible to the user, and it can be easily extended to other databases. In the initial version of the API, support is provided for the FSL database, and also for the database used in the National Weather Service (NWS) Advanced Weather Interactive Processing System (AWIPS) deployed at NWS weather forecast offices.
The FSL MADIS database and API are freely available to interested parties in the meteorological community. Observations initially available include radiosonde, automated aircraft, wind profiler, and surface datasets. The latter includes over 4,000 mesonet stations from local, state, and federal agencies, and private firms. Organizations already receiving MADIS datafeeds include NWS forecast offices, the National Center for Atmospheric Research, the NWS National Centers for Environmental Prediction, and several private meteorological firms and major universities. Although most of the MADIS data are available without restrictions, aircraft and mesonet observations are proprietary to the data providers, and thus are subject to review and restriction.
MADIS data files are compatible with AWIPS, with the analysis software provided by the FSL Local Analysis and Prediction System (LAPS), and will also be compatible with the data ingest subsystem in future versions of the community-developed Weather Research and Forecasting (WRF) model.
Available Observational Datasets
The observational datasets supported in the initial version of MADIS include:
These datasets include the observation types that are slated for use in the initial version of the WRF data assimilation system. They are acquired at FSL from a variety of sources including NOAAPORT, Aeronautical Radio INCorporated (ARINC), and FSL's NPN and ground-based GPS Meteorology (GPS-Met) data hubs. Mesonet data are decoded and stored with software originally developed for the NWS LDAD system. Major contributors to the mesonet data stream are state Departments of Transportation, which provide both meteorological and road observations; the NOAA Cooperative Institute for Regional Prediction at the University of Utah, which provides "MesoWest" data from the cooperative mesonets in the western United States; and the Boulder NWS Forecast Office, which provides mesonet data (Table 1) from the local Denver/Boulder area, and also data from the Remote Automated Weather System (RAWS) network run by the National Interagency Fire Center. The mesonet dataset also includes observations from over 500 volunteer citizen weather observers from around the world. Wherever possible, FSL uses redundant sources to maximize data availability.
Note: New mesonet data are added weekly to the FSL MADIS database. See http://madis.noaa.gov/mesonet_providers.html for an updated list of mesonet stations. For a real-time display of mesonet observations, see http://madis.noaa.gov/sfc_display/.
While some of the datasets are global, the bulk of the data currently available extends from Alaska into Central America. We plan to extend the geographic coverage of the existing MADIS datasets, and also to provide access to additional datasets such as multi-agency boundary layer profiler data, and satellite and radar observations. The emphasis will be on datasets identified for inclusion in an upcoming version of the WRF data assimilation system, and on datasets not readily available elsewhere.
Automated Quality Control
Quality control procedures for the MADIS observations are, for the most part, based on the 1994 NWS Techniques Specification Package (TSP) 88-21-R2. Two categories of automated QC checks, static and dynamic, are described in the TSP for a variety of observation types, including the surface, maritime, profiler, aircraft, and radiosonde data that make up the initial MADIS datasets. The static QC checks are single-station, single-time checks which, as such, are unaware of the previous and current meteorological or hydrologic situation described by other observations and grids. Checks falling into this category include validity, internal consistency, and vertical consistency. Although useful for locating extreme outliers in the observational database, the static checks can have difficulty with statistically reasonable, but invalid data. To address this problem, the TSP also describes dynamic checks that refine the QC information by taking advantage of other available hydrometeorological information. Some examples of dynamic QC checks include position consistency, temporal consistency, and spatial consistency.
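The difference between a static and a dynamic check can be illustrated with a minimal Python sketch. It is not the TSP algorithm: the function names, the temperature limits, and the maximum rate of change are all hypothetical values chosen only for illustration. The validity check looks at one value in isolation, while the temporal consistency check compares each report with the last accepted report from the same station.

```python
def passes_validity(temp_c, low=-60.0, high=60.0):
    """Static check: is a single temperature inside a plausible range?

    The limits are illustrative assumptions, not TSP values.
    """
    return low <= temp_c <= high


def temporal_flags(series, max_rate=20.0):
    """Dynamic check: flag reports that change too fast.

    series   -- list of (hour, temp_c) tuples for one station, sorted by time
    max_rate -- hypothetical maximum change in degrees C per hour
    Returns one boolean per report (True means the report passed).
    """
    flags = []
    last_good = None  # most recent report that passed the check
    for hour, temp_c in series:
        if last_good is None:
            ok = True  # nothing to compare the first report against
        else:
            dt = hour - last_good[0]
            ok = dt > 0 and abs(temp_c - last_good[1]) / dt <= max_rate
        flags.append(ok)
        if ok:
            last_good = (hour, temp_c)
    return flags
```

A report of 40 degrees C between two reports near 11 degrees C would pass the validity check in this sketch but fail the temporal check, which is the kind of "statistically reasonable, but invalid" value the dynamic checks are meant to catch.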
In addition to the automated QC processing, the capability to subjectively override the results of the automated checks is maintained at FSL, through the use of two text files for each dataset, a "reject" and an "accept" list. The reject file is a list of stations and associated input observations that will be labeled as bad regardless of the outcome of the QC checks. The accept file is the corresponding list for stations that will be labeled as good regardless of the outcome of the QC checks. In both cases, observations associated with the stations in the lists can be individually flagged. For example, wind observations at a particular station may be added to the reject list, but not temperature observations.
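The override logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the format of the actual FSL list files is not described here, so the lists are modeled as sets of hypothetical (station, variable) pairs, and the rule that the reject list wins when an entry appears in both lists is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    station: str               # station identifier (hypothetical values below)
    variable: str              # e.g. "wind" or "temperature"
    value: float
    passed_automated_qc: bool  # combined outcome of the automated checks


def final_verdict(obs, reject, accept):
    """Apply subjective overrides on top of the automated QC result.

    reject, accept -- sets of (station, variable) pairs.
    Assumption: if a pair is in both lists, the reject list wins.
    """
    key = (obs.station, obs.variable)
    if key in reject:
        return "bad"
    if key in accept:
        return "good"
    return "good" if obs.passed_automated_qc else "bad"
```

Keying the lists on the station and variable together is what lets wind observations at a station be rejected while its temperature observations are still judged by the automated checks.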
In the initial version, QC checks are not applied to all observations supported by MADIS. For instance, velocity variance observations measured by wind profilers and 24-hour minimum surface temperatures are not yet quality controlled, but are included in the FSL database. Also note that the level of QC varies among the observations, and that some quality control checks are not always applicable. For instance, temporal consistency checks cannot be applied to stations that have not reported for several consecutive hours. To document the varying quality control information available, the FSL database includes a "QC applied" bitmap indicating which QC checks were applied to each observation, and a "QC results" bitmap indicating the results of the various QC checks. The TSP also describes single-character "data descriptors" for each observation, which provide an overall opinion of the quality of the observation by combining the information from the various QC checks. Algorithms used to compute the data descriptor are a function of the types of QC checks applied to the observation, and the sophistication of those checks. Level 1 QC checks are considered the least sophisticated, and level 3 checks the most sophisticated. Table 2 provides a complete list of the data descriptors. (See madis.noaa.gov for more information on MADIS QC, including a detailed description of the algorithms used in the automated QC processing.)
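As an illustration of how the two bitmaps might be read together, the following Python sketch decodes a pair of integers into a per-check status. The bit positions and the convention that a set bit in the results bitmap means "failed" are assumptions made for this example; the actual bit layout is given in the MADIS QC documentation.

```python
# Hypothetical bit positions; the real layout is defined in the MADIS docs.
QC_BITS = {
    "validity": 0,
    "internal_consistency": 1,
    "temporal_consistency": 2,
    "spatial_consistency": 3,
}


def decode_qc(applied, results):
    """Combine the 'QC applied' and 'QC results' bitmaps for one observation.

    Assumption: a set bit in `results` means the check failed.
    Returns a dict mapping each check name to
    'not applied', 'passed', or 'failed'.
    """
    status = {}
    for name, bit in QC_BITS.items():
        mask = 1 << bit
        if not applied & mask:
            status[name] = "not applied"
        elif results & mask:
            status[name] = "failed"
        else:
            status[name] = "passed"
    return status
```

Keeping the two bitmaps separate is what allows a user to tell "passed" apart from "never checked", which matters for cases such as a station that has not reported long enough for the temporal check to run.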
MADIS API
The MADIS API is a library of subroutines, callable from Fortran, which provides access to all of the observation and QC information in the MADIS database available from FSL, and/or other supported meteorological databases (i.e., the AWIPS netCDF database). In general, the API is very easy to use, with only three basic subroutine calls required to access any single dataset. Optional subroutine calls serve to further refine or limit the information provided by the basic routines.
For users who do not want to write Fortran programs but would like easy access to the FSL or AWIPS data, utility programs for each dataset are included in the API package. These programs can be used to read station information, observations, and QC information for a single time, and then to dump these data to an output text file. The operation of each program is controlled by a text parameter file that allows the user to exercise all of the options included in the MADIS system. The programs can be run as needed to access any FSL or AWIPS data stored on the user's system, or can be run as time-scheduled tasks to get data keyed to the current time.
The utility programs (written in Fortran) are also meant to serve as tutorials that demonstrate how users can use the MADIS API calls to write their own Fortran programs. An additional tutorial program is included that demonstrates how all datasets might be read in together for use in an hourly data assimilation cycle.
Basic API Calls
To initialize the MADIS software, users of the MADIS API must first call subroutine MINIT for each dataset desired. The input arguments required include the dataset desired (such as "RAOB" or "SFC"), the name of the database being accessed (that is, "FSL" or "AWIPS"), and a logical (".true." or ".false.") indicating whether the user wishes error messages to be written to the standard output file.
Once the initialization process is complete, users can then access the station information available for the data specified, via the M*STA subroutines, where * indicates the dataset, that is, MRAOBSTA or MSFCSTA for radiosonde and meteorological surface data, respectively. The only input argument required for these calls is the year/day/hour of the data to be read. Output arguments include the number of stations that are available; arrays including station identification information such as station names and ID numbers; position information such as latitude, longitude, and elevation information; and also the exact observation times.
The data and QC information for the time and dataset requested can then be accessed with a call to the corresponding MGET* subroutine, where again * indicates the dataset. For example, MGETRAOB loads in all radiosonde data, and MGETNPN loads all profiler data for the time specified. Specific variables, or observation types, within the indicated dataset are requested via variable code names included in the input argument list to the MGET* routines. Depending on the variable requested, users can choose to either access "raw" observations, such as mandatory and significant level information from radiosonde observations, or choose to have the raw data further processed as in integrated radiosonde soundings generated by interleaving the mandatory and significant data, sorting by ascending height, interpolating missing pressure levels, and correcting for hydrostatic consistency. Additional processing beyond the raw observations stored in the database includes the ability to return various forms of certain variables. For example, the user can request specific humidity and this will be calculated from the stored moisture, temperature and pressure observations; likewise, pressure and altimeter setting can both be requested, regardless of which one is actually reported by a given station and stored in the database. For surface data, a single MGETSFC call automatically retrieves data from all of the MADIS datasets, including METAR, SAO, maritime, and mesonet surface observations. In all cases, observations are automatically returned with QC arrays indicating the results of the QC checks.
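The interleaving step described above can be sketched as follows. This Python example is not the MADIS implementation: the function name is hypothetical, levels are modeled as simple (pressure, temperature) pairs, and the choice to keep the mandatory-level value when both lists report the same pressure is an assumption. Sorting by descending pressure gives the same ordering as sorting by ascending height. The interpolation of missing pressure levels and the hydrostatic correction are not shown.

```python
def interleave_levels(mandatory, significant):
    """Merge mandatory and significant radiosonde levels into one sounding.

    mandatory, significant -- lists of (pressure_hpa, temperature_c) tuples
    Returns a list of (pressure_hpa, temperature_c) sorted from the highest
    pressure (lowest height) to the lowest pressure.
    Assumption: a mandatory level replaces a significant level reported
    at the same pressure.
    """
    merged = {}
    for pressure, temp in significant:
        merged[pressure] = temp
    for pressure, temp in mandatory:
        merged[pressure] = temp  # mandatory value overrides a duplicate
    return sorted(merged.items(), key=lambda level: level[0], reverse=True)
```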
The three subroutine calls needed to access integrated temperature profiles from all radiosonde sites in the database are shown in Table 3. Note that the year/day/hour argument is specified with a 9-character string that includes a 2-digit year and 3-digit Julian day. To aid the user, MADIS also offers utility routines that convert these character strings back and forth from standard integer representations of the year, month, day, and hour, as well as translations to and from the AWIPS standard time-and-date string.
Note: This sample code is for reading integrated temperature profiles and QC flags from the MADIS radiosonde dataset. Only three MADIS API calls are needed to read the data from all radiosonde sites in the database. The API automatically generates the integrated soundings by interleaving the mandatory and significant data, sorting by ascending height, interpolating missing pressure levels, and correcting for hydrostatic consistency.
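The conversion between the 9-character year/day/hour string and ordinary calendar values can be sketched in Python. The text states only that the string contains a 2-digit year and a 3-digit Julian day; the layout YYJJJHHMM used here (with the last four characters holding hour and minute), the function names, and the rule for expanding the 2-digit year are all assumptions, and the MADIS utility routines themselves are Fortran subroutines that are not reproduced here.

```python
from datetime import datetime, timedelta


def parse_time_string(stamp, pivot=70):
    """Convert an assumed 'YYJJJHHMM' string to a datetime.

    pivot -- assumption: 2-digit years >= pivot are 19xx, others are 20xx.
    """
    if len(stamp) != 9 or not stamp.isdigit():
        raise ValueError("expected 9 digits in the form YYJJJHHMM")
    yy = int(stamp[0:2])
    day_of_year = int(stamp[2:5])
    hour = int(stamp[5:7])
    minute = int(stamp[7:9])
    if not 1 <= day_of_year <= 366:
        raise ValueError("day of year must be between 1 and 366")
    year = 1900 + yy if yy >= pivot else 2000 + yy
    # Start at 1 January and step forward to the requested day of the year.
    return datetime(year, 1, 1, hour, minute) + timedelta(days=day_of_year - 1)


def format_time_string(when):
    """Convert a datetime back to the assumed 'YYJJJHHMM' layout."""
    day_of_year = when.timetuple().tm_yday
    return f"{when.year % 100:02d}{day_of_year:03d}{when.hour:02d}{when.minute:02d}"
```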
Optional API Calls
The optional API subroutines are discussed below and summarized in Table 4.
Subroutine MSETWIN is used for selecting a specific time window within which to return the observations. The input arguments for this routine include the beginning and end of the time window (in minutes relative to the nominal year/day/hour time specified in the M*STA call), and also a code number to indicate how multiple reports from a single station within the time window should be handled. Allowable codes specify that either all reports should be returned (within the time window, or within the file), or only a single report should be returned: the one closest in time to the nominal time, to the beginning of the time window, or to the end of the time window.
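The report-selection options can be illustrated with a short Python sketch. It does not use the MSETWIN code numbers, which are not listed here; the enumeration names, the inclusive window boundaries, and the tie-breaking rule (the earlier report in the input wins) are assumptions. Only the "within the time window" form of the return-everything option is modeled.

```python
from enum import Enum


class Keep(Enum):
    ALL = "all"
    CLOSEST_TO_NOMINAL = "nominal"
    CLOSEST_TO_START = "start"
    CLOSEST_TO_END = "end"


def select_reports(reports, start_min, end_min, keep):
    """Pick reports inside a time window around the nominal time.

    reports   -- list of (station, offset_minutes) tuples, where the offset
                 is relative to the nominal time (negative means earlier)
    start_min -- window start in minutes relative to nominal (inclusive)
    end_min   -- window end in minutes relative to nominal (inclusive)
    keep      -- how to handle several reports from one station
    """
    in_window = [r for r in reports if start_min <= r[1] <= end_min]
    if keep is Keep.ALL:
        return in_window
    target = {
        Keep.CLOSEST_TO_NOMINAL: 0,
        Keep.CLOSEST_TO_START: start_min,
        Keep.CLOSEST_TO_END: end_min,
    }[keep]
    best = {}  # station -> offset of the report closest to the target
    for station, offset in in_window:
        if station not in best or abs(offset - target) < abs(best[station] - target):
            best[station] = offset
    return list(best.items())
```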
Subroutine MSETDOM provides arguments to specify the domain within which to return the observations. Domains can be specified either in terms of latitude and longitude corners, or in terms of a grid box length and projection. Projections supported in the initial version of MADIS are polar stereographic and Lambert conformal conic, with the Mercator projection planned for inclusion in the next version. In the case of a projection specification, the MADIS API will automatically rotate all wind observations to be true to the specified grid, and also provides the capability to obtain the (i,j) position within the grid for each station location.
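To show what rotating winds to a grid involves, here is a minimal Python sketch of a commonly used rotation for Northern Hemisphere polar stereographic and Lambert conformal grids. It is not taken from the MADIS source: the function name and sign convention are assumptions, and the cone constant must be supplied by the caller (1.0 for polar stereographic, or the sine of the tangent latitude for a tangent Lambert conformal projection).

```python
import math


def rotate_wind_to_grid(u_earth, v_earth, lon_deg, central_lon_deg, cone=1.0):
    """Rotate earth-relative wind components to grid-relative components.

    u_earth, v_earth -- eastward and northward wind components
    lon_deg          -- longitude of the observation (degrees east)
    central_lon_deg  -- longitude parallel to the grid's y axis (degrees east)
    cone             -- cone constant of the projection (assumed input)
    Assumes a Northern Hemisphere grid.
    """
    # Wrap the longitude difference into the range [-180, 180).
    dlon = (lon_deg - central_lon_deg + 180.0) % 360.0 - 180.0
    angle = cone * math.radians(dlon)
    u_grid = u_earth * math.cos(angle) - v_earth * math.sin(angle)
    v_grid = u_earth * math.sin(angle) + v_earth * math.cos(angle)
    return u_grid, v_grid
```

The rotation changes only the direction of the wind vector, not its speed. At the central longitude the angle is zero and the components are returned unchanged; east of it, a wind blowing toward true north acquires a negative grid-relative u component because true north tilts toward the central meridian.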
Subroutine MSETQC allows the user to filter the observations based on QC results. Allowable options include the capability of returning only those observations that have passed a specific QC level (e.g., level 1 validity and position consistency checks), and an option to return only those observations that have passed the highest level of QC specified for that dataset and observation type. In all cases, the full QC structures are returned to provide information on exactly which QC checks were applied, which were passed, and which were failed. Also available in the QC structures is information indicating which radiosonde observations have been interpolated, and which have been hydrostatically corrected. Subroutine MSETCOR allows the user to activate or deactivate the QC corrections.
Subroutines MSET*PVDR, the final optional routines in this version of MADIS, allow the user to choose among specific meteorological surface (MSETSFCPVDR) or hydrological surface (MSETHYDROPVDR) datasets. One or more of the MADIS surface datasets may be chosen, or the user can choose to further stratify the data by choosing specific data providers within the datasets. For example, a MADIS user may choose to access all available METAR observations, or choose to access only those observations from the Automated Surface Observing System (ASOS) stations. Likewise, mesonet data from only certain providers may be specified. In all cases, MADIS returns surface data with the provider type information included.
Accessing MADIS Data and Software
Access to all MADIS documentation, software, and the FSL database is available through the MADIS Website at http://madis.noaa.gov/. The documentation for each dataset includes details such as the extent of geographic coverage, the volume of data, and the real-time schedule for the data. Also included are lists of all the variables, annotated with their units, processing and interpretation notes, and a list of the QC algorithms that have been applied. The QC documentation includes a detailed description of the algorithms used for each of the automated checks, the subjective intervention lists currently used at FSL, and the details needed to understand the QC data structure that accompanies the observations.
Subscriptions to the real-time FTP and LDM data feeds, as well as archived data, can be requested by filling out a data application form available on the Website. Since some of the data are proprietary, different distribution categories have been set up to handle restrictions on these datasets, which include some of the mesonets and the aircraft data. In general, no restrictions apply to government agencies supporting forecasting operations, or to research and educational organizations.
Source code for the API software, and precompiled binary versions for many platforms, can be downloaded from the Website. Binary versions are available for the systems on which the API has been tested. These supported systems include Linux, several different Unix platforms, and Windows. Instructions are also provided for building the API from the source code, if desired. Basically, the API has been designed to build and run on any Unix or Windows operating system. In addition, 13 hours of sample data files are available for each of the datasets. These are made available for anyone who wants to download the software and some data, then try out the system before requesting real-time or archived data from FSL.
Summary
With the goal of more accurate forecasts, FSL initiated the MADIS project to expand availability of value-added observations to the greater meteorological community. The FSL MADIS database, freely available from FSL to interested parties in the meteorological community, provides reliable and easy-to-use access to real-time and archived datasets. The datasets supported in the initial version of MADIS include surface and maritime observations from a variety of different networks, profiler winds, radiosonde soundings, and automated aircraft reports. Also available is the MADIS API, a software package that provides access to all of the observation and QC information in the FSL database, and to observations in other supported meteorological databases. Future plans include providing access to additional datasets and support for other meteorological databases.
Note: More information and a complete list of references on this and related topics are available at the main FSL Website, www.fsl.noaa.gov; under "Publications," click on "Research Articles."
(Michael Barth is a computer specialist in the Systems Development Division, and can be reached at e-mail firstname.lastname@example.org. Patricia Miller is Lead of the Scientific Applications Group, and can be reached at e-mail email@example.com.)