9.7.3 Data Archiving

Maintaining a complete and reliable data archive is an important component of a QAPP. Upper-air instruments, especially remote sensors, produce large volumes of both raw and reduced data, which can require several gigabytes or more of computer storage per site per year. A protocol for routinely archiving the data should be established.

Raw data are the most basic data elements from which the final data are produced. Archiving these data is important because the raw data may later need to be reprocessed to account for problems, errors, or calibrations. In addition, processing algorithms that extract more information from the raw data may become available in the future. Raw data are generally stored on-site and should be archived as part of the operational checks. Data should be stored on convenient and reliable archive media such as diskette, tape, or optical disk. The primary archive should be stored in a central repository at the agency responsible for collecting the data. A second copy of the raw data should be made and stored off-site so that the data can be recovered if the primary archive becomes corrupted or destroyed.
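
The two-copy approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`sha256_of`, `archive_raw_file`) and the directory layout are hypothetical, and the use of SHA-256 checksums to confirm each copy is a choice made for this example, not a requirement of this guidance.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_raw_file(source: Path, primary_dir: Path, backup_dir: Path) -> str:
    """Copy one raw data file to a primary and a backup archive directory.

    Both directory arguments are assumptions of this sketch; in practice the
    backup directory would sit on media that are moved or mounted off-site.
    Raises OSError if a copy does not match the checksum of the source file.
    """
    expected = sha256_of(source)
    for archive_dir in (primary_dir, backup_dir):
        archive_dir.mkdir(parents=True, exist_ok=True)
        target = archive_dir / source.name
        shutil.copy2(source, target)  # copy2 also tries to keep timestamps
        if sha256_of(target) != expected:
            raise OSError(f"checksum mismatch after copying to {target}")
    return expected
```

The returned checksum could be recorded in the site log so that later copies of the same file can be compared against it.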

Reduced data, which are created from the raw data by averaging, interpolating, or other processing methods, should also be archived. Reduced data include hourly averaged winds and temperatures from remote sensors, and vertically averaged winds and thermodynamic data from radiosonde sounding systems. Data validation is performed on the reduced data to identify and flag erroneous and questionable data. Both the reduced and validated data should be archived routinely (e.g., weekly or monthly) onto digital media, with one copy stored on-site and a second copy stored off-site.

Supporting information such as the following should also be archived along with the data:

  • Site and maintenance logs
  • Audit and calibration reports
  • Site information
  • Log of changes made to the data and the data quality control codes
  • Information that future users would need to decode, understand, and use the data
  • Surface measurements and other relevant weather data
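
One way to confirm that an archive package carries the supporting information listed above is a simple completeness check. The sketch below is illustrative only: the category keys, the `package` dictionary layout, and the function name `missing_categories` are all hypothetical names introduced for this example.

```python
# Hypothetical category keys, one per item in the list above.
SUPPORTING_CATEGORIES = (
    "site_and_maintenance_logs",
    "audit_and_calibration_reports",
    "site_information",
    "data_change_log",
    "decoding_documentation",
    "surface_and_other_weather_data",
)


def missing_categories(package: dict) -> list:
    """Return the categories for which the package lists no files.

    `package` maps a category key to a list of file names (an assumed
    layout). A missing key or an empty list both count as missing.
    """
    return [name for name in SUPPORTING_CATEGORIES if not package.get(name)]
```

A package would be considered ready to archive when the function returns an empty list.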

Data should be retained indefinitely because they are often used for modeling and analysis many years after their collection. The integrity of the archive media should be checked periodically to ensure that the data remain readable and have not become corrupted. Data should be refreshed by transferring them from old to new media approximately every 5 to 10 years. If an archive is scheduled to be eliminated, potential users should be notified beforehand so that any important or useful information can be extracted or saved.
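
A periodic integrity check of this kind could be automated by recording a checksum for each archived file and comparing against that record later, including after a transfer to new media. The following is a minimal sketch, assuming a plain-text manifest stored alongside the data; the manifest file name `MANIFEST.sha256` and the function names are hypothetical, and SHA-256 is simply the checksum chosen for this example.

```python
import hashlib
from pathlib import Path

MANIFEST_NAME = "MANIFEST.sha256"  # hypothetical file name for this sketch


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(archive_dir: Path) -> Path:
    """Record one 'digest  relative/path' line per file in the archive."""
    manifest = archive_dir / MANIFEST_NAME
    lines = []
    for path in sorted(archive_dir.rglob("*")):
        if path.is_file() and path != manifest:
            relative = path.relative_to(archive_dir).as_posix()
            lines.append(f"{file_digest(path)}  {relative}")
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def check_archive(archive_dir: Path) -> list:
    """Return relative paths that are missing or whose contents changed."""
    problems = []
    manifest = archive_dir / MANIFEST_NAME
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        recorded, relative = line.split("  ", 1)
        target = archive_dir / relative
        if not target.is_file() or file_digest(target) != recorded:
            problems.append(relative)
    return problems
```

Under these assumptions, `write_manifest` would be run once when the archive is created, and `check_archive` would be run at each periodic check and again after the data are copied to new media. An empty result indicates that every recorded file is present and unchanged.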

