Overview of ARM Archive User Interface

(2/5/98; revised 09/04/2009 Stefanie Hall)

The ARM Archive contains more than 2,200,000 user-accessible data files formatted in more than 2000 types of data streams. (The total data volume of the Archive is more than 6,500,000 files and 38 terabytes.) The user interface for the Archive is designed to help identify the specific ARM data files that should be retrieved for a data user's request, without requiring the user to go through numerous, very long lists of obscure filenames. The magnitude of the ARM data collection requires that data be stored in a Mass Storage System (MSS: a collection of computers and automated tape libraries containing thousands of tape cartridges). Because the data files are not 'on-line', the user interface processes 'directory' information from an on-line database to identify the availability of data files. A schematic of the Archive can be seen here. Secondary processing by the Archive computers copies requested data files from the MSS to an accessible FTP site. Users are notified by e-mail when all requested files are available at the FTP site. Access to the data files is complete when the user has copied the files via FTP to their own system. Processing of requests greater than 25 GB (~10,000 files) is suspended until the Archive staff confirm the availability of online storage.
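As a rough illustration of the size rule at the end of the paragraph above, the following Python sketch checks whether a request would be held for staff confirmation. The function name and the use of decimal gigabytes (10^9 bytes) are assumptions made for this example; the 25 GB and 10,000-file figures are the ones quoted in this document, and the Archive's actual accounting may differ.

```python
GB = 10**9  # assumption: decimal gigabytes; the Archive may count differently

MAX_BYTES = 25 * GB   # size limit quoted in the overview
MAX_FILES = 10_000    # file-count limit quoted alongside it


def needs_staff_confirmation(file_sizes):
    """Return True if a request exceeds either limit and would be suspended."""
    sizes = list(file_sizes)
    return sum(sizes) > MAX_BYTES or len(sizes) > MAX_FILES


# Two hypothetical requests made of 2.5 MB files.
print(needs_staff_confirmation([2_500_000] * 4_000))   # 10 GB in 4,000 files
print(needs_staff_confirmation([2_500_000] * 12_000))  # 30 GB in 12,000 files
```

A check like this is only useful for planning; the authoritative estimate of files and bytes is the one displayed by each interface before a request is submitted.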

The following sections provide additional information on:

  • The computing capabilities needed to access the Archive and use ARM data
  • A logical overview of the Archive User Interface
  • User Interface choices
    • Data Browser Interface
    • Catalog Interface
    • Thumbnail Browser
    • Statistical Browser
    • IOP Data Browser

Presumed computing capabilities by Archive interface users

Users of the ARM Archive interface and retrieved data files are presumed to have the following computing capabilities:

  • A WWW browser (the interface is designed and tested for Netscape 4.0 or higher; other web browsers appear to be okay as well)
    • required to view the user interface
    • helpful for accessing ARM documentation (www.arm.gov/)
  • An e-mail address (required for retrieval notifications)
  • tools for Internet file transfer by way of FTP
    • very large requests can be transmitted by tape (contact armarchive@ornl.gov or call 1-888-ARM-DATA for assistance)
  • system acceptance of long filenames (ARM filenames range from ~20-64 characters)
  • netCDF or HDF tools
    • compilers [C or Fortran] for incorporating public domain subroutines into user written software or
    • commercial applications for analyzing netCDF (e.g., IDL, MATLAB)

Logical Flow of the User Interface

The logic of the user interface includes the following steps:

  • Login to interface
    • This step enables the interface to track your request specifications and notify you when your files are retrieved.
    • specify your username or email address, if you have previously registered
    • register a username, if you are a new user
      • we need to know an e-mail address for notification of successful file retrievals.
      • name, address, and phone number also provide important information for contacting you and characterizing the ARM data user community.
  • Review request status or specify new request
  • Select Interface type
    • Data Browser Interface
      • Specify files to be requested with exact specifications for site, date range, instrument or measurement type, and facility.
    • Catalog Interface
      • Browse tables of data availability summarized by location, year, instrument type, etc. and select data in monthly increments
    • Thumbnail Browser
      • Browse daily thumbnails and quicklooks of files with specifications for site, date range, instrument, and measurement.
    • Statistical Browser
      • Browse a series of drill-down statistical graphs for showcase datasets with the option to extract more statistical information or order ARM data files.
    • IOP Data Browser
      • Review Intensive Operational Period (IOP) data stored in an online, documented directory tree and download files individually or build collections of files as a TAR file.
  • Select ARM data
    • enter query specifications in data browser interface
    • select entries from the catalog interface
    • download or "check" items in IOP data browser
  • Review data selection results and submit retrieval request
    • Each interface displays an estimate of the number of files and bytes contained in the request
  • Review
  • Specify additional requests or logoff the interface
    • This is the end of an interactive session with the user interface
    • Users are notified by e-mail when the requested files are accessible from online storage.
  • A secondary computer program supervises the copying of the requested files from the Mass Storage System to the user accessible FTP storage.
    • Requests greater than 25 GB (sum of file sizes) or 10,000 files are suspended until the availability of FTP storage is confirmed by Archive staff.
    • The time required to complete the retrieval of files from the MSS depends on:
      • The number of files requested (e.g., >5000 files may require a few hours to complete)
      • The number of other requests pending in the retrieval 'queue'.
  • Review data notifications
    • description of data quality report system
    • request for credit and publications
  • Use FTP to download data files (follow link in notification message)
    • connect to ftp.archive.arm.gov
    • enter username: armguest
    • enter email address as FTP password
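The FTP steps at the end of this list can also be scripted. The sketch below uses Python's standard ftplib module with the host and guest username given above. The function names and the remote_dir argument are hypothetical: the actual directory comes from the link in the notification message. The sketch also assumes that the directory holds only plain files.

```python
from ftplib import FTP
from pathlib import Path

ARCHIVE_HOST = "ftp.archive.arm.gov"  # host named in the steps above
GUEST_USER = "armguest"               # shared guest account from the steps above


def download_directory(ftp, remote_dir, local_dir):
    """Copy every file in remote_dir to local_dir; return the local paths."""
    target = Path(local_dir)
    target.mkdir(parents=True, exist_ok=True)
    ftp.cwd(remote_dir)
    saved = []
    for name in ftp.nlst():  # assumption: the directory contains only files
        path = target / name
        with open(path, "wb") as handle:
            ftp.retrbinary(f"RETR {name}", handle.write)
        saved.append(path)
    return saved


def fetch_request(email, remote_dir, local_dir):
    """Log in as the guest user (e-mail as password) and fetch one directory."""
    with FTP(ARCHIVE_HOST) as ftp:
        ftp.login(user=GUEST_USER, passwd=email)
        return download_directory(ftp, remote_dir, local_dir)
```

Calling fetch_request with your e-mail address, the directory from the notification message, and a local folder would perform the same steps as an interactive FTP session.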

User Interface choices

The Archive provides five online user interfaces for specifying the files that a data user needs to access. The user interfaces accomplish the same function (facilitating user access to the data files), but support complementary solutions to finding the files that you want from the 5,000,000+ files stored in the Archive. Summary descriptions of the user interfaces are:

  • Data Browser Interface
    • Identifies available data files from exact specifications of site, date range, instrument or measurement type, facility, etc.
    • The Data Browser Interface provides an overview of ARM data quality. It displays daily quality color (green, yellow, red) for user specified subsets of sites, facilities, measurements and date ranges.
    • The Data Browser can also provide detailed information about Data Quality Reports and quick looks for user specified search criteria
       
  • Catalog Interface
    • Supports browsing of summary tables (by combinations of year, site, data source, etc.) about file availability and the specification of data requests in one month increments
  • Thumbnail Browser
    • Displays specified files in a thumbnail format for browsing and viewing quicklooks in greater detail.
  • Statistical Browser
    • Allows users to view statistical plots of showcase datasets, and then drill down through time scales ranging from the full period of record to individual months.
    • Provides the option to extract the data behind the statistical graphs, obtain the measurements used to calculate the statistics, and order the ARM data files from which the measurements were obtained.
  • IOP Data Browser
    • Provides access to IOP data stored in an online, documented directory tree.

More information about these interfaces is provided in the sections below. Requests for assistance with data can also be submitted to the Archive User Services (email: armarchive@ornl.gov or phone 1-888-ARM-DATA or 1-865-241-4851).

Data Browser Interface

The requested data files are identified from a query to an online database representing the 'directory' of available files. Requested files are typically identified from queries related to site, time, instrument or measurement or data stream, and facility. Besides ordering files, users can view data quality information (such as Data Quality Reports, the Data Quality Color Calendar, and Quick Looks) for the selected data streams and date ranges. The queries for user-defined selections of files are based on the following three logical pathways:

  1) Novice Interface (Show Figure):

  • Site:
    • data must be selected from one geographic site per request
  • Date range:
    • starting and ending dates for the query must be specified
    • This is explicitly specified by the user by way of month, day, and year selections for each date criterion.
  • Search Path:
    • Instruments or Measurements
  • Instruments or Measurements Category:
    • One or more categories can be selected
  • Instruments or Measurements:
    • one or more Instruments or Measurements can be selected within the selected category
  • Facilities:
    • A list of facilities is displayed based on the previous selection criteria (specific to site, date range, category, and instruments or measurements)
    • One or more facilities can be selected from all the available facilities
  • Files to order:
    • A list of files is displayed based on the selected search criteria

  2) Datastream Interface (Show Figure):

        [Datastream Interface is equivalent to the data streams options found in the previous Power User application].

  • Site:
    • data must be selected from one geographic site per request
  • Date range:
    • starting and ending dates for the query must be specified
    • This is explicitly specified by the user by way of month, day, and year selections for each date criterion.
  • Data Streams:
    • A list of available data streams, based on the selected site and date range, is displayed
  • Files to order:
    • A list of files is displayed based on the selected search criteria
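The selection logic of the Datastream Interface (one site, an inclusive date range, then one or more data streams) can be pictured as a filter over the file 'directory'. The Python sketch below is only an illustration: the FileRecord fields, the stream names, and the filenames are placeholders invented for the example, not the Archive's actual database schema or naming codes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileRecord:
    """One row of a hypothetical file 'directory' table."""
    site: str
    datastream: str
    day: date
    filename: str


def find_files(directory, site, start, end, datastreams):
    """Return records for one site, an inclusive date range, and chosen data streams."""
    if start > end:
        raise ValueError("start date must not be after end date")
    wanted = set(datastreams)
    return [
        rec for rec in directory
        if rec.site == site and start <= rec.day <= end and rec.datastream in wanted
    ]


# Placeholder records; only the site codes appear elsewhere in this document.
directory = [
    FileRecord("SGP", "streamA", date(2009, 1, 1), "fileA-20090101"),
    FileRecord("SGP", "streamA", date(2009, 2, 1), "fileA-20090201"),
    FileRecord("SGP", "streamB", date(2009, 1, 1), "fileB-20090101"),
    FileRecord("NSA", "streamA", date(2009, 1, 1), "fileA-nsa-20090101"),
]
hits = find_files(directory, "SGP", date(2009, 1, 1), date(2009, 1, 31), ["streamA"])
print([rec.filename for rec in hits])
```

The Novice Interface follows the same pattern, with additional filters for category, instrument or measurement, and facility applied before the file list is shown.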

Additional information about these query options is provided in the table below.
For a step-by-step tutorial on using the Data Browser, click here.

Query options compared by type of logic, user efficiency, user actions, and limitations:

Novice Interface: Instrument
  • Type of logic: indirect; background filtering of the potential data stream list from user selected criteria for site, date range, instrument categories, instruments, facilities and highest data level
  • User efficiency:
    • high: when searching for data from specific instruments
    • low: when selecting data for a diversity of instruments
  • User actions: selects site, date range, instrument categories, instruments, and facilities
  • Limitations:
    • lengthy list of instrument names
    • presumes knowledge of the instrument's measurement capabilities

Novice Interface: Measurement
  • Type of logic: secondary, indirect; background filtering of the potential data stream list from user selected criteria for site, date range, measurement categories, measurements, facilities and highest data level
  • User efficiency:
    • high: when searching for many possible variations of a measurement type
    • low: when searching for a diverse set of unrelated measurements
  • User actions: selects site, date range, measurement categories, measurements, and facilities
  • Limitations:
    • lengthy list of measurement names
    • availability of measurements is confounded by site, date, facility, and data level criteria

Datastream Interface: Data streams
  • Type of logic: indirect; background filtering of the potential data stream list from user selected criteria for site, date range
  • User efficiency:
    • high: when searching for a few specific data stream types
    • low: when selecting a diversity of data stream types
  • User actions: selects site, date range, data stream names
  • Limitations:
    • presumes a working knowledge of ARM data stream name codes
    • requires scrolling a VERY long list of data stream names

 

Catalog Interface

The catalog-based user interface presents, in an interactive sequence of tables, a hierarchical summary of available data files (see Figure 1) organized in a way that will be useful to the inexperienced as well as the expert Archive user. In addition to leading the user to specify a subset of data, the catalog is also intended to display the availability of the data. The availability of data is irregular in time and space because of incremental changes in the installation and operation of the field sites (points of data generation). Each table cell shows the quantity of available data (number of files) within the criteria represented by that cell. Criteria combinations for which data are available contain cell values greater than 0 and are linked to the next subset levels. Combinations containing no data display '0' and are not linked.

The navigation catalog metadata is combined with a "Data Cart" concept for collecting file sets of particular interest. At any level, the user may view the contents of the data cart, remove file sets from the data cart, or submit the list for retrieval from the Archive.

Description of the Interface

The ARM Archive catalog interface consists of two major components: 1) a catalog of available data files organized in a four level hierarchy, and 2) a data cart collection scheme that allows the user to store, edit and display a list of selected file sets. The interface programs display a sequence of linked HTML tables that allow the user to move through the various catalog levels, converging to desired sets of files. The hierarchy includes links to tables for increasingly narrow subsets of the data collection (see Figure 1). Selecting a value in each table leads to a table showing more detail in the next step. After the fourth step, data may be selected for addition to the Data Cart. The section below describes the user interface at each of these levels.
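To make the table logic concrete, here is a minimal Python sketch of one catalog level: it counts files per (site, year) cell and marks nonzero cells as links while leaving empty cells as a plain 0. The record layout, the function names, and the use of square brackets to stand for a hyperlink are all assumptions for illustration; the actual interface generates linked HTML tables.

```python
from collections import Counter


def count_by(records, row_key, col_key):
    """Count files for every (row, column) combination present in records."""
    return Counter((rec[row_key], rec[col_key]) for rec in records)


def render_table(counts, rows, cols):
    """Return text lines; nonzero cells are bracketed to mark them as links."""
    lines = ["\t".join(["", *map(str, cols)])]
    for row in rows:
        cells = []
        for col in cols:
            n = counts.get((row, col), 0)
            cells.append(f"[{n}]" if n > 0 else "0")
        lines.append("\t".join([str(row), *cells]))
    return lines


# Four hypothetical file records, one dictionary per file.
records = [
    {"site": "SGP", "year": 2008},
    {"site": "SGP", "year": 2008},
    {"site": "SGP", "year": 2009},
    {"site": "NSA", "year": 2009},
]
counts = count_by(records, "site", "year")
for line in render_table(counts, rows=["NSA", "SGP"], cols=[2008, 2009]):
    print(line)
```

The same two functions could serve the lower levels by first filtering the records to the cell that was clicked and then choosing a different pair of keys, for example instrument category and facility type.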

Instructions

For a step-by-step tutorial on using the Catalog Browser Interface, click here.

Step 1: Selecting the Site and Year
Following a login screen, the top level of the interface presents the number of files available in the Archive grouped by site and year (Figure 2). The user selects a site and year by clicking on the corresponding number of files in the table, assuming the number is nonzero.

Step 2: Selecting the Instrument Category and Facility Type
This selection takes the user to the second level, Figure 3, which displays all instrument categories and types of facilities from which ARM data were collected for the site and year chosen on the previous page. From this level an instrument category and facility type are chosen by clicking on the number of files in the appropriate cell of the table. Alternatively, the user may return to level 1 (to change the previous selection) by clicking on "Year" or "Site" at the top of the page.

Step 3: Selecting the Instrument and Data Level
The third level (Figure 4) lists the number of available files by instrument code and data level, for the previously selected combination of site, year, instrument category and facility type. The data level reflects the amount of processing done on raw data. Instrument and data level codes are briefly described below the table. Again, options are available to return to levels 1 or 2 via links at the top of the page.

Step 4: Selecting the Facility and Month
The final level (Figure 5) in the hierarchy of metadata attributes allows the user to select file sets by facility and month, or return to one of the previous three levels.

Step 5: Adding Files to the Data Cart
After the selection of facility and month (by clicking on a nonzero number of files in the table), the user is presented with a summary of the selections (Figure 6) together with the number and total size of the data files. At this point the user may elect to add these files to the data cart, return to any of the previous interface levels to edit selections, or continue browsing. Adding the set of files to the data cart returns the user to the original catalog interface, with the selected data added to "Current Selections" (Figure 7). The user may then continue browsing and adding data selections to the Data Cart. Each time, the chosen datastream will be added under "Current Selections." To remove a data selection, highlight the selection and click "Remove Selected Streams."

Step 6: Ordering Selected Data Files
When the user is satisfied with a collection of file sets, clicking "Proceed to Order" will bring up the selected data. The user may then elect to "Select All" files, choose only certain files for ordering, or extract measurements (Figure 8). Clicking "Order Files" will submit the user's request for the selected files. An "Order Confirmation" will then be displayed (Figure 9).
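The Data Cart behavior described in Steps 5 and 6 (add a file set, remove a selection, review the number and total size of files before ordering) can be modeled with a small container. This Python sketch is an assumption-laden illustration: the class and field names, labels, and sizes are invented for the example and do not describe the Archive's implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FileSet:
    """A facility/month selection; all field values here are illustrative."""
    label: str
    n_files: int
    n_bytes: int


@dataclass
class DataCart:
    selections: list = field(default_factory=list)

    def add(self, file_set):
        if file_set not in self.selections:  # ignore duplicate selections
            self.selections.append(file_set)

    def remove(self, label):
        self.selections = [s for s in self.selections if s.label != label]

    def summary(self):
        """Return (number of files, total bytes) across the cart."""
        return (sum(s.n_files for s in self.selections),
                sum(s.n_bytes for s in self.selections))


cart = DataCart()
cart.add(FileSet("SGP 2009-01 set A", 31, 62_000_000))
cart.add(FileSet("SGP 2009-02 set A", 28, 56_000_000))
cart.remove("SGP 2009-01 set A")
print(cart.summary())
```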

Summary and Discussion

The catalog interface enables the ARM researcher to efficiently identify files of interest, determine the existence of data, and collect sets of data prior to submitting a retrieval request. Important aspects of the system described here include the assignment of descriptive instrument categories and the dynamic explanation of instrument codes. Collection of data sets is currently done at the facility/month level. The collection (data cart) may be listed and edited from any level.



Thumbnail Browser

Background

In November 2004, the Thumbnail Browser Interface joined the existing suite of user tools for finding and selecting data on the ARM Data Archive website. This relatively new interface provides users with a graphical view of the data files before they decide to request or download them for additional use. The Thumbnail Browser also offers the following advantages:
  • The graphical images are small enough that they can be stored online and dynamically displayed for each user.
  • The time needed to communicate the images within the user interface is reasonable.
  • The data plots show features in the data that are easily recognized visually, but very difficult to represent in other forms.

Description

    The first segment of the user interface enables the user to specify general criteria for "data of interest". The initial pages of the Thumbnail Browser are very similar to those of the Data Browser (select sites, facilities, date range, etc.). The user is provided three pathways to make these initial selections: Novice Interface, Power Interface, and Catalog Interface. These pathways are logically similar to the pathways in the Data Browser. Further steps in the Thumbnail Browser allow the user to view thumbnails of plots containing many of the primary measurements from ARM data streams. From the thumbnail views, the users can directly access larger-scale data plots called "Quick Looks", review Data Quality Reports, or select data files to be requested for retrieval and download. The selection of data files is made with graphical check boxes that allow selection by day or by multiple days for a single data stream or selection of all files for all datastreams.

    (Click here to see an example of the Thumbnail Browser results page)

    The Thumbnail Browser provides the user with many options for customizing the thumbnail display within two thematic views:
  • "Date range" view - This has an appearance similar to a "multi-channel strip chart". It includes options to re-arrange the row of thumbnails and to set the width of the display interval (days per display). (Link to more documentation).
  • "Day at a time" view - This is designed for concurrently showing hundreds of thumbnails for the same date (Link to more documentation). The user can customize the row and column placement (More Info) of thumbnails.

    Both of these views allow the user to easily view consecutive time ranges or days in a sequence. These thumbnail views may be saved for future reference or emailed as a link to other users to show them the same "views" and make additional specifications for their data requests.

    Instructions

    For a step-by-step tutorial on using the Thumbnail Browser Interface, click here.


    Statistical Browser

    Background and Description

    The Statistical Browser (also referred to as "statistical views") is the newest interface to be developed, and currently consists of pre-computed products for nested time ranges (whole period of record, annual, seasonal, and monthly, as appropriate). For each time range and measurement, a variety of simple statistics are computed. Graphs of the statistical distribution of measurements (e.g., histograms) are also linked to the actual statistics displayed in the graphs. The graphs are available through a web-based interface. Users select a location and measurement and then drill down through time scales ranging from the full period of record to individual months. In addition to viewing graphs displayed by the user interface, users are able to extract the data behind the statistical graphs, obtain the measurements that were used in calculating the statistics, and order the ARM data files from which the measurements were obtained. This interface currently contains statistical views for showcase datasets.
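The idea of pre-computed statistics over nested time ranges can be sketched in a few lines of Python. The example below groups (date, value) samples by an arbitrary key and computes count, minimum, mean, and maximum; using a constant key gives the whole period of record, the year gives annual summaries, and (year, month) gives monthly ones. The function name, the choice of statistics, and the sample values are assumptions for illustration; the source does not list which statistics the Archive computes, and seasonal grouping is omitted here.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def summarize(samples, key):
    """Group (day, value) samples by key(day) and compute simple statistics."""
    groups = defaultdict(list)
    for day, value in samples:
        groups[key(day)].append(value)
    return {
        k: {"n": len(v), "min": min(v), "mean": mean(v), "max": max(v)}
        for k, v in sorted(groups.items())
    }


# Made-up measurement values, one per day.
samples = [
    (date(2008, 12, 31), 1.0),
    (date(2009, 1, 1), 2.0),
    (date(2009, 1, 2), 4.0),
    (date(2009, 2, 1), 6.0),
]
whole_record = summarize(samples, key=lambda d: "all")
annual = summarize(samples, key=lambda d: d.year)
monthly = summarize(samples, key=lambda d: (d.year, d.month))
print(whole_record["all"]["n"], annual[2009]["mean"], monthly[(2009, 1)]["max"])
```

Drilling down then amounts to looking up a narrower key: from whole_record to annual[2009] to monthly[(2009, 1)].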

    Instructions

    Step 1: Begin by selecting an ARM site for which to view available statistical plots and summaries. Current ARM sites available in the Statistical Browser are: SGP, NSA, TWP, and HFE.

    Step 2: Next, select a dataset. Current showcase datasets available are:

    • ARM Surface Radiation Data (qcrad1long)
    • Climate Modeling Best Estimate Data (CMBE)
    • Long-Term Continuous Forcing Data from Variational Analysis (CONSTRVARANA)

    Step 3: Select a facility from the list of those available.

    Step 4: Select a measurement from the list of those available. This will display the available plot types for the selected measurement.

    Step 5: Select a plot type from the list of those available. Plot types will vary based on dataset and measurement, ranging from daily to monthly to seasonal plots. Users can mouse-over the image in parentheses to see a description of each plot type. Clicking on the image will bring up a sample of that particular plot type.

    Step 6: Select a date range by entering start month/year and end month/year, and click on "Get Plots" to view the plots.

    Step 7: The plots will be displayed below the interface in thumbnail form. Users may click on any thumbnail image to view the detailed data plot and utilize additional features for accessing the data. These features are:

    • Get Statistics: Available in Text, Excel, and XML formats. After choosing a format, the statistics will be displayed.
    • Get Data for the Selected Range: Available in Text, Compressed Text (gz), Excel, and NetCDF formats. Once the format is chosen, a "Download Data" screen will appear while the request is being processed. This may take a few seconds. When the download is complete, follow the URL given to download the extracted measurement data.
    • Get ARM Data Files: Select "Add to Cart" or "List Files" to order the ARM Data files chosen by the user.
    Users must log in with their email address or Archive User ID to access these features. Some features are still under development. For a comprehensive list of known issues and future developments, click here.




    IOP Data Browser

    Background

    Intensive Operational Periods (IOPs) generate data that are "non-routine" because they originate from extra or guest data sources. The data may also be "non-routine" because the instruments are operated with temporary, experimental (non-production) protocols. All of these exceptions from normal operations cause significant "clutter" in the metadata and logic used in the query and catalog interfaces. Constraining the structure of the IOP data to follow the simple logic required to successfully manage the 5,000,000+ ARM data files challenged the creativity of the ARM data managers and frustrated the IOP data generators (who are often guest collaborators with ARM and are not, or should not be, fully indoctrinated with ARM-specific data management practices). The IOP Data Browser is also used for storage and access of reference data sets (e.g., geographic overlays of states, rivers, etc. for satellite images) and special data (e.g., preliminary versions of VAP output).

    The IOP Data Browser was designed to provide the following features:

    • It presents enough structure so that potential data users can follow an understandable path to identify and access the IOP data sets.
    • It allows for considerable flexibility in how the data are structured within an IOP.
      • Minimal rules about names for sub-directories and files within each IOP
        • Every subdirectory has a "readme" explaining its contents. The specifications for the readme are minimal, but links to more extensive web-based documentation are allowed.
      • Minimal expectations for IOP data to follow similar naming or documentation
    • It enables users to select and download a few individual files or a few individual sub-directories
    • It enables the Archive to track "who accessed which data when" for reporting and update notification purposes.

    Description

    The IOP Data Browser contains a documented, online directory tree of IOP data. The IOP data are organized in a hierarchy of year / site / IOP / instrument - PI subdirectories. Additional subdirectories may be used within an IOP. Each subdirectory has a "readme" file to guide the user through that level's information. Data from IOPs may be downloaded as individual files by clicking on each file link. If the user needs to download large portions of IOP data (multiple files or subdirectories), a "check box system" (described in the outline below) can be used to select files and directories to be built into a single TAR file for download. The TAR file is created after the end of an IOP browsing session, and the user is notified by email when the TAR file is ready to download.

    The IOP Data Browser presents a 3 section display:

    • The top section displays the contents of the readme for the current subdirectory.
      • This readme may link to additional information at other web sites generated or referenced by the IOP participants.
        • The primary ARM documentation about IOPs is contained in a series of web pages located at: www.arm.gov/campaigns
        • The ARM documentation has a directory structure that is similar to the one used for the IOP data
      • Other web sites may be visited without losing your place in the IOP data structure.
    • The middle section shows a traditional browser-based directory and file list that can be used to navigate the data collection.
      • The top of this section shows the current directory path for "where am I".
      • The main portion of this section lists directories and files within the current directory.
        • Users may click on directory links to navigate to lower levels.
        • Users may click on a file link to open or download individual files.
          • For some formats (e.g., netCDF), other information about the data files may be displayed.
      • Very large data files (e.g., cloud radar, WSI, etc.) may be stored in the Mass Storage System of the Archive.
        • The readme information for these files will include information on how to find these IOP data in the Archive.
      • Each directory or file link displayed has a "check box" on the left side to select data to be added to a TAR file.
        • Clicking the check box for a file will add the file to a TAR file.
        • Clicking the check box for a directory will add the entire contents of the directory (including the contents of lower subdirectories and files) to the TAR file.
        • After sub-trees of the directory have been "checked", lower level files and sub-directories may be unchecked as needed to specify the exact collection of IOP data to be included in the TAR file
    • The bottom section shows information and options about the TAR file being specified for downloading multiple files and directories
      • Lists of included directories are displayed (and can be removed as needed)
      • Lists of excluded directories are displayed (and can be removed as needed)
      • Option for "zipping" the TAR file can be selected
      • Control buttons for submitting the request for TAR construction are located in this section.
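The check-box logic in the outline above (checking a directory includes everything beneath it, and unchecking a lower-level item removes it again) maps naturally onto an include list and an exclude list. The Python sketch below uses the standard tarfile module to build such a collection from a local directory tree. The function names are hypothetical, paths are given relative to an assumed root directory, and gzip compression is used here as a stand-in for the "zipping" option, whose actual format the source does not specify.

```python
import tarfile
from pathlib import Path


def select_files(root, included, excluded=()):
    """Expand checked paths to files, then drop anything under an unchecked path."""
    root = Path(root)
    skip = [root / e for e in excluded]
    chosen = set()
    for item in included:
        path = root / item
        candidates = path.rglob("*") if path.is_dir() else [path]
        for cand in candidates:
            if not cand.is_file():
                continue
            if any(cand == s or s in cand.parents for s in skip):
                continue
            chosen.add(cand)
    return sorted(chosen)


def build_tar(root, included, excluded, tar_path, compress=False):
    """Write the selected files to tar_path and return their archive names."""
    root = Path(root)
    mode = "w:gz" if compress else "w"  # assumption: gzip models the zip option
    files = select_files(root, included, excluded)
    with tarfile.open(tar_path, mode) as tar:
        for path in files:
            tar.add(path, arcname=path.relative_to(root).as_posix())
    return [path.relative_to(root).as_posix() for path in files]
```

For example, build_tar(root, ["a"], ["a/sub/y.dat"], "out.tar") would archive every file under directory a except a/sub/y.dat, mirroring a checked directory with one unchecked file.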

    Access and login to the IOP Data Browser

    The IOP Data Browser can be accessed after a login to the Archive User Interface, or it can be accessed directly at iop.archive.arm.gov/arm-iop/. (The IOP Data Browser can also be accessed from links located throughout ARM IOP documentation; see the web page located under www.arm.gov/campaigns.) All attempts to access the IOP Data Browser will request a web login requiring the entry of a username and password. The user should enter their Archive account name for BOTH the username and password. Although this login appears to be redundant, it enables the Archive to record the user access of each file. The records of access are important for distributing notifications about future updates to IOP data and reporting statistics on the usage of IOP data.
