Executive summary
SCD provides a world-class supercomputing facility and a complete
set of integrated support services for researchers in the atmospheric,
oceanic, and related sciences. SCD's facilities support researchers
around the world as well as at consortium member universities and at
NCAR and UCAR. SCD supports the development and execution of large,
long-running numerical simulations and the archiving, manipulation,
and analysis of extremely large datasets.
SCD's
mission specifies how this work
is structured, and it defines the management units of SCD's internal
organization. This Annual Scientific Report describes the
accomplishments of each SCD management unit.
This executive summary reports SCD's FY2003 progress in each
area defined by SCD's FY2003 program plan:
- Computing resources
- Mass Storage System
- Operations and infrastructure
- High-speed networking and data communications
- Computing user support services
- Visualization and enabling technologies
- Research data support and services
- Computational science, networking, and visualization research
and development
- Climate Simulation Laboratory (CSL) for large, long-running
simulations of climate
Computing resources
The production supercomputer environment managed by SCD for
NCAR evolved from serial codes running on single processors to
codes that harness the power of multiple CPUs in cluster systems.
In FY2003, SCD more than quadrupled the capacity of these cluster
systems by completing Phase II of the Advanced Research Computing
System (ARCS) and a late-FY2003 augmentation to that system.
A complete new IBM Cluster 1600 system, named bluesky, was
added to the SCD computational environment in early FY2003. As
delivered, bluesky had a total of 1,216 POWER4 processors and a peak
computational rate of 6.323 teraflops. Late in FY2003, bluesky was
augmented with 12 32-way p690 SMP frames, adding 384 more POWER4
processors and raising the system's total peak computational rate
to 8.32 teraflops. The augmentation also increased bluesky's total
disk capacity to 31 TB. Bluesky now comprises 50 POWER4 p690 SMP
frames, making it the single largest system of its type in the
world. The addition of the bluesky system in FY2003 doubled both
Community Computing and Climate Simulation Laboratory resources.
Initially, the 12 additional bluesky frames will be used to
increase the resource capacity available to the CCSM IPCC activity.
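These peak figures are consistent with the POWER4's per-processor
peak (a quick check, assuming the 1.3 GHz clock of this POWER4
generation and its four floating-point operations per cycle):

    $1.3\,\mathrm{GHz} \times 4\,\mathrm{flops/cycle} = 5.2\,\mathrm{Gflops}$ per processor
    $1216 \times 5.2\,\mathrm{Gflops} \approx 6.32\,\mathrm{Tflops}$ (initial system)
    $1600 \times 5.2\,\mathrm{Gflops} = 8.32\,\mathrm{Tflops}$ (augmented system)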
Other accomplishments in FY2003:
- Acquired two 32-way p690 SMP frames in late FY2003 for the
evaluation of Phase III of the ARCS plan
- Monitored and maintained hardware and software at high levels
of utilization for seven computational servers in the CSL computing
environment
- Monitored and maintained hardware and software at high levels
of utilization for five computational servers in the Community
Computing production environment
See the full FY2003 report at
High performance computing.
Mass Storage System
The NCAR Mass Storage System (MSS) is a large-scale data archive
that stores data used and generated by climate models and other
programs executed on NCAR's supercomputers and compute servers. At
the end of FY2003, the NCAR MSS managed more than 20.3 million files
containing more than 1.5 petabytes (PB; 1,502 terabytes) in total, of
which over 880 terabytes (TB) were unique, and the net growth rate of
data in the MSS was approximately 27 TB per month. SCD faces an
increasing demand to archive data from ever-faster supercomputers.
The new Storage Manager (STMGR) disk cache component (internal
disk cache) was placed into production as a replacement for the
aging IBM 3390 diskfarm. In FY2003, STMGR provided a 2.7x increase
in storage capacity and a 10x increase in aggregate data transfer
rate. STMGR is most notable for its potential to provide more
than a 220x increase in storage capacity, a 33x increase in
aggregate data transfer rate, and the ability to buffer files of
all sizes. Further, it will permit newly written files to reside
in the cache longer, reducing tape mounts and tape I/O.
To aid capacity planning and performance tuning of the MSS, a
simulator that includes all the major hardware and software
components of the MSS was developed in 2003. The simulator enables
the MSS group to consider different design alternatives for new
software and hardware components and estimate how the different
designs will perform before the components are added to the actual
system. Simulation studies were conducted in 2003 using an earlier
version of this simulator (that only simulated the disk cache
component of the MSS) to aid in configuring and sizing the STMGR
disk cache system.
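A minimal sketch of such a disk-cache-only simulator follows; it
assumes a simple LRU replacement policy and a synthetic, skewed
request stream, and is a hypothetical stand-in for the far more
detailed MSS simulator (all names here are our own):

    # Minimal disk-cache simulator sketch: estimate the hit rate of a
    # cache of a given size under LRU replacement. A hypothetical
    # stand-in for the much more detailed MSS simulator described above.
    from collections import OrderedDict
    import random

    def simulate_lru(requests, cache_bytes):
        """requests: iterable of (file_id, size_bytes); returns hit rate."""
        cache = OrderedDict()           # file_id -> size, in LRU order
        used = hits = total = 0
        for fid, size in requests:
            total += 1
            if fid in cache:
                hits += 1
                cache.move_to_end(fid)  # mark as most recently used
            else:
                while cache and used + size > cache_bytes:
                    _, evicted = cache.popitem(last=False)  # evict LRU file
                    used -= evicted
                if size <= cache_bytes:
                    cache[fid] = size
                    used += size
        return hits / total if total else 0.0

    # Compare candidate cache sizes on a synthetic, skewed workload.
    random.seed(0)
    files = [(i, random.randint(1, 4) * 10**9) for i in range(1000)]  # 1-4 GB
    weights = [1.0 / (rank + 1) for rank in range(1000)]  # a few hot files
    stream = random.choices(files, weights=weights, k=20000)
    for tb in (1, 5, 10):
        print(tb, "TB cache:", round(simulate_lru(stream, tb * 10**12), 3))

Runs of this kind, scaled up and fed with traces of real MSS traffic,
are the sort of question the production simulator answers before new
hardware is configured or purchased.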
Simulator output was also combined with MSS warehouse
information to help measure the effectiveness of external data
caches, which were deployed to avoid re-reading data from the MSS
and thus to keep the data archive from being misused as a file server. The
external cache deployment resulted in as much as a 60% drop in
such re-reads.
Other accomplishments in FY2003:
- Deployed beta-test versions of web-based tools to help users
manage their MSS holdings
- Completed decommissioning the IBM 3490 drives and media
- Completed decommissioning the Redwood drives and media
- Continued research and development of external disk cache
systems
See the full FY2003 report at
Data archiving and management
system: The MSS.
Operations and infrastructure
The Operations and Infrastructure Section (OIS) contributes the
infrastructure necessary to support and operate the computers,
networks, and services that are integral to SCD's mission. In
addition to physical infrastructure such as electrical distribution,
cooling, and 7x24 operational oversight of the facility, OIS also
contributes to the software infrastructure. Examples include the
development and maintenance of the division's problem-tracking
system, the room reservation system, and more recently the
SCD portal.
Planning for the second-phase ARCS system bluesky culminated in
FY2003 with its smooth commissioning. Plans began to add 14 more
compute nodes to bluesky. This addition placed more heat stress on
the computing infrastructure, as each node produces up to 42,000
BTU/hr. For the first time, SCD began quantitatively measuring and
recording changes in electrical consumption based on the workload
of the supercomputers. The SCD Portal, a web-based entry point to
SCD computing resources, was released to all users. By the end of
FY2003, more than 200 fixed assets were being tracked, tagged,
recorded in a database, and delivered for configuration and
deployment. This process has greatly increased the accuracy and
detail of SCD's asset information.
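The scale of the added heat load is easy to bound (a rough estimate,
assuming all 14 nodes run at the quoted maximum and using the
standard 12,000 BTU/hr per ton of refrigeration):

    $14 \times 42{,}000\,\mathrm{BTU/hr} = 588{,}000\,\mathrm{BTU/hr} \approx 172\,\mathrm{kW}$ of heat
    $588{,}000 \div 12{,}000\,\mathrm{BTU/hr\ per\ ton} = 49$ tons of additional cooling load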
Other accomplishments in FY2003:
- Beyond the large IBM installations, OIS managed the installation
and removal of numerous other servers and equipment housed in the
computing center
- In support of the IBM installations, OIS completed the
installation of additional cooling equipment and addressed single
points of failure to provide a more robust infrastructure
- Provided equipment and infrastructure to support SCD staff,
including small and large computers, printers, environmental
control equipment, test equipment, and other services
- Transitioned operations staff schedule to a format that
allows them to work on projects, attend training, and interact
with other SCD staff
- Began construction of a facility for housing two 1.2-Megawatt
backup power generators
- Completed conversion of the 3490E tape media; this has
reduced manual tape mounts by 80% and the need for temporary
staff
- Managed SCD's business continuity plan for recovering critical
functions after a catastrophic event
See the full FY2003 report at
Computing center operations
and infrastructure.
High-speed networking and data communications
Networking is essential to UCAR's
ability to function and prosper in a rapidly evolving technological
environment. Networking capabilities fundamentally support UCAR's
goals for the advancement of science, technology, and education.
Primary accomplishments in FY2003 included deploying new
telecommunications and networking systems for all UCAR staff,
providing network support for the expansion to the Center Green
campus, and participating in multiple national network research
projects.
Other accomplishments in FY2003:
- Deployed Voice over IP (VoIP) telecommunications throughout UCAR
- Rebuilt the LAN infrastructure throughout the buildings in the new
Center Green campus to bring it into compliance with UCAR standards
- Replaced the OC-3 connection between NCAR and the Front Range
GigaPop (FRGP) with Gigabit Ethernet on dark fiber
- Produced multiple proposals, including one to connect UCAR to the
TeraGrid network
- Participated in networking research projects described in the
Research section of this executive summary
- Provided engineering and support for the FRGP consortium
- Led participation in National LambdaRail, a consortium of leading
U.S. research universities and private-sector technology companies
seeking to develop a new networking infrastructure for all forms of
education and research
- Contributed to SCD's business continuity plan for recovering critical
functions after a catastrophic event
See the full FY2003 report at
Network engineering and
telecommunications.
Computing user support services
SCD's User Support Section (USS) provides leading-edge software
expertise for the climate, atmospheric, and oceanic research
communities to facilitate their high-performance computing endeavors
at NCAR. The section provides a variety of services to users, both
local and remote, that enable them to pursue their research within
SCD's "end-to-end" high-performance computing environment. USS
provides users with focused support services tailored to their
specific needs.
USS also oversees the process for allocating supercomputing
resources for the university and NCAR communities, as well as for
the Climate Simulation Laboratory (CSL). FY2003 achievements
were made in five areas: CSL, database management, user consulting,
infrastructure support, and digital information.
In FY2003, USS helped users become productive on bluesky, a new
1,312-processor IBM POWER4 cluster system, through code testing,
documentation, code-scaling assistance, and problem resolution.
USS supported two User Forums featuring 20 invited speakers, and
responded to the needs voiced by participants by creating new
products and services and revising existing ones. Under the
direction of the NCAR/UCAR Computer Security Advisory Committee,
computer security was continually improved, and users were
guided in mitigating the impact of security issues. Also in FY2003,
USS staff continued to increase the number of high-availability
clusters that provide critical infrastructure services for users
and UCAR staff.
Other accomplishments in FY2003 include:
- Implemented accounting charges based on the wall-clock time
a job runs rather than CPU time, which more accurately reflects
the computational resources held by individual jobs (see the
sketch following this list)
- Streamlined the process for users to manage their MSS files
- Vetted more than 30 software changes to the supercomputers by
testing functionality and results before and after the
modifications
- Installed a Lightweight Directory Access Protocol (LDAP)
database in the UCAR mail relay system
- Developed a formal proposal for evaluating user software
requests in order to respond to them more quickly
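The following minimal sketch illustrates the accounting change in
the first item above; the formulas and charge factor are hypothetical
illustrations for this sketch, not SCD's actual GAU accounting rules:

    # Hypothetical illustration of wall-clock vs. CPU-time charging.
    def charge_wallclock(wall_hours, nodes, cpus_per_node, factor=1.0):
        # Bills for every processor a job holds for its full duration,
        # whether or not each CPU is kept busy.
        return wall_hours * nodes * cpus_per_node * factor

    def charge_cputime(cpu_hours, factor=1.0):
        # Bills only for CPU cycles actually consumed.
        return cpu_hours * factor

    # A poorly scaling job holds 32 CPUs for 10 hours but accumulates
    # only 200 CPU-hours of useful work:
    print(charge_wallclock(10, 1, 32))  # 320.0 -- reflects resources held
    print(charge_cputime(200))          # 200.0 -- understates the footprint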
See the full FY2003 report at
Assistance and support for
NCAR's research community.
Visualization and enabling technologies
SCD's Visualization and Enabling Technologies Section (VETS) has
a primary focus on advancing the knowledge development process.
Activities span the development and delivery of software tools for
analysis and visualization, advances in visualization and collaboration
environments, web engineering for all of UCAR, R&D endeavors in
collaboratories, the development of a new generation of data
management and access capabilities, Grid R&D, novel visualization
capabilities, and a sizable outreach effort.
In the Cyberinfrastructure (CI) Strategic Initiative, VETS
released a newly designed Community Data Portal (CDP) site with
powerful search and browse capabilities and published metadata for
hundreds of datasets from across our organization. We also made
substantial inroads on the other major component of the CI
initiative, the Web Outreach, Redesign, and Development (WORD)
effort, and we achieved a strong, UCAR-wide consensus on a new
strategy and design for our institutional web presence. In addition
to this internal initiative, we were also awarded new R&D contracts
in the areas of Grid computing and modeling (NASA) and advanced
visualization (NSF/ITR), and we continued to play a strong role
in the Unidata-led NSF/NSDL THREDDS project.
VETS made substantial progress across all of its areas of endeavor.
Some of the highlights include:
- Internal and external project funding enabled us to expand our
staff to 22 people this year, and along the way, we improved our
project management and tracking processes and launched a major
effort to improve our metrics gathering and reporting.
- Continued an aggressive outreach effort that included the
launching of the VizKids program, an extensive list of Vislab tours,
and a fine presence at the SC2002 conference -- while reducing
overall external events and the associated costs.
- Added a new full-time staff member to support collaboration
services, participated in the NCSA/Alliance Scientific Workspaces
of the Future (SWOF) effort, and partnered with Howard University
to place an AccessGrid node at their site.
- Executed a successful mid-term review of the Earth System Grid
(ESG) project with DOE and demonstrated an early multi-site Grid
for climate research.
- Advanced enterprise web services substantially, including a
large storage and server increment, a formal streaming video
service, a new visualization portal, new directory services, and
improved statistics and metric capabilities.
- Continued to see the NCAR Command Language (NCL) grow in
popularity and usage across the community and made additional
advances in the areas of remote visualization services,
Python-based data analysis and visualization, and new
multi-resolution data capabilities. We also engaged in a number
of planning meetings with various DOE groups aimed at
building larger community efforts in visualization and data
analysis.
See the full FY2003 report at
Visualization and enabling
technologies.
Research data support and services
The Data Support Section (DSS) maintains a large, organized
archive of computer-accessible research data that is made available
to scientists around the world. This 22.5-terabyte archive of 550
datasets is an irreplaceable store of observed data and analyses
that are used for major national and international atmospheric
and oceanic research projects.
We carried out a number of data development projects to add
more data to the archives. We added three sets of older surface
observations taken during the 1928-1973 period for the U.S. and
Canada. Progress continued on adding data to the surface ocean
dataset (I-COADS). We prepared to receive the ERA-40 reanalysis
(1957-2002) from ECMWF and the NCEP regional reanalysis
(1979-2002).
Much progress was made on the document project during FY2003.
The goal of this project is to assemble hardcopy information about
datasets and projects into documents and scan these for online
use. We also wrote many more data reports for inclusion. The
production of scanned documents started in March 2000, and by
October 2003 we had prepared 305 documents totaling 18,233 pages,
an increase of over 3,000 pages during FY2003.
Major progress was made in the data backup project. The goal
is to place backups of about 10% of our archives (the most
important data) in distant archives. In May 1999, we started with
40-GB-per-tape technology, and we moved to 110 GB per tape in
September 2002. We have completed backups for 1,733 GB of data,
leaving some of our most important observations still to be
backed up.
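In rough terms (assuming the 10% target is measured against the
current 22.5-TB archive):

    $0.10 \times 22.5\,\mathrm{TB} = 2.25\,\mathrm{TB} = 2{,}250\,\mathrm{GB}$ target
    $1{,}733\,\mathrm{GB} \div 2{,}250\,\mathrm{GB} \approx 77\%$ of the target completed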
Other accomplishments in FY2003:
- Added new datasets, concentrating on older observations,
updates of recent data, and oceanography data
- Increased the amount of data available via the web, especially
the reanalysis data
- Continued involvement in large projects such as reanalyses and
comprehensive international data-compilation efforts
- Continued supporting users by providing consulting services
from scientifically knowledgeable staff
- Carried out major improvements on the I-COADS marine surface
data collection, and continued similar work on other important
atmospheric datasets
- Continued saving legacy data still remaining on 7- and 9-track
reel tapes
- Acquired new tape storage systems to handle transfers that can
only be done using tape technology
- Updated reanalysis observations and numerous other research
data products on an ongoing basis
- Continued assembling seven important sets of global
observations for reanalysis and climate research
See the full FY2003 report at
Research data support
and services.
Research in computation, networking, and
visualization
Advanced research and development activities take place
across the division's sections and often include collaboration with
other NCAR/UCAR divisions and programs.
Computational science research
SCD's Computational Science Section (CSS)
helps realize the end-to-end scientific simulation environment
envisioned by the NCAR Strategic Plan. The mission of CSS is to
develop much of the critical software and intellectual infrastructure
needed to achieve the plan's ambitious goals. CSS tracks computer
technology, learns to extract performance from it, pioneers new and
efficient numerical methods, creates software frameworks to
facilitate scientific advancement -- particularly interdisciplinary
geoscience collaborations -- and shares the resulting software and
findings with the community through open-source software,
publications, talks, and websites.
The past year has been productive for CSS in many areas. Research
and development activities funded in FY2003 are up and running. The
section had nine papers published or accepted in peer-reviewed journals
in the last year, and has four more currently submitted for publication.
CSS has generated five posters or papers published in conference
proceedings. One patent application was initiated. CSS staff initiated
work under two new grants in FY2003. A critical DOE cooperative
agreement research grant (CCPP) was renewed for another year at full
funding. In terms of institutional impact, the three most significant
accomplishments by CSS in FY2003 are:
- The ESMF software development team in CSS released the first
prototype version of the Earth System Modeling Framework (ESMF 1.0)
software in May 2003. This early version of the software included a
coupled fluid flow demonstration, showing how the ESMF infrastructure
could be used to build a compressible fluid flow solver and how the
ESMF superstructure could be used to assemble multiple components
into an application. To date there have been 300+ downloads of the
framework software. The on-time delivery of proof-of-concept
framework software helped build community confidence in the ESMF
effort. It also demonstrated the testing, release, and support
capabilities of the ESMF team.
- CSS staff participated in an NCAR-DOE-IBM "tiger team" to tune
the performance of the Community Atmosphere Model (CAM) for the
upcoming IPCC runs. This collaboration resulted in 30% performance
improvements, which greatly enhanced the scientific throughput
capability of the NCAR POWER4 IBM cluster for this important series
of climate change experiments.
- CSS scientists successfully implemented a non-conforming spectral
element mesh refinement scheme and applied it to the shallow water
equations on the sphere. They demonstrated the utility of a scalable
Operator Integration Factor (OIF) scheme by using it to take large
time steps on such a refined mesh.
In addition to these important accomplishments, CSS has:
- Made significant advances in the long-term process of building a
computer science program within SCD by developing an increasingly
fruitful partnership between CSS and the University of Colorado
department of computer science.
- Made significant progress integrating the High-Order Multiscale
Modeling Environment (HOMME) dynamical core with CAM2 physics.
- Done important work developing space-filling curves for parallel
decompositions of spectral element atmospheric models (see the
illustrative sketch after this list).
- Collaborated in a successful effort with CU computer scientists
to improve the numerical and computational performance of the GMRES
algorithm.
- Developed and implemented (for the shallow water equations) a
new conservative advection scheme based on Discontinuous Galerkin
methods.
- Applied new mathematical methods to solve part of the highly
nonlinear elliptic problem of the solar Coronal Mass Ejections
(CMEs).
- Expanded the Spectral Toolkit to include SPHEREPACK
functionality needed to create basic, high performance
infrastructure for spectral analysis and synthesis.
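The space-filling-curve sketch promised above: it orders the cells
of a square grid along a Morton (Z-order) curve and cuts the curve
into equal contiguous chunks, one per processor. Morton is chosen
here for brevity; Hilbert-type curves, often used in practice,
preserve locality somewhat better. All names are illustrative, not
taken from the actual model code.

    # Minimal space-filling-curve partitioner sketch.
    def morton_index(x, y, bits):
        """Interleave the bits of (x, y) into a single curve position."""
        z = 0
        for b in range(bits):
            z |= ((x >> b) & 1) << (2 * b) | ((y >> b) & 1) << (2 * b + 1)
        return z

    def partition(n, nprocs):
        """Assign each cell of an n x n grid (n a power of 2) to a rank."""
        bits = n.bit_length() - 1
        cells = sorted(((x, y) for x in range(n) for y in range(n)),
                       key=lambda c: morton_index(c[0], c[1], bits))
        chunk = len(cells) / nprocs
        return {cell: min(int(i / chunk), nprocs - 1)
                for i, cell in enumerate(cells)}

    # 8x8 grid over 4 ranks: each rank receives one contiguous run of
    # the curve, so neighboring cells usually share a rank and halo
    # exchange traffic stays local.
    owners = partition(8, 4)
    print(owners[(0, 0)], owners[(7, 7)])  # opposite curve ends: 0 3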
Networking research
Over the last decade, we have witnessed a tremendous increase in raw
network capacity. Today, we are seeing the ubiquitous deployment of
cross-country 10 Gb/s optical networks and early standards efforts in
support of 40 Gb/s and 100 Gb/s networking technology. As a result,
"long fat pipes" (elephants) [RFC1323] are no longer a rarity -- in
fact, they are becoming the norm. For example, the NSF funded the
40 Gb/s Distributed Terascale Facility (DTF) [RB02], UCAID has announced
plans for the 10 Gb/s Abilene Network of the Future [Int02], and
contractual negotiations are near completion on a dark fiber lambda
network called National LambdaRail. However, it is not clear
whether these network capacity increases will actually result in
comparable increases in application performance, due to a number of
specific underlying technical issues and limitations.
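The central obstacle is the bandwidth-delay product (BDP): a TCP
sender can keep such a pipe full only if its window covers all the
data in flight. As a worked example, assuming a typical 60 ms
coast-to-coast round-trip time on a 10 Gb/s path:

    $\mathrm{BDP} = 10\,\mathrm{Gb/s} \times 0.06\,\mathrm{s} = 600\,\mathrm{Mb} \approx 75\,\mathrm{MB}$ in flight
    without RFC1323 window scaling, a 64-KB window caps throughput at
    $64\,\mathrm{KB} \times 8 \div 0.06\,\mathrm{s} \approx 8.7\,\mathrm{Mb/s}$, under 0.1% of the pipe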
For UCAR scientists to benefit from these faster networks, it is
critical that SCD stay active in network research activities to
influence the latest technology and make it available to users.
NETS successfully continued its work on the Web100 and
Net100 research projects. Additionally, NETS secured a Cisco grant
to continue related work and an NSF STI award for ongoing network
performance research.
Ongoing networking research initiatives include:
- Continue work on the Web100 collaboration
- Continue work on the Net100 collaboration
- Collaborate with PSC on Cisco University Research Program
funding -- Investigating Large Maximum Transmission Units (MTU)
- Collaborate with PSC on the NSF Strategic Technologies for the
Internet proposal -- Effective Diagnostic Strategies for Wide Area
Networks
Visualization and enabling technologies research
SCD engages in a number of research efforts aimed at advancing
our ability to manage, access, analyze, and visualize scientific
data. Our efforts span novel visualization techniques and
algorithms, Grid technologies, access methods and geoscientific
metadata, and applications of advanced visualization in the
educational arena. SCD recruited a new postdoctoral researcher
who will be studying the applications of advanced visualization
techniques in undergraduate geoscience curricula. FY2003 highlights
include:
- Continued participation in the Unidata-led NSF National Science
Digital Library (NSDL) THREDDS (Thematic Real-time Environmental
Distributed Data Services) project.
- Continued to advance the Earth System Grid (ESG) project, a
DOE-funded collaboration of several DOE centers and NCAR. We
executed a successful mid-term review of the project with DOE,
demonstrated an early multi-site Grid for climate research, and
published a paper on ontologies and semantic approaches to
metadata related to ESG.
- Secured a three-year NASA grant to collaborate with CGD and
the University of Colorado to develop a distributed, Grid-based
environment for biogeochemical modeling, data management, and
analysis.
- Secured a five-year NSF Information Technology (ITR) research
grant to collaborate with U.C. Davis to explore new approaches to
visualizing time-varying scientific data.
- Explored new wavelet-based multiresolution approaches to handling
very large rectilinear gridded data and developed a prototype
toolkit for this endeavor called mtk (see the illustrative sketch
after this list).
- Developed a proposal to join the NSF Extensible Terascale
Facility (ETF) with partnerships among NCAR, the University of Utah,
the University of Colorado, and Colorado State University. This
proposal was favorably reviewed but not funded. This was a
cross-sectional effort across VETS, NETS, and CSS.
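The wavelet sketch promised above: one level of a Haar
decomposition, the building block of a multiresolution hierarchy.
This is an illustrative stand-in, not the mtk interface; mtk itself
targets large three-dimensional rectilinear grids.

    # One Haar analysis step: split a signal into a half-resolution
    # approximation plus the detail needed to reconstruct it exactly.
    # Applied recursively, this yields a multiresolution hierarchy, so
    # huge gridded fields can be browsed coarsely and refined on demand.
    import numpy as np

    def haar_step(signal):
        """One analysis step; len(signal) must be even."""
        pairs = signal.reshape(-1, 2)
        approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)  # coarse copy
        detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)  # what was lost
        return approx, detail

    def haar_inverse(approx, detail):
        """Exact reconstruction from one analysis step."""
        out = np.empty(2 * len(approx))
        out[0::2] = (approx + detail) / np.sqrt(2.0)
        out[1::2] = (approx - detail) / np.sqrt(2.0)
        return out

    x = np.sin(np.linspace(0.0, 8.0 * np.pi, 1024))
    coarse, detail = haar_step(x)                        # 512-point version
    assert np.allclose(haar_inverse(coarse, detail), x)  # lossless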
See the full FY2003 report at
Computational science
research and development.
Education and outreach
SCD has a vibrant education and outreach program. We
collaborate with UCAR's Education and Outreach program,
complementing and supporting its endeavors. We also
represent our program of scientific supercomputing at
conferences and other events, and we support an aggressive
and busy schedule of Visualization Lab demonstrations and
presentations for a wide variety of visiting audiences, ranging
from scientific to educational. Lastly, SCD provides
training and promotes other activities that support the use of our
services and the proliferation of advanced technologies in the
university community.
SCD's Visualization and Enabling Technologies Section continued
a very strong outreach program, providing dozens of presentations
in our Visualization Lab. We also spun up a new program, informally
called VizKids, where UCAR's Public Visitors Program (PVP) prepares
and delivers highly visual presentations to visiting educational
groups, and the results thus far have been very positive. Through
teamwork, we are able to accommodate a much greater number of
visitors with only a modest impact on SCD technical staff. We also
engaged in an outreach activity to provide Howard University in
Washington, D.C. with an AccessGrid node for their atmospheric
sciences program. We had a strong presence at the SC2002 conference
and showed off a new design scheme in our exhibit, one that
emphasized our computing, visualization, and research efforts as
well as our sponsorship by NSF (and other agencies, to a lesser
degree). SC2002 was our only formal exhibit in FY2003, which
reflected the implementation of our strategic plan to reduce our
exhibit participation in conferences in favor of more technical
R&D and a growing, stronger presence presenting and publishing
papers. This new direction recouped a substantial amount of
high-level staff time.
User Support organized three multi-day and four single-day
seminars to help supercomputer users understand the background
and learn optimal techniques for making productive use of the
IBM SP-cluster systems. One staff member has a teaching appointment
at the Colorado School of Mines, Golden, Colorado, as a member of
the Colorado Higher Education Computing Organization.
See the full FY2003 report at
Education and outreach.
Climate Simulation Laboratory (CSL) facilities
for large, long-running simulations
NCAR has established a dedicated climate model computing
facility in support of the multiagency U.S. Global Change
Research Program (USGCRP). The Climate Simulation
Laboratory (CSL) is administered by the National Science
Foundation, and the CSL computational facilities are housed,
operated, and maintained by SCD.
The purpose of the CSL is to provide high performance
computing and data storage systems to support large-scale,
long-running simulations of the earth's climate system
(defined as the coupled atmosphere, oceans, land and
cryosphere, and associated biogeochemistry and ecology, on
time scales of seasons to centuries), including appropriate
model components, that need to be completed in a short
calendar period. A large simulation is one that typically
requires thousands of processor hours for its completion and
usually produces many gigabytes of model output that must be
archived for analysis and intercomparison with other
simulations and with observations.
During FY2003, CSL projects used a total of 905,567 General
Accounting Units (GAUs) on CSL-allocated machines. The largest
project was the Community Climate System Model (CCSM), which used
over 600,000 GAUs. Computer resources were provided to 328
universities and U.S. non-profit research organizations. The largest
single request granted was 60,000 GAUs. A total of more than 535,000
GAUs were allocated to universities during FY2003. SCD's User
Support section provided personalized contact for researchers, which
involved assistance with user accounts, computing, data management,
consulting help, computing infrastructure services, and web-based
user documentation.
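In round numbers, the figures above imply that CCSM alone consumed
about two-thirds of FY2003 CSL usage:

    $600{,}000 \div 905{,}567 \approx 66\%$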
Other accomplishments in FY2003:
- More than doubled the total computing capacity available to
CSL users
- Prepared for new computer-room cooling capacity scheduled for
installation in October 2003
- Provided CSL user support in the areas of consulting,
visualization, research data, and model development assistance