Special Resources at Lawrence Berkeley National Laboratory (LBNL)
NERSC (the National Energy Research Scientific Computing Center) at LBNL is one of the largest open, unclassified computing facilities for basic science in the country. Operating NERSC has enabled LBNL to acquire unsurpassed expertise in providing comprehensive scientific support that enables researchers to make the most productive use of these resources. NERSC supports more than 2,900 users nationally and internationally, over 50% of whom are from universities. NERSC is known worldwide for the quality of its computing services, and its success is measured by the scientific productivity of its users. Selected CSE projects will be able to access the following resources.
Franklin, a Cray XT4 supercomputer built from dual-core AMD processors running at 2.6 GHz, will have 19,344 compute CPUs with 2 GB (gigabytes) of memory per CPU (over 38.6 TB (trillion bytes) altogether) and a peak performance of 101.5 teraflop/s (trillion calculations per second). The system will have a bisection bandwidth of 6.3 TB/s and 402 TB of usable disk.
PDSF, a 764-processor Linux cluster with over 300 TB of disk storage, is used by the high energy physics community for data-intensive analysis and simulations. PDSF has been in production longer than any other Linux cluster in the world and has undergone several hardware upgrades since it went online in 1998.
HPSS mass storage. The NERSC HPSS system archives 1.5 petabytes (PB) of data in 45 million files, and will have a capacity of 16 PB by the end of 2005. HPSS sustains an average transfer rate of more than 100 MB/s, 24 hours per day, with peaks to 450 MB/s, to and from NERSC computational systems and to and from sources outside NERSC, such as scientific experiments.
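The headline figures for Franklin and HPSS above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming decimal units (1 TB = 1,000 GB, 1 PB = 10^9 MB); the variable names are illustrative and are not part of any NERSC tool.

```python
# Back-of-the-envelope checks on the NERSC figures quoted above.
# ASSUMPTION: decimal (SI) units throughout; the text does not say which
# convention it uses.

# Franklin: aggregate memory from the per-CPU memory.
franklin_cpus = 19_344
mem_per_cpu_gb = 2
franklin_mem_gb = franklin_cpus * mem_per_cpu_gb
franklin_mem_tb = franklin_mem_gb / 1_000

# HPSS: data moved per day at the quoted average transfer rate.
avg_rate_mb_s = 100
seconds_per_day = 24 * 60 * 60
hpss_daily_tb = avg_rate_mb_s * seconds_per_day / 1_000_000

# HPSS: mean archived file size implied by 1.5 PB in 45 million files.
archive_mb = 1.5 * 1_000_000_000
file_count = 45_000_000
mean_file_mb = archive_mb / file_count

print(f"Franklin aggregate memory: {franklin_mem_tb:.1f} TB")
print(f"HPSS volume per day at 100 MB/s: {hpss_daily_tb:.2f} TB")
print(f"HPSS mean file size: {mean_file_mb:.1f} MB")
```

Under these assumptions the Franklin total comes to roughly 38.7 TB, consistent with the "over 38.6 TB" quoted above, and a sustained 100 MB/s corresponds to several terabytes moved through HPSS each day.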
Computational Research Division
The Computational Research Division (CRD) at Berkeley Lab is home to about 150 researchers in areas such as applied mathematics, computer science, and computational sciences. CRD comprises two departments, the High-Performance Computing Research Department and the Distributed Systems Department. It also manages several SciDAC (Scientific Discovery through Advanced Computing) programs. SciDAC is the first federally funded program that aims to implement the CSE vision of bringing applications scientists together with mathematicians and computer scientists to solve large computational science problems.
- The High-Performance Computing Research Department (HPCRD) addresses long-term research and development questions in HPC. With more than 125 staff and expertise in computer science, computational science, and applied mathematics, HPCRD can provide additional resources and talent for the advanced development needs of the new CSE center and for focused high-end support of the application areas.
- The Advanced Computing for Science Department (ACS) seeks to allow scientists to address complex and large-scale computing and data analysis problems beyond what is possible today. ACS is developing software components that will operate in a distributed environment. The department is addressing several research areas: (1) providing tools to facilitate Grid Services; (2) building prototype tools to explore models of informal collaboration among geographically remote researchers; (3) developing techniques to assure the security of such distributed applications and to provide site security while permitting access by a wide variety of remote applications; and (4) developing customized tools and support for specific scientific collaborations.
- SciDAC (Scientific Discovery through Advanced Computing) centers: Berkeley Lab is the leader of four SciDAC centers and participates in eighteen other SciDAC projects. The new CSE center will leverage the activities of these projects, in particular the Applied Partial Differential Equations Center (APDEC) and the Performance Evaluation Research Institute (PERI), two of SciDAC’s Integrated Software Infrastructure Centers.
The UCSD Triton Resource
The UCSD Triton Resource is a high-performance, data-centric compute resource housed at the San Diego Supercomputer Center. UC Berkeley faculty and researchers may apply for computing capacity on the Triton Resource through the Triton Affiliates and Partners Program (TAPP). The resource features exceptional data analysis capabilities and short wait times. Triton comprises three major components: the Triton Compute Cluster, the Petascale Data Analysis Facility, and the Data Oasis. Each offers unique and powerful capabilities designed to support data analysis needs on scale with today’s most demanding and data-intensive processing tasks.
Triton Compute Cluster (TCC). The TCC is a 256-node cluster with dual quad-core Intel Xeon 5500 (Nehalem) processors running at 2.4 GHz and 24 gigabytes of memory on each node. TCC has a peak theoretical performance of 20 teraFLOPS.
Petascale Data Analysis Facility (PDAF). The PDAF is a unique data-intensive computing system with nodes containing either 256 gigabytes or 512 gigabytes of main memory and eight AMD Opteron™ (Shanghai) processors (32 cores total) per node running at 2.5 GHz. PDAF has a peak theoretical performance of 9 teraFLOPS and 9 terabytes of main memory.
Data Oasis. Planned for construction in late Fall 2009, Data Oasis is a large-scale storage system with 2-4 petabytes of storage on 3000-6000 disks and 60-120 GB/s throughput.
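The 20 teraFLOPS peak quoted for the Triton Compute Cluster can be approximately reproduced from its node configuration. The sketch below assumes 4 double-precision floating-point operations per core per clock cycle, a figure that does not appear in the text above, and uses decimal units; the variable names are illustrative.

```python
# Theoretical peak of the Triton Compute Cluster from its quoted configuration.
nodes = 256
sockets_per_node = 2      # "dual" processors per node
cores_per_socket = 4      # quad-core
clock_ghz = 2.4
memory_per_node_gb = 24

# ASSUMPTION: 4 double-precision flops per core per cycle. This value is not
# given in the text; change it if a different figure applies.
flops_per_cycle = 4

total_cores = nodes * sockets_per_node * cores_per_socket
peak_gflops = total_cores * clock_ghz * flops_per_cycle
peak_tflops = peak_gflops / 1_000

# Aggregate main memory across the cluster, in decimal terabytes.
total_memory_tb = nodes * memory_per_node_gb / 1_000

print(f"TCC cores: {total_cores}")
print(f"TCC theoretical peak: {peak_tflops:.2f} teraFLOPS")
print(f"TCC aggregate memory: {total_memory_tb:.3f} TB")
```

With that assumed per-cycle figure the estimate rounds to the quoted 20 teraFLOPS. The estimate is a theoretical ceiling and says nothing about sustained application performance.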
The system fabric is supplied by a Myricom Myrinet Multiprotocol Switch with 448 MX ports, 32 ten-gigabit Ethernet ports and 32 expansion ports. This gives the resource an approximate worst-case MPI latency of 2.4 microseconds and an achievable 1.2 gigabytes per second per network interconnection. The Myrinet fabric is a full-bisection Clos-topology two-level network. As configured, the bisection bandwidth of the switch exceeds 500 gigabytes per second (four terabits per second). The Triton Compute Cluster and PDAF are connected to each other at full-bisection bandwidth.
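The bandwidth figures in the paragraph above can be related to one another with a short calculation. This sketch converts the quoted bisection bandwidth from gigabytes to terabits per second and, as a rough cross-check only, sums the achievable per-port rate over all MX ports. The text does not say how the bisection figure was derived, so the port sum is an assumption about a plausible upper bound rather than the vendor's method.

```python
# Unit conversion and a rough cross-check for the Myrinet fabric figures.
BITS_PER_BYTE = 8

# Quoted lower bound on bisection bandwidth, in gigabytes per second.
bisection_gbytes_s = 500
bisection_tbits_s = bisection_gbytes_s * BITS_PER_BYTE / 1_000

# ASSUMPTION: aggregate rate if every MX port ran at the quoted achievable
# per-interconnection rate. This is an illustrative bound, not a figure
# stated in the text.
mx_ports = 448
per_port_gbytes_s = 1.2
aggregate_port_gbytes_s = mx_ports * per_port_gbytes_s

print(f"Bisection bandwidth: {bisection_tbits_s:.1f} Tb/s")
print(f"Aggregate MX port rate: {aggregate_port_gbytes_s:.1f} GB/s")
```

The conversion confirms that 500 gigabytes per second equals four terabits per second, and the aggregate port rate under the stated assumption is of the same order as the quoted bisection bandwidth.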