High Performance Computing, GPUs, and Cloud: A Perfect Match?

January 26, 2011

Benchmark shows Cloud hardware as fast as native.

By Jason Stowe

February 10, 2011 | Guest Commentary | In 2010, the high performance and high throughput computing (HPC/HTC) communities saw two major technologies go mainstream: general-purpose computing on graphics processing units (GPGPU, or just GPUs) and Cloud computing.

According to the latest supercomputer rankings published by Top500.org, the first and third top-performing supercomputers in late 2010 are using GPUs for math-heavy calculations (see, p. 8). In parallel, momentum is building for life science researchers as many applications from molecular dynamics to genome sequence searching are now enabled to run on GPUs. According to Duncan Poole, senior manager of life science markets at NVIDIA, “Bioscience has been one of the most profoundly-affected segments by the parallel processing capabilities of GPUs. Many important codes in this segment, such as AMBER, map elegantly to the architecture, and achieve speed increases—in some cases many times faster than their earlier CPU-bound versions.”

2010 was a great year for HPC in the cloud as well. Last July, Amazon’s Elastic Compute Cloud (EC2) service began offering high-speed networking and high-performance CPUs with its Intel Nehalem-based Cluster Compute Instances. In our CycleCloud HPC Cluster as a Service offering, we’ve seen increased utilization by companies including Pfizer, Varian, Schrodinger, and other life science research organizations. GPU in the Cloud offerings debuted last summer from Peer1 Hosting, followed in November by Amazon with its Cluster GPU instance (CG1).

Access to on-demand HPC and GPU resources gives life science researchers serious options for running calculations in the cloud and on GPUs. The cloud model also reduces the capital expenditure for deploying this specialized equipment internally, and allows users to scale usage and cost to match needs. Even individual researchers can access a large 32-GPU cluster in the cloud for two hours when research calls for it and pay about $65 for the compute time.
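To make the cost figure concrete, here is a rough back-of-the-envelope sketch in Python. It takes only the numbers quoted above (32 GPUs, two hours, about $65) and assumes the cluster is built from dual-GPU instances like those used in the benchmark, billed purely per instance-hour with no storage or data transfer charges. The function name and the billing model are illustrative assumptions, not Amazon's actual price list.

```python
def implied_rate_per_instance_hour(total_cost, total_gpus, gpus_per_instance, hours):
    """Back out the hourly price per instance implied by a total bill.

    Assumes (hypothetically) that the bill is instance-hours times a flat rate.
    """
    if total_gpus % gpus_per_instance != 0:
        raise ValueError("total_gpus must be a multiple of gpus_per_instance")
    instances = total_gpus // gpus_per_instance
    instance_hours = instances * hours
    return total_cost / instance_hours


# Figures from the article: 32 GPUs, 2 hours, about $65.
# Assumption: 2 GPUs per instance, so 16 instances.
instances = 32 // 2
rate = implied_rate_per_instance_hour(65.0, 32, 2, 2)
per_gpu_hour = 65.0 / (32 * 2)

print(f"{instances} instances, about ${rate:.2f} per instance-hour")
print(f"about ${per_gpu_hour:.2f} per GPU-hour")
```

Under these assumptions the quoted figure works out to roughly two dollars per instance-hour, or about one dollar per GPU-hour.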

When it comes to the marriage of the Cloud and GPU trends, the question arises: “How good is the performance of a virtual Cloud GPU instance as opposed to internal, bare metal hardware?” To find out, we benchmarked the performance of GPU applications running on Amazon EC2’s CG1 instances compared to internal systems.

SHOC and Awe

We wanted to measure the impact of Cloud virtualization on the compute performance of the GPU, the bandwidth to and from the GPU, and the performance of specific applications like molecular dynamics (MD). Fortunately, one benchmark measures all of these: SHOC (the Scalable HeterOgeneous Computing benchmark suite), created by Jeremy Meredith’s team in the Future Technologies group at Oak Ridge National Laboratory.

We set out to compare SHOC results on non-virtualized, in-house hardware against results on EC2 Cluster Compute hardware with GPU acceleration, each using dual NVIDIA 2050-class GPUs. We created a CycleCloud cluster containing three GPU machines with the benchmark pre-installed, ran the benchmark separately on different instances, and compared the median results against the values from internal hardware.
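The comparison step can be sketched in a few lines of Python: take the median of the cloud runs for each metric and express it relative to the in-house value. This is a minimal illustration, not the scripts we used. The metric names and every number below are hypothetical placeholders, not measured SHOC output.

```python
from statistics import median


def relative_difference(cloud_runs, native_value):
    """Return (median of cloud runs - native value) / native value."""
    if native_value == 0:
        raise ValueError("native_value must be non-zero")
    return (median(cloud_runs) - native_value) / native_value


# Hypothetical placeholder values, one entry per cloud instance.
# These are NOT results from the benchmark described in this article.
cloud_results = {
    "compute_metric": [101.0, 99.0, 100.0],
    "bandwidth_metric": [5.3, 5.2, 5.1],
}
native_results = {
    "compute_metric": 100.0,
    "bandwidth_metric": 5.0,
}

differences = {
    name: relative_difference(runs, native_results[name])
    for name, runs in cloud_results.items()
}

for name, diff in differences.items():
    print(f"{name}: cloud median differs from native by {diff:+.1%}")
```

Using the median rather than the mean keeps a single unusually slow or fast instance from skewing the comparison.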

The results were impressive. On the molecular dynamics benchmark, native and Cloud hardware performed effectively the same. The only differences came from bandwidth characteristics, which favored the cloud GPU instances. In raw computational power, the maximum floating-point operations per second (FLOPS) were equivalent on native and cloud machines.

Cloud and GPUs are important trends in HPC, as both offer faster time to results than alternatives. It is important to quantify the difference in performance between nodes deployed internally and those in the Cloud. The SHOC benchmark results show that individual node performance for GPUs in the cloud is comparable to that of GPUs deployed internally, apart from slight differences in the bandwidth to and from the GPUs.

From an HPC perspective, GPUs may be a very silver lining to cloud adoption, especially for life science applications like molecular modeling and genome sequence searching.

Jason Stowe is CEO of Cycle Computing. For further information and results from the SHOC benchmark, please visit http://blog.cyclecomputing.com


This article also appeared in the January-February 2011 issue of Bio-IT World Magazine.