DDN Survey Finds Private Cloud Growth, Mixed Flash Use Among HPC Users
By Allison Proffitt
November 21, 2016 | Earlier this month, DataDirect Networks (DDN) announced the results of its annual High Performance Computing (HPC) Trends survey. The company surveyed 143 end users responsible for high performance computing, networking, and storage systems at financial services, government, higher education, life sciences, manufacturing, national laboratory, and oil and gas organizations.
DDN found that end users in the world’s most data-intensive environments are increasing their use of cloud, but opting for private and hybrid clouds over public ones and choosing to upgrade specific parts of their environments with flash. Managing mixed I/O performance and rapid data growth remain the biggest challenges driving these infrastructure changes for HPC organizations.
The cloud findings were some of the most interesting to DDN, a storage provider offering flash and object storage. “In general for the high-performance computing crowd… it’s mostly an issue of cost and speed of access,” explained Michael King, senior director of marketing at DDN. After deciding to move data to the cloud, King said, HPC users often make two realizations.
“The two things they found were that what they thought they were going to move, they couldn’t, because they just couldn’t get to it fast enough. So they had to pare down the amount of stuff they were truly going to archive out to the cloud, because of the inability to get it back in a timely fashion. And then number two… though it is a very flexible spending resource (you can add and remove, and as you add and remove, the cost goes up and down, unlike purchased hardware), that monthly recurring charge was starting to be much bigger than they thought it was going to be… It goes on forever, and there’s never any depreciation,” King said.
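King’s depreciation point is easy to see with back-of-the-envelope numbers. The sketch below uses entirely hypothetical prices (no figures come from the survey) to show how a “flexible” monthly charge can quietly overtake a one-time hardware purchase:

```python
# Illustrative arithmetic only (all prices hypothetical): cumulative cloud
# storage spend vs. a one-time hardware purchase that depreciates.

hardware_cost = 300_000        # assumed one-time purchase for ~1 PB usable
cloud_per_pb_month = 20_000    # assumed monthly charge for ~1 PB archived

# First month in which cumulative cloud spend exceeds the purchase price.
months = next(m for m in range(1, 121) if m * cloud_per_pb_month > hardware_cost)
print(f"Recurring cloud charges exceed the purchase price after {months} months")
# -> 16 months here; after that, the recurring charge keeps running forever,
#    while purchased hardware would at least depreciate as an asset.
```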
Survey respondents planning to leverage a cloud for at least part of their data in 2017 rose to 37%, up almost 10 percentage points year-over-year. Of those, more than 80% are choosing private or hybrid clouds versus a public cloud option.
For King, a “private” or “hybrid” cloud would ideally mean large pools of object storage that people can use as a consolidated center of their working environment. In practice, he admitted, private clouds are generally file-based solutions, possibly making use of older hardware.
But King is hopeful for a future shift to on-site, object storage-based private clouds. “In high performance computing, you’re still going to have a tier one, RAID-based disk system of some sort, maybe a flash-based system, that’s really the high-performance workhorse,” King said. “But as things start to age, you’re going to move them into your private cloud. More and more it’s starting to mean seamless integration of object storage.”
DDN’s pitch is a parallel file system that automatically migrates from a file-based solution to an object-based solution. “The user doesn’t even know,” King said. “If the user asks for [the data] back again, it appears to be in the file system. It pulls back a little bit slower than the things that are actually on the tier 1 storage, but there’s no work to be done.”
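Conceptually, the transparent tiering King describes can be sketched as a thin layer over two stores: writes land on the fast file tier, a policy migrates aging data to object storage, and reads recall migrated data behind the same namespace. The following is a minimal illustration with hypothetical names, not DDN’s actual implementation:

```python
# Minimal sketch of transparent file-to-object tiering. All names are
# hypothetical; this illustrates the concept, not DDN's product internals.

class TieredStore:
    def __init__(self):
        self.file_tier = {}    # fast tier 1 storage (e.g., flash/RAID)
        self.object_tier = {}  # slower, cheaper object storage

    def write(self, path, data):
        # New data always lands on the fast file tier.
        self.file_tier[path] = data

    def migrate_cold(self, cold_paths):
        # A policy engine moves aging data out to the object tier.
        for path in cold_paths:
            if path in self.file_tier:
                self.object_tier[path] = self.file_tier.pop(path)

    def read(self, path):
        # Same namespace either way: migrated data is recalled from the
        # object tier -- slower, but with no user action required.
        if path in self.file_tier:
            return self.file_tier[path]
        data = self.object_tier[path]
        self.file_tier[path] = data  # recall back onto the fast tier
        return data

store = TieredStore()
store.write("/results/run42.dat", b"...")
store.migrate_cold(["/results/run42.dat"])
assert store.read("/results/run42.dat") == b"..."  # transparent recall
```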
Flash & Burst
Use of flash in HPC data centers has intensified, the survey found: more than 90% of respondents use flash storage at some level within their data centers today, but very few use all-flash arrays. Only 10% of surveyed users reported using an all-flash array. “If you’re an IT person, the idea that you’d have an all-flash array is kind of the thing to do, it seems. What we’re finding is not a lot of people in the HPC world are using all-flash arrays… The majority of their flash usage is still in some sort of hybrid device that has a lot of spinning media,” King said. Eighty percent of survey respondents reported using hybrid flash arrays as an extension to storage-level cache, to accelerate metadata, or to accelerate data sets associated with key or problem I/O applications.
DDN also asked survey takers which technology is most likely to take storage to the next level as users seek faster and more efficient ways to offload I/O from compute resources, to separate storage bandwidth acquisition from capacity acquisition, and to support parallel file systems that meet exascale requirements; 60% of respondents named burst buffers.
It was a surprisingly high percentage, King said. “We found more people than ever are interested in that particular technology,” he said. DDN has been building a burst buffer, the Infinite Memory Engine (IME). “It’s an SSD-based burst buffer, and one of the things it does is absorb peak loads of input coming into the system… and then doles it off.” DDN’s IME also takes mixed I/O and aligns it, sending it to the parallel file system in a more organized fashion and allowing the file system to absorb it more easily, King said.
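The burst-buffer pattern King outlines (absorb bursty, out-of-order writes on fast SSD, then drain them to the parallel file system in an organized order) can be roughly sketched as follows; the structure is hypothetical and is not the IME code path:

```python
# Rough sketch of a burst buffer: absorb a burst of unordered writes at SSD
# speed, then replay them to the parallel file system in offset order.
# Hypothetical structure for illustration, not DDN's IME internals.

class BurstBuffer:
    def __init__(self):
        self.pending = {}  # offset -> bytes, absorbed without blocking compute

    def absorb(self, offset, data):
        # Compute nodes dump I/O here instead of waiting on the file system.
        self.pending[offset] = data

    def drain(self, write_to_pfs):
        # Replay the mixed I/O in offset order, so the parallel file system
        # sees an organized, sequential stream rather than a random burst.
        for offset in sorted(self.pending):
            write_to_pfs(offset, self.pending.pop(offset))

bb = BurstBuffer()
bb.absorb(4096, b"late chunk")   # writes arrive out of order...
bb.absorb(0, b"early chunk")
bb.drain(lambda off, data: print(f"PFS write at offset {off}: {len(data)} bytes"))
# ...but are doled off to the file system in order: offset 0, then 4096.
```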
“We’ve been building it for what we thought was a relatively small group of people,” King said: users generating more than 60 GB/s or 120 GB/s of data, what he called the high end of the supercomputing crowd. “What we found was that more and more people are really interested in the topic… We were excited about a real opportunity to come into the market and do some basic burst buffer education.” DDN’s smaller IME product, the IME240, is a good fit for smaller work environments, King said.
But perhaps one of King’s favorite findings from the survey was the reported data growth: 73% of respondents manage more than one petabyte of storage, and 30% manage more than 10 PB, a five-point rise year over year.
“It doesn’t appear as though anybody is throwing anything away at the moment, and that was still exciting for us. The growth is accelerating kind of on path,” King said. “It’s nice to see your predictions come true.”