SwiftStack Announces 4.0 Delivering Scale and Metadata Search

By Allison Proffitt

April 26, 2016 | SwiftStack today announced the latest version of its object storage software with new customer-driven capabilities that deliver scale and metadata search as well as plans for file native access and cloud synchronization. These new features are designed to further simplify and accelerate customer journeys from traditional file systems to more flexible, scalable cloud infrastructures.

SwiftStack first releases open source software foundations, then, with feedback from the community, releases commercial products. Joe Arnold, chief product officer and co-founder of SwiftStack, highlighted two new features in the SwiftStack 4.0 release that will be available within the next 90 days: an integrated load balancer and metadata search.

Later in 2016, SwiftStack will introduce universal access to scale-out object storage as well as synchronization with any Amazon S3-based public cloud, making it easier for applications, users and their customers to access data regardless of where it is created.

Trends from the Trenches

Object storage continues to garner buzz in the bio-IT community. In his annual Trends from the Trenches talk at the Bio-IT World Conference & Expo, Chris Dagdigian again highlighted object storage as the future of storage. “Object storage is the future of scientific data at rest. Period,” he said at the 2016 event.

And the future of object storage is metadata tracking, he insisted.

“This is what I would like to track on a per-file basis: What instrument produced this data? What funding source paid to produce this data? What revision was the instrument/flowcell at? Who is the primary PI or owner of this data? Secondary? What protocol was used to prepare the sample? Where did the sample come from? Where is the consent information? Can this data be used to identify an individual? What is the data retention classification for this file? What is the security classification for this file? Can this file be moved offsite? etc. etc. etc.”

SwiftStack means to meet exactly that need with its metadata tagging capabilities.

“This directly came from our customers in Life Sciences. They needed to get a handle on the data,” Arnold told Bio-IT World. “They have an influx of data coming off, say, sequencers—HudsonAlpha is a good example of this—As data is coming off, they can tag it with metadata,” Arnold explained, listing potential metadata tags that aligned almost exactly with Dagdigian’s slides: instrument ID, grant, researcher, protocol used for sample prep, sample source, consent info, security classification, and more.

“[Users] can tag all these things from the data as it’s going into the storage system. Then they can do two things: they can query that data later on as they’re building out an application or tools or dashboard they can have some insights on that data. The second thing is they can do data management around that… It helps them manage some of the data retention issues as they bring on these big sequencing clusters into their environment.”
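
As a rough illustration of the tag-on-ingest, query-later workflow Arnold describes: OpenStack Swift, the open source project underlying SwiftStack, stores custom object metadata as HTTP headers named X-Object-Meta-<name>. The Python sketch below builds such headers from a set of tags and filters a small in-memory catalog by tag. The function names, object names, and tag values are hypothetical, and the in-memory filter is only a stand-in for the third-party indexing and search services that the 4.0 metadata search integrates with, not SwiftStack's actual search API.

```python
from typing import Dict, List

# Swift stores user-defined object metadata in headers with this prefix.
META_PREFIX = "X-Object-Meta-"


def header_name(tag: str) -> str:
    """Turn a tag such as 'instrument_id' into 'X-Object-Meta-Instrument-Id'."""
    return META_PREFIX + tag.strip().replace("_", "-").title()


def build_metadata_headers(tags: Dict[str, str]) -> Dict[str, str]:
    """Build the headers that would accompany an object upload (PUT)."""
    return {header_name(tag): str(value) for tag, value in tags.items()}


def find_objects(catalog: Dict[str, Dict[str, str]], tag: str, value: str) -> List[str]:
    """Return the names of objects whose metadata header for `tag` equals `value`.

    `catalog` maps object name -> metadata headers. It is a hypothetical
    in-memory stand-in for an external search index.
    """
    wanted = header_name(tag)
    return sorted(name for name, headers in catalog.items()
                  if headers.get(wanted) == value)


# Hypothetical example data: two sequencer runs tagged as they are ingested.
catalog = {
    "run-001.bam": build_metadata_headers(
        {"instrument_id": "seq-07", "retention": "7y"}),
    "run-002.bam": build_metadata_headers(
        {"instrument_id": "seq-09", "retention": "7y"}),
}

matches = find_objects(catalog, "instrument_id", "seq-07")
```

In a real deployment the headers would be sent with the upload request, and the query would go to whatever indexing service the cluster is integrated with rather than a Python dictionary.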

Throughput at Scale

The other feature that Arnold highlighted in the 4.0 release is meant to solve life sciences’ growing problems of scale.

“One of the big reasons why we get pulled into projects—particularly in life sciences—is the amount of data movement. Throughput is extremely important particularly in moving large files around or lots of files around,” Arnold told Bio-IT World. “What we’ve done is built a load balancing technology that will allow our customers to burst up and sustain high throughput rates without having to buy additional networking equipment—a load balancer like they would traditionally have to—also reduce the cost and give them the throughput to move data around.”

The list of SwiftStack 4.0 innovations introduced today includes:

● Integrated load balancing reduces the need for expensive dedicated network hardware and minimizes latency and bandwidth costs while scaling to larger numbers of nodes

● Metadata search increases business value by integrating with third-party indexing and search services to make stored object data analytics-ready

● SwiftStack Drive is an optional desktop client that enables access to cluster data directly from desktops or laptops

● Enhanced management with new IPv6 support, capacity planning and advanced data migration tools

As the use of cloud storage grows, ongoing software releases from SwiftStack will provide constant access to unstructured data regardless of how the files are created and as applications evolve. SwiftStack is currently leading open-source community development efforts to integrate scale-out file services alongside S3 and Swift-based object APIs, without the need for additional gateways. The company will also introduce centralized policies that replicate object data stored on-premises with any AWS S3-compatible cloud, allowing enterprises to access and protect data in private and public clouds using the same methods.