Courtagen Leverages Level 3 to Provide Direct Access to Amazon Cloud
By Kevin Davies
February 4, 2013 | WOBURN, MASS.—Although it didn’t require digging up any local roads in the end, a small biotech company has struck a partnership in life sciences with Level 3 Communications to create a seamless and secure data link that pipes genomic data directly from its laboratory just outside Boston to the Amazon Web Services (AWS) cloud facility in Ashburn, Northern Virginia.
“We have a dedicated EPL [Ethernet Private Line] that carries terabytes of genetic data into their servers and back again,” says Courtagen Life Sciences President and co-founder, Brendan McKernan.
Although the system only went live late last year, the early results could hardly be better. “Our informatics team is thrilled,” says McKernan. “Data is flowing and we’re getting patient results in a matter of minutes. It’s seamless; it’s perfect!”
Courtagen, founded by Brendan along with his brothers Kevin (Chief Technology Officer) and Brian (CEO), is a small firm of about 25 employees with a clinical laboratory generating patient genomic data for diagnostic purposes. Brendan’s forte is the implementation of world-class manufacturing concepts in running a laboratory, ideas and strategies honed over the past 15 years at the McKernan brothers’ previous company, Agencourt, and shared with partners such as the Broad Institute’s sequencing lab.
At Courtagen’s offices in Woburn, Mass., the CLIA-certified laboratory contains half a dozen Illumina MiSeq sequencers, but no trace of a data center. The incoming saliva (or blood and tissue) samples, referred by a growing network of physicians, are bar-coded and given a Genomic Profiling Project (GPP) number. “Once samples are accessioned and a GPP is assigned, no one in the lab can see the Protected Health Information (PHI). PHI includes any information according to HIPAA laws that can identify a person,” says McKernan.
One of the key issues facing Courtagen, now and in the future, is how to process patient genomic data as efficiently and securely as possible. The McKernans needed a data processing approach that was scalable (throughput is expected to grow sharply in the next 1-2 years) yet conservative and secure enough to comply with HIPAA regulations on the privacy of patient data.
Selecting Level 3’s network and the on-demand Amazon cloud was an obvious choice. “Amazon has the scale,” says McKernan. “Our expertise will be in interpreting scientific data to enable researchers and clinicians to make better decisions regarding patient care and drug development. We outsource everything else that’s a non-core competency. We don’t have any IT infrastructure in our facility. The data comes off the sequencers and goes right to Amazon via the Level 3 network for processing, where we utilize our ZiPhyr bioinformatics pipeline, which leverages standard industry algorithms in conjunction with our unique analysis workflows to generate results.”
“Amazon is one of the largest clouds in the world, so from a strategic standpoint, I don’t want to invest capital in something we’re not going to be number one at. The Amazon-Level 3 partnership gives us the ability to have global infrastructure that is scalable, cost effective, and extremely secure.”
How to push the data into the cloud? Until last year, Courtagen had two options, neither one ideal. One was to ship hard drives to Amazon’s facility in Virginia, but that took two days. Courtagen’s average sample-to-report cycle time is fast—just 12 days. “But adding two days for shipping is unacceptable. Our Informatics team wanted data processing in a matter of minutes,” says McKernan.
The other method was traditional Internet delivery through an “old, slow pipe,” but transfers often stalled. “It would take days to move data up to the cloud, and if it failed for any reason, we’d have a pile-up. All the GPPs for the following week couldn’t get processed. From a scaling standpoint, we had to change,” says McKernan.
(While the data processed in the AWS cloud are de-identified, Courtagen stores and delivers patient records through a private patient portal hosted by NetSuite, an emerging ERP system, or through Courtagen’s ZiPhyr iPad application. The physician portal is managed in facilities that are both HIPAA- and SAS 70 Type II-compliant.)
On the Level
McKernan began investigating the idea of a private line—off the public Internet—to transport data to AWS. In addition to avoiding pile-ups, it should provide additional security.
McKernan turned to Level 3 Communications, owner of an international fiber-optic network, and what he calls a “carrier of carriers.” Many of the major telecommunications firms run off Level 3. “Eventually everyone hits a Level 3 gateway,” says McKernan. “From there, it goes up to the cloud.”
Level 3 is one of the few global partners of Amazon’s that has “Direct Connect” capability, allowing clients to bypass the public Internet and go directly into the AWS servers.
The challenge was not so much how to transfer the data down to Virginia, but how to transmit it the 15 miles or so from Courtagen’s offices in Woburn to Level 3’s gateway on Bent Street in Cambridge, just behind the Broad Institute. “Eric Lander [Broad Institute director] must have been thinking about this 20 years ago, that’s smart. That’s one of the gateways to the Internet!” says McKernan.
As discussions with Level 3 progressed, McKernan was contemplating signing a purchase order to dig up roads and lay some new fiber-optic cable. “It was going to take a long time and cost a fair amount of money,” says McKernan.
At the last minute, another company entered the mix, providing the pipe for “the last mile.” Sidera, one of a number of companies that work with Level 3 to provide that local transmission, already had fiber in the Courtagen office building, with the all-important DWDM (Dense Wavelength Division Multiplexing) technology for scalability. This means that upgrading the network from 10-gigabit to 100-gigabit Ethernet (100GE) down the road will require changing only a couple of cards, McKernan says. “Our network is now scalable to move [data on] 2,000 patients or more,” says McKernan.
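To put the 10-gigabit and 100GE figures in perspective, here is a back-of-the-envelope sketch in Python of how long a payload would take to cross a link at each speed. The 1-terabyte payload and the `efficiency` parameter are illustrative assumptions, not figures from Courtagen or Level 3, and real throughput will be lower than the nominal line rate.

```python
def transfer_time_seconds(data_bytes: float, link_gbps: float,
                          efficiency: float = 1.0) -> float:
    """Estimate the seconds needed to move data_bytes over a link.

    link_gbps is the nominal line rate in gigabits per second.
    efficiency (assumed, 0 < efficiency <= 1) discounts protocol overhead.
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = data_bytes * 8                      # bytes -> bits
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return bits / usable_bits_per_second


ONE_TERABYTE = 1e12  # hypothetical payload size, in bytes (decimal TB)

for label, gbps in [("10-gigabit", 10), ("100GE", 100)]:
    seconds = transfer_time_seconds(ONE_TERABYTE, gbps)
    print(f"{label}: about {seconds / 60:.1f} minutes per terabyte")
```

At the nominal line rate, a terabyte needs about 800 seconds on a 10-gigabit link and about 80 seconds on 100GE, which is consistent with the "matter of minutes" the informatics team asked for.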
Courtagen insisted on working with Level 3 as the carrier, so in the event of any network problems, Level 3 alone would be responsible for the end-to-end solution. In this instance, Sidera reports to Level 3.
Once the data pass from the Sidera pipe to the Level 3 gateway in Cambridge (one of 350 data centers Level 3 has across the world), they travel on a private line down to Ashburn. Courtagen pays Level 3 a monthly subscription fee for a minimum data commitment.
In addition to Sidera, Level 3, and AWS, Courtagen had to work with Amazon’s host, the Equinix facility in Virginia, as well as Check Point (a leader in securing the Internet). “These relationships allowed us to combine fast networking technologies with the highest level of security for our employees and patient data,” says McKernan.
Although it is still early days, McKernan says his colleagues are delighted with the way the network is working. Raw genome sequence data go in; what emerges is a rich analysis of a patient’s data, with variant conservation and mutation prediction scores, which in many instances is helping Courtagen’s scientists and physicians identify deleterious mutations.
McKernan says Courtagen takes advantage of Amazon’s EC2 instances for sequencing analysis, primer design, and hosting of web servers. In addition, Courtagen uses StarCluster to dynamically start EC2 instances and stores its sequencing data in S3 buckets. Courtagen is also beginning to migrate long-term storage to Amazon’s Glacier platform to save money, and is evaluating AWS Elastic Beanstalk to deploy custom applications.
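One common way to move older S3 data into Glacier is a bucket lifecycle rule. The Python sketch below shows what such a rule might look like; it is not Courtagen's actual configuration. The bucket name, the `runs/` prefix, and the 90-day threshold are hypothetical, and the client is assumed to be one created with the present-day boto3 library (`boto3.client("s3")`).

```python
def glacier_lifecycle_config(prefix: str, days: int) -> dict:
    """Build a lifecycle configuration that transitions objects under
    `prefix` to the GLACIER storage class once they are `days` old."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/') or 'all'}-to-glacier",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


def apply_lifecycle(s3_client, bucket: str, config: dict) -> None:
    """Attach the configuration to a bucket.

    s3_client is assumed to be a boto3 S3 client; the bucket name is
    supplied by the caller.
    """
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=config,
    )


# Hypothetical values, for illustration only.
config = glacier_lifecycle_config(prefix="runs/", days=90)
print(config["Rules"][0]["ID"])
```

With real credentials, a call such as `apply_lifecycle(boto3.client("s3"), "example-sequencing-bucket", config)` would register the rule; the bucket name here is a placeholder.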