A Retrospective Of The 2017 Bio-IT World Conference & Expo

June 7, 2017

By Bio-IT World Staff

June 7, 2017 | The 16th annual Bio-IT World Conference & Expo took place recently in Boston. Tracks for the conference included data and storage management; cloud computing; networking hardware; bioinformatics; next-gen sequencing informatics; data security; and more. Beyond the conference tracks, the three-day event was full of exhibit hall buzz, Best Practices award announcements, the Benjamin Franklin Award, and the judging and naming of the 2017 Best of Show winners.

Here are a few of the things that were discussed on the exhibit floor and in conference sessions.

Five Steps to Vet and Manage Cloud Service Providers

During her session in the Cloud Computing track, Diane Pacheco, information security officer for IT at The Jackson Laboratory, explained why selecting a cloud service provider you can trust is one of the most important security decisions a company can make. One major reason, according to Pacheco, is that if there is a security breach, the NIH will hold the institution—not the cloud provider—responsible. “This is what keeps me awake at night,” Pacheco said.

She then listed five basic steps to help with the vetting of cloud providers. First, a company must classify its data and determine which data are most at risk. According to Pacheco, there are three kinds of data: confidential, sensitive, and public. Sorting a company’s data into these categories puts into perspective which data need the greatest attention.

The second step is to identify potential vendors and gather data. “The more sensitive the system and the more sensitive the data [that] the system will hold, [then] the more time you should spend investigating those vendors and making sure that they are stable and that your data and your system are safe,” said Pacheco.

Step three is to complete a security review, asking whether a given SaaS provider should be entrusted with the data at all. Step four is to carefully negotiate master services agreements with vendors. Details are key at this step, according to Pacheco: invoicing and payment terms, data ownership and data extraction, and termination clauses are all points in the master services agreement that will save companies a lot of stress in the long run.

The fifth and final step, once a vendor has been selected, is to monitor its performance and schedule annual reviews. Pacheco said this keeps vendors accountable and ensures the vendor and the customer stay on the same page. “You’ve got to make sure the people using your cloud services understand how to remove it, what they’re paying for, and make sure someone is monitoring that usage. You’ve got to educate them. You’ve got to put staff in front of them, and don’t let them do something on their own. That’s where the threat [is],” said Pacheco.

A Deeper Look at Variants

Tom Chittenden from WuXi NextCODE compared deep learning to “old-fashioned machine learning” in a short talk during the Bioinformatics track. Chittenden’s team used the deepCODE artificial intelligence platform to assess pathogenicity for all genomic missense variants in ClinVar, training on a set of 1,000 ClinVar pathogenic variants and 1,000 NHLBI Exome Sequencing Project benign variants. The deepCODE variant models greatly improve the definition of functional missense variants, Chittenden said. “I don’t want to go as far today as to say variants are mis-annotated,” Chittenden said, “but we’re using deep learning to find new variant profiles.” Chittenden reported greater than 99% accuracy in missense annotation. Deep learning is also adept at gleaning information from combined datasets, Chittenden said. For example, the platform’s tumor-subtype and drug response classification accuracy improved markedly when tumor DNA-seq and RNA-seq data were combined.
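The deepCODE platform itself is proprietary, but the core task Chittenden described, training a classifier on labeled pathogenic and benign variants, can be sketched in miniature. The toy example below is a hypothetical illustration, not deepCODE’s method: a simple logistic classifier trained on synthetic two-feature “variants” (the features, cluster means, and all numbers are invented for illustration).

```python
import math
import random

# Toy sketch of the task described above (NOT WuXi NextCODE's deepCODE):
# train a binary classifier on labeled "pathogenic" vs. "benign" variants.
# All features and numbers here are synthetic and purely illustrative.
random.seed(42)

def make_variants(n, means, label):
    """Generate n synthetic variants clustered around the given feature means."""
    return [([random.gauss(m, 0.5) for m in means], label) for _ in range(n)]

# Two invented features per variant, e.g. a conservation score and an
# allele-frequency-derived score; label 1 = pathogenic, 0 = benign.
train = make_variants(200, [2.0, -1.0], 1) + make_variants(200, [-1.0, 1.5], 0)
random.shuffle(train)

# Logistic regression fit by stochastic gradient descent on the log-loss.
w, b, lr = [0.0, 0.0], 0.0, 0.1
for _ in range(50):
    for x, y in train:
        p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
        g = p - y  # gradient of the log-loss with respect to the logit
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]
        b -= lr * g

def predict(x):
    """Classify a variant: 1 (pathogenic-like) if the logit is positive."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

accuracy = sum(predict(x) == y for x, y in train) / len(train)
print(f"training accuracy: {accuracy:.2f}")
```

Chittenden’s actual models are deep neural networks trained on real ClinVar and Exome Sequencing Project variants with far richer feature sets; the sketch only shows the supervised-classification framing of the problem.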

To the Cloud(s): Keeping Up with DNA Sequencing

Stacey Gabriel, Director of the Genomics Platform at the Broad Institute, spoke about how infrastructure requirements have been changing in relation to the Broad’s work in genomics. In her presentation Gabriel announced a program, called “All of Us,” that will launch in the near future as part of the Precision Medicine Initiative (PMI). “This will be a collection of millions of individuals who will be volunteers who will partake in the [PMI],” Gabriel said. “The goal of this program will eventually be to provide a complete human genome for the millions of volunteers.”

This program will be made possible with the Broad’s recent release of version 4 of the Genome Analysis Toolkit (GATK4). According to Gabriel, GATK4 will allow new data technologies to come into play, such as Apache Spark. She also mentioned that a few cloud vendors would be able to offer the newest version of GATK as a service, though she did not name specific vendors.

Gabriel was also quick to point out applications of this new data technology to important problems. She spoke on the Exome Aggregation Consortium (ExAC), a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a wide variety of large-scale sequencing projects, and its most recent incarnation, the Genome Aggregation Database (gnomAD). Improved frequency estimates for rare disease variants, the detection of functionally constrained genes, and the quantification of incomplete penetrance are all made possible through the aggregation of genomes, said Gabriel. The next release of ExAC will add an extra 40,000 human genomes. As Gabriel put it, “This resource is expanding.”
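The statistical payoff of aggregation that Gabriel described has a simple basis: the uncertainty of an allele-frequency estimate shrinks with the number of chromosomes sampled. A rough back-of-the-envelope sketch (the cohort sizes below are illustrative, not ExAC’s or gnomAD’s actual counts):

```python
import math

# Why aggregating genomes sharpens rare-variant frequency estimates:
# the standard error of a binomial frequency estimate falls as 1/sqrt(N).
def freq_standard_error(allele_freq, n_genomes):
    n_alleles = 2 * n_genomes  # each genome contributes two chromosomes
    return math.sqrt(allele_freq * (1 - allele_freq) / n_alleles)

rare = 1e-4  # a rare variant at 0.01% allele frequency
for n in (1_000, 10_000, 100_000):  # illustrative cohort sizes
    se = freq_standard_error(rare, n)
    print(f"{n:>7,} genomes: standard error ~ {se:.1e}")
```

For a variant this rare, the standard error at 1,000 genomes exceeds the frequency itself; only at aggregated, ExAC/gnomAD-like scale does the estimate become informative.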

Process Progress

On the Exhibit Hall floor, new products were featured around every turn. ACD/Labs’ Graham McGibbon presented Luminata, a new product ACD/Labs introduced in March. The goal, McGibbon said, was more intelligent process chemistry, letting people work smarter. The Luminata informatics system helps companies scale up synthesis, optimizing for fewer steps, more efficient chemistry, and greener solvents. It helps chemists track impurities throughout synthesis, linking spectral data and structures to each other and making them available across the organization. Too often, McGibbon said, people try to keep track of impurities in an Excel list. Luminata instead lets researchers track compounds and impurity classifications over time and across reactions. Seeing the whole process, McGibbon explained, lets scientists identify contaminated supply and improve their chemistry, and seeing the whole picture has a profound impact at late stages. Early customers are checking in with the product daily, McGibbon said.

Round Ups Around the Web

Community is what makes the Bio-IT World Conference & Expo a special event. The collaborative, engaged mindset our partners bring to our annual meeting is what has sustained the conference’s success. Here are some of the blogs and notes shared after the conference.

Core Informatics blog

GenomeWeb write up of the CIO panel

Chris Dwan’s blog post

Enlighten Bio blog