2017 Bio-IT World Best Of Show People's Choice Award Contenders
UPDATE: Voting for the Best of Show Awards is CLOSED.
May 4, 2017 | Bio-IT World is pleased to announce the 2017 Best of Show competition with the Bio-IT World People’s Choice award.
The Best of Show Awards offers exhibitors at the Bio-IT World Conference and Expo an opportunity to showcase their new products. A team of expert judges views entries on site and chooses winners in four categories based on the product’s technical merit, functionality, innovation, and in-person presentations. The judges’ finalists are highlighted in the list of contenders.
In addition to the four judges’ prizes, Bio-IT World presents a People’s Choice Award as well, which is chosen by votes from the Bio-IT World community. All of the Best of Show entries are eligible for the People’s Choice Award. Voting will open at 5:00 pm ET on Tuesday, May 23, and will remain open until 1:00 pm ET on Wednesday, May 24.
The four awards named by the judges and the People’s Choice Award will be announced at a live event on the Bio-IT World Expo floor at 5:30 pm on Wednesday, May 24.
We are excited to have the community’s input again this year on the best new products on display at Bio-IT World. Watch the Bio-IT World Twitter account @BioITWorld and #BestofShow17 for the voting link next Tuesday at 5:00 pm ET.
2017 Bio-IT World Best of Show Contenders
Aspera, an IBM Company, Booth 348 & 350
Product Name: Aspera Transfer Service
asperasoft.com/cloud/aspera-transfer-service/
Launched in Fall 2016, the new hosted multi-cloud Aspera High-Speed Transfer Service (ATS) is a hybrid multi-tenant service for fast and secure big data transfers to, from, and across cloud object storage systems, seamlessly connecting public and private clouds with on-premises infrastructure. ATS provides high availability and automatic scale-out, supporting data transfers with cloud storage at up to 20 Gbps per region from any distance, and providing back-of-cloud services to meet variable demands, significantly reducing operational complexity.
ATS delivers single stream transfers at maximum speed independent of network round-trip delay and packet loss, up to the I/O limits of the platform, 10-100 times faster than TCP transfers, with support for all cloud infrastructure regions and data centers. Unlike TCP-based transfers, which are highly variable, transfer times are highly predictable, and scale linearly with increases in bandwidth. Aggregate speeds scale without limit, e.g. 100 Gbps per cluster, and it supports concurrent transfers with extreme scaling to many thousands of sessions per cluster.
ATS supports file and directory sizes up to the largest object size supported by the particular cloud platform in a single transfer session, and it transfers directories containing any number of individual files at high speed, even for very large numbers of very small files (100 Mbps transfers over WAN for file sets 1-10 KB in size).
Comprehensive API capabilities enable ATS to be integrated directly into third-party industry applications like the Bluebee genome analytics platform, enabling them to deliver results significantly faster than before.
Avere Systems, Booth 536
Product Name: Cloud-Core NAS (C2N)
averesystems.com/products/c2n-system
The Avere C2N (Cloud-Core NAS) system combines the familiarity of NAS with the efficiency of object storage to transform traditional storage environments into cloud infrastructures in a simple and economical manner. Scaling from 120 terabytes to more than 5 petabytes of usable capacity, Avere C2N delivers a complete, scale-out NAS system that enables starting small and scaling to large capacities for demanding applications in life sciences, media and entertainment, financial services, technology, and other high data-growth industries.
C2N technical specifications provided below:
C2N Software Features
- NAS protocols: NFSv3, SMB1, SMB2
- Management: Web GUI, SNMP, XML-RPC
- Statistics: Web GUI analytics, RRD
- Resiliency: Triple replication, erasure coding, geo-dispersal
- Data protection: Snapshots
- Encryption: AES-256, FIPS 140-2 compliance (military grade), KMIP
- Efficiency: LZ4 compression
- File system: Global namespace
- Data mobility: FlashMove (sold separately)
- Disaster recovery: FlashMirror (sold separately)
C2N System
- Usable capacity (EC): 480TB - 5.8PB
- Usable capacity (TR): 120TB - 2.9PB
- SSD capacity: 14TB - 480TB
- 10GE ports: 12 – 200
- 1GE ports: 12 - 200
CX200 Storage Node
- Raw capacity: 120TB (12x 10TB drives)
- Usable capacity (EC): 80TB (with 8+4 erasure coding)
- Usable capacity (TR): 40TB
- Height: 1U
Benchling, Booth 534
Product Name: Benchling Workflow Management System
benchling.com/enterprise
The Benchling Workflow Management system is a new, integrated module of the cloud-based Benchling platform. Together with Benchling's Lab Notebook, Molecular Biology tools, and Bioregistry, it forms the full complement of Benchling's end-to-end software platform for biologics R&D. It can also be extended through APIs to integrate with other software systems and offers instrument integration for automatic data extraction.
The Workflow Management system tracks R&D processes from start to finish. Managers can divvy up complex processes into assignable tasks that link to the files necessary to complete them, along with a templated Benchling Notebook entry, and even a protocol. Beyond knowing exactly what tasks they have to complete next, researchers can see how a lead has evolved through drug development, trace its genealogy, and see what analysis has already been done. Administrators can track progress and optimize R&D with dashboards that give an overview of pipeline progress.
With this full view of pipeline progress, researchers can also, for the first time, gain insights from work that's downstream of their own role. For example, early discovery researchers can start asking which lead molecules get the best preclinical results to help them develop better screening assays. This feedback loop multiplies the increases in efficiency gained from the system.
The Workflow Management system gives biotech companies complete visibility into their R&D processes. All data is accounted for and tied to a particular experiment, and results can be easily extracted and analyzed so that every decision is driven by data.
BioFortis, Booth 413
Product Name: Labmatrix Clinical Trial Edition (CTE)
biofortis.com
In biomarker-driven clinical trials, patient samples (biospecimens) are as important as patients themselves: sample assay results often determine patient segmentation and support primary and key secondary objectives. Collected samples, with proper consent, can also be used to provide researchers with greater insights into biological pathways and related diseases in future studies.
However, there has been minimal intersection between the development of Risk Based Monitoring (RBM) plans and the increased need for better sample tracking and issue resolution across the entire clinical development ecosystem of sponsors, CROs, sites, labs, and biorepositories. Since sample-based data usually represent the majority of study data in regulatory submissions, an RBM plan for a biomarker-driven trial can hardly account for the full spectrum of study risks without having a sample-centric component. For example, patient sample issues can lead to critical study milestone risks such as database lock and study closure.
Labmatrix Clinical Trial Edition, a web-based SaaS offering from BioFortis, enables sponsors, CROs, and other clinical trial partners to 1) reduce the risk of sample logistics becoming the bottleneck in clinical trial execution, 2) acquire actionable and predictive insights into the “health” of their clinical trial operations from a sample-centric perspective, 3) discover and resolve sample issues earlier and more effectively, 4) ensure regulatory compliance to patient informed consent regarding retention, use, and destruction, 5) extend utilization of banked samples beyond the current clinical study (e.g. translational research and new study proof-of-concept), and 6) reduce storage costs and optimize storage capacity.
Biomatters, Booth 353
Product Name: Geneious Biologics
geneious.com/biologics
Geneious Biologics is a next generation cloud software service developed, in cooperation with industry partners, as an enterprise solution for commercial antibody discovery and screening. All functionality is delivered via a modern, easy to use, responsive web interface.
Geneious Biologics is highly scalable and customizable to integrate with specific customer workflows and functional data requirements. It has preconfigured workflows for analysis of mAb, TCR or synthetically constructed antibody libraries and is highly configurable in supporting peptides or other molecules not included in these categories.
Geneious Biologics incorporates integration points that allow a seamless flow of data across a customer’s business, lifting the burden of expensive, time consuming customization work. The platform provides the ability to consolidate data from multiple sources at scale and apply advanced analytics.
Leveraging data from antibody binding assays or other functional assays, together with state of the art visual representation of sequence data, Geneious Biologics enables researchers to accurately qualify the best candidates for further in-depth analysis.
Aggregating mission critical data into a unified central database, Geneious Biologics makes company-wide collaboration easy, removing duplication of effort and the tedious task of managing disparate spreadsheets without a visible audit trail.
A powerful filtering and query language allows scientists to quickly and accurately find data of interest regardless of its origin. A RESTful API allows for advanced integration of Geneious Biologics with an organization's legacy data systems. Geneious Biologics also supports well-known data formats such as GenBank and Excel for direct import and export.
Bluebee, Booth 425 & 427
Product Name: Bluebee Genomics Platform 1.6
bluebee.com
Bluebee has pioneered how life science researchers and clinicians in genomics navigate DNA data management, processing, analysis, and collaboration. The Bluebee solution is a robust, integratable platform that is scalable, fast, and highly secure. It supports an entire cross-functional team of lab managers, lab technicians, biologists, and bioinformaticians, catering to both research and clinical communities. By enabling faster research and clinical data delivery, it has a direct positive impact on medical treatment through personalized medicine.
As an extension to the Bluebee genomics platform, which is primarily targeted towards the expert user, Bluebee has now launched a service to create and deploy dedicated satellite applications for kit providers, focusing on a convenient and integrated end-user experience. This enables providers to achieve a fast go-to-market and a global rollout of their kit solution with the following attributes:
- Delivery of complete end-to-end solutions with integrated, optimized, and accelerated data analysis for proprietary data analysis algorithms.
- Data and actionable reports are delivered via a web portal or an app.
- Ability to enter new markets by removing the analysis headache for the kit customers.
- Compliance with local data regulations via a roll-out in 29 datacenters across the globe, while preserving an integrated view and management layer.
- Delivery of quality assurance with the kits.
- Reduction of technical data analysis support for kit customers.
At Bio-IT World Bluebee will showcase several pioneering customer applications, highlighting the technological innovation, fast deployment capabilities, and the business value for (diagnostic) kit providers.
Bright Computing, Booth 519
Product Name: Bright Cluster Manager 8.0
brightcomputing.com
Bright Cluster Manager enables customers to deploy complete clusters over bare metal and manage them effectively. It provides single-pane-of-glass management for the hardware, the operating system, the HPC software, and the users. With Bright Cluster Manager for HPC, IT/system administrators can quickly get clusters up and running and keep them running reliably throughout their lifecycle — all with the ease and elegance of a full-featured, enterprise-grade cluster manager.
Bright Cluster Manager lets you administer clusters as a single entity, provisioning the hardware, operating system, and workload manager from a unified interface. This makes it easier to build a reliable cluster. Once your cluster is up and running, the Bright cluster management daemon keeps an eye on virtually every aspect of every node, and reports any problems it detects in the software or the hardware, so that you can take action and keep your cluster healthy. The intuitive Bright management console lets you see and respond to what’s going on in your cluster anywhere, anytime.
Finalist: ByteGrid, Booth 352 & 354
Product Name: BioShare File Sync (version 1)
bytegrid.com/bioshare
BioShare is a cloud-based, compliant document storage solution for Life Sciences companies. It is a validated, secure, and reliable file synchronization platform which allows users to collaborate, back up vital files and folders, and synchronize content with one another. It is available on ByteGrid’s ultra-secure and GxP-validated Cloud Infrastructure to meet compliance mandates.
BioShare is simple and quick to set up and deploy and offers a complete range of features just like other document sharing software. Unlike consumer-grade file sync services, BioShare has been validated to be compliant with regulations such as 21 CFR Part 11 and HIPAA/HITECH.
Quick Features:
- Anywhere Access: Keep all your files with you wherever you may be – on any device, mobile or desktop. The need for VPN is eliminated with our validated cloud-based solution.
- Secure/Easy Sharing: Share files and folders with colleagues and third parties with the share link feature. You can enforce limits on downloads, expiration dates, etc. to protect your shared data.
- Collaborate Easily: Make collaboration more efficient with easy access to the files users need when they need them, eliminating emailing sensitive info. File locking also prevents accidental data overwrites.
- Control Your Data: Set permissions to limit who has access to what and track user activity using the Activity Log. Ensure total data security with remote wipe and enforced two factor authentication.
- Protect Your Data: Keep all your files, even those locally stored, backed up and safe. Features like revision rollbacks make restoring previous versions of existing or deleted files possible.
Finalist: Cambridge Semantics, Booth 344
Product Name: Anzo Smart Data Lake 4.0
cambridgesemantics.com
The Smart Patient Data Lake, based on Anzo Smart Data Lake 4.0, allows users across the organization to ask any question across the full universe of patient data. ASDL delivers on-demand access to data for real-time analysis, exploration and reporting – right within the data lake. In addition, users can off-ramp answer sets into “last mile” analytics or visualization tools of choice.
- Clinicians can ask questions across one or more studies rapidly and securely
- Translational researchers can look across clinical and real-world evidence (RWE) data to ask exploratory questions
- AE Case Reports are processed quickly and automatically with only baseline human review required
- Medical data including patient and doctor feedback is available for interactive analysis including sentiment and trends.
Answers and insights found with ASDL carry trust - rooted in the rich provenance associated with models, data and analytics. Collaborative analyses are captured in ASDL’s active metadata store so that answers trace back to source data, models, rules and transformations.
Organizations adopting ASDL move forward with confidence and trust – both in the technology and partnership with Cambridge Semantics.
Certara, Booth 435
Product Name: D360 Express
certara.com
D360 Express Scientific Data Informatics Hub is a self-service integrated solution specifically designed for discovery scientists at smaller pharmaceutical research organizations that do not require an enterprise solution for data integration. D360 Express differs from other data integration tools by providing an end-user accessible scientific data network that goes beyond data retrieval to make better informed decisions - from interactive data filtering, exploration and visualization to high performance data query, integrated analysis tools and support for virtual compounds.
Scientific searches are facilitated with an intuitive drag-and-drop query-building interface that allows users to generate project dashboards without having to know where the data resides and in what format. With a few simple mouse clicks, queries can be created or executed that retrieve, transform, and present the most up-to-date relevant data in the required format for analysis without the need for administrative support.
The D360 Express integrated visualizations, virtual compound capabilities, and chemistry analytics tools provide scientists the ability to easily explore structure activity relationships, resulting in faster, more objective project decisions. D360 Express comes with standard connectors for your data sources and can be quickly deployed without changes to your existing IT infrastructure. D360 Express also connects to a wide range of data analysis, chemistry sketchers, productivity and presentation tools providing quick and efficient transition from data analysis to scientific understanding.
ChemAxon, Booth 245
Product Name: BioEddie
chemaxon.com/products/bioeddie/
Chemists and biologists view entities for the pharma/biotech industry from different perspectives. Bioinformatic tools focus on the abstraction of repeating units of biopolymers, but they fail to describe anything beyond the natural sets of amino acids/nucleotides. Cheminformatic applications have demonstrated the capability to handle a variety of molecules due to a lower-level abstraction at the atom/bond level; however, the sequence information is obscured.
The Pistoia Alliance, a not-for-profit alliance of life science organizations, has bridged the gap between chemistry and biology by establishing an open source community, as well as a software ecosystem, around Pfizer’s Hierarchical Editing Language for Macromolecules (HELM), which captures both sequence information and chemical structure simultaneously.
This exciting innovation at the core of the chemistry-biology interface inspired us to design a new web-based sketching tool for biomolecules: from fully characterized to fully unknown, from exact chemical descriptions to heterogeneous products, and from biological sequences to exotic bioconjugates, BioEddie allows users to define, visualize, and share structurally complex biomolecules with ease. This editor not only translates complex biomolecules into a digital environment, but adds chemical intelligence to the structures. The editor uses its own customizable monomer library, but renders HELM with respect to the source. It enables ad-hoc structural modifications and effortless inter-conversion of chem-bio file formats.
BioEddie combines state-of-the-art JavaScript technology with an intuitive user interface and a simple API to create an easy-to-use, versatile, browser-based tool that is readily integrated with existing web-enabled software tools, particularly ChemAxon’s Biomolecule Toolkit.
Finalist: Congenica, Booth 421
Product Name: Sapientia 1.5
congenica.com
Sapientia integrates a suite of powerful analytical tools enabling rapid, accurate and scalable interpretation of a patient’s genotypic and phenotypic data to facilitate a more comprehensive diagnosis.
Sapientia leverages gold-standard reference databases such as ClinVar and DECIPHER. These are integrated alongside customer databases and Congenica’s own internal knowledgebase of annotated variants, curated by clinical experts.
Sapientia contains features such as a fully integrated, customizable Genome Browser, which allows customers to interrogate and visualize the variants in a patient’s genome. Sapientia also includes a karyogram display to navigate chromosomes to see SNPs, small insertions and deletions, CNVs and larger structural variants in a single view.
Using the Exomiser function, users can prioritize variants according to their relevance to patients’ phenotype whilst filtering out common and synonymous variants.
Alongside this, Sapientia allows clinical scientists to connect with users across the globe in Multi-Disciplinary Team (MDT) meetings, to see if their patient’s variant has been seen before, and the diagnoses given in similar cases. Sapientia gives MDTs a platform to share their diagnoses and discoveries for the benefit of all users, whilst creating a full audit trail of decisions.
Sapientia allows users to customize and configure projects to meet their lab and workflow needs. By offering a range of configurable features and services, it can facilitate a range of workflows, and supports clinicians in a wide range of projects.
Using these features alongside its knowledgebase of variants, Sapientia enables clinicians to interpret a patient’s genome in as little as 45 minutes.
Core Informatics, Booth 160 & 162
Product Name: OData API
coreinformatics.com
Core’s OData API (application programming interface) was built to meet the needs of developers and end users in scientific industries who need to integrate data across a variety of proprietary databases, programming languages, and applications. The OData API effectively unlocks data within the Platform for Science to be consumed by other web services – especially OData services – improving interoperability and easing integrations.
The OData API was released in October 2016 and is available in version 5.2 and above of Core’s Platform for Science. The API leverages the Open Data Protocol (OData) standard for RESTful web-services, and removes the burden of dealing with ad hoc, point-to-point integrations using custom APIs and manual workarounds for systems which do not have APIs. Development time and costs are lowered by leveraging OData libraries available in development languages including Java, .NET, JavaScript, and others.
The OData API enables out of the box use of powerful integration tools and enterprise service bus (ESB) middleware such as Mulesoft. It also enables users to be connected to numerous tools that already support using data from an OData producer – such as Excel and Tableau – without having to write a single line of code. The OData specification allows for true plug-and-play integration between systems.
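As a rough illustration of how a standards-based OData consumer interacts with such an API, the sketch below queries an entity set over HTTP using standard OData system query options ($filter, $select, $top). The base URL and entity set name are hypothetical placeholders, not the actual Platform for Science paths.

```python
# Minimal sketch of consuming a standards-based OData endpoint with Python.
# The base URL and entity set name are hypothetical placeholders.
import requests

BASE_URL = "https://example-pfs-instance.com/odata"  # hypothetical endpoint

def get_samples(max_rows=25):
    """Query an OData entity set using standard OData query options."""
    params = {
        "$filter": "Status eq 'Active'",   # standard OData filter syntax
        "$select": "Name,Barcode,Status",  # return only the listed properties
        "$top": str(max_rows),             # limit the number of rows returned
        "$format": "json",
    }
    resp = requests.get(f"{BASE_URL}/Samples", params=params, timeout=30)
    resp.raise_for_status()
    return resp.json().get("value", [])    # OData JSON responses wrap rows in "value"

if __name__ == "__main__":
    for row in get_samples():
        print(row)
```

Because the query options are part of the OData standard rather than a vendor-specific API, the same pattern works from Excel, Tableau, or ESB middleware without custom connectors.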
Moving to Core’s OData API can increase efficiency by freeing up programmers to work on new projects – rather than having to learn multiple one-off APIs and maintain integrations between them. Core Informatics is the first lab informatics vendor to offer an OData API.
Cray Inc., Booth 452
Product Name: Cray Urika-GX
cray.com/products/analytics/urika-gx
The Urika-GX big data appliance delivers on the need for high-frequency insights through a potent combination of system agility and pervasive speed:
- System flexibility from open, enterprise standards and the ability to run multiple workloads concurrently, including Apache Spark, Hadoop and a graph database
- Supercomputing technology and approach for unmatched speed and performance
- Breakthrough insights with iterative and interactive agile analytics plus the versatility to repurpose and tune workflows
DocuSign, Booth 156
Product Name: Part 11 Module, v2.0
docusign.com/solutions/industries/life-sciences
The Part 11 Module allows customers to sign, send, and manage important documents anywhere, anytime, on any device while adhering to regulatory standards. DocuSign's Part 11 module contains industry-designed capabilities that include:
- Pre-packaged account configuration
- Signature-level credentialing
- Signature-level Signing Reason
- Signature manifestation (Unique ID, Printed Name, Date/Time Stamp, and Signature Reason)
- Detailed audit trail
- Tamper-evident digital seal using open PKI standards
- REST and SOAP APIs to integrate regulated workflows into clinical, quality, manufacturing, and content management systems
The product allows customers to automate workflows, authenticate signers, and collect signatures and approvals from individuals who are both internal and external to the organization. Documents can be sourced from content management systems and completed documents can be stored digitally behind a company's firewall.
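For context, a hedged sketch of how a regulated workflow might call the DocuSign eSignature REST API to send a document for signature is shown below. The base URI, account ID, access token, document, and recipient details are placeholders, and the Part 11 Module's account-level configuration (credentialing, signing reasons) is applied server-side rather than in this request.

```python
# Illustrative sketch of sending a document for signature through the DocuSign
# eSignature REST API. All identifiers below are placeholders; Part 11 account
# configuration is not represented in the request body.
import base64
import requests

BASE = "https://demo.docusign.net/restapi/v2"   # demo environment base URI
ACCOUNT_ID = "<account-id>"                     # placeholder
ACCESS_TOKEN = "<oauth-access-token>"           # placeholder

with open("sop_approval.pdf", "rb") as fh:      # placeholder document
    doc_b64 = base64.b64encode(fh.read()).decode("ascii")

envelope = {
    "emailSubject": "Please sign: SOP approval",
    "documents": [{
        "documentBase64": doc_b64,
        "name": "SOP approval",
        "fileExtension": "pdf",
        "documentId": "1",
    }],
    "recipients": {
        "signers": [{
            "email": "qa.reviewer@example.com",  # placeholder recipient
            "name": "QA Reviewer",
            "recipientId": "1",
        }]
    },
    "status": "sent",  # "sent" delivers the envelope immediately
}

resp = requests.post(
    f"{BASE}/accounts/{ACCOUNT_ID}/envelopes",
    json=envelope,
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["envelopeId"])  # track the envelope through the audit trail
```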
Finalist: Eagle Genomics Ltd, Booth 347
Product Name: eaglecurate v1
eaglegenomics.com/eaglecurate/
Data curation is an investment that is not always cost-effective. The challenges are: When should we curate data? How much data curation is enough?
The automation of the data curation process and semantic enrichment will allow dramatic productivity improvements in the innovation process and the targeted use of data assets for relevant analysis.
Building on the foundations of our Best of Show award-winning eaglediscover product, eaglecurate enables value-driven data curation. It represents the maturation of many years of experience and thorough ethnographic observations of how biocurators actually work. When biocuration is focused on semantic enrichment, it is mainly a contextual activity that consists in constructing an “entailment mesh” of data sources, context and underlying entities of interest, rather than merely structural or syntactical transformation. “Context” is often tacit and implicit in the mind of the biocurator, and in the case of life science data it often consists of the experimental and study designs used to generate the data.
By combining advanced machine learning and data measurement capabilities, eaglecurate allows for reverse-engineering of data into a graphical representation of the context, in the form of a process-oriented graph. The graph represents the experimental processes, study designs, observational studies and assays, etc., employed to generate the data.
eaglecurate delivers a self-service platform to biocurators, bioinformaticians and scientists for curating their datasets by exploiting a complex graph structure connecting process-elements and artefacts and, at the same time, it provides executive management with an advanced dashboard for oversight that enables data governance by design.
Finalist: Genedata, Booth 238 & 240
Product Name: Genedata Bioprocess
genedata.com/products/bioprocess/
Genedata Bioprocess is an off-the-shelf enterprise software solution developed to make large-molecule bioprocess development and CMC activities more efficient and to improve process quality. The system streamlines and automates complex workflows in cell line development, upstream and downstream process development (USP and DSP), drug formulation, and analytical development. By supporting the complete large-molecule development workflow, it enables biopharmaceutical and biotechnology companies, as well as contract manufacturing organizations (CMOs), to develop and manufacture novel and generic protein-based therapeutics (biosimilars) more efficiently. The system can be used to develop manufacturing processes for antibodies (IgGs), therapeutic proteins (e.g., FVIII variants and fusion proteins), and highly engineered therapeutics (e.g., bispecific and multi-specific antibodies, antibody drug conjugates, novel scaffolds).
Increasing throughput from the automation of key steps in the bioprocess R&D workflow is at the heart of Genedata Bioprocess. The platform tracks and documents the full history of each production cell line and collects all relevant characterization data, such as productivity and quality parameters. Built-in business logic also streamlines the underlying laboratory workflows. The system can be directly integrated with laboratory equipment, such as colony pickers, screening robots, bioreactors, and analytical devices. Data originating from different sources and systems is automatically processed, analyzed, and shared between different teams (e.g., cell line development, expression, purification, formulation, and analytical development groups). Special analysis tools provide robust assessments of developability and manufacturability risk. The platform auto-generates a variety of reports, which can be used for regulatory documents, such as for Biologics License Application (BLA) submissions.
Genedata, Booth 238 & 240
Product Name: Genedata Profiler 10.1
genedata.com/products/profiler/
Co-developed with major Pharma organizations, Genedata Profiler is a comprehensive enterprise software solution for the processing, management and analysis of massive amounts of raw data from next-generation sequencing (NGS), microarray, real-time PCR, mass spectrometry and other omics technologies in the context of phenotypic data. The innovative software platform combines high-performance raw omic data processing pipelines, sophisticated data analyses and unparalleled data visualizations coupled to an advanced distributed data management infrastructure.
Genedata Profiler is fast becoming the software of choice for efficient and effective omic-based translational and clinical research, empowering researchers to generate valuable scientific insights at scale, while enabling compliance with increasingly complex data privacy laws and regulatory requirements.
Genedata Profiler offers the following key benefits:
- Organize and manage data: Leverage rich omic, phenotypic and patient data using a powerful data management infrastructure that integrates, federates, and curates data from internal, external, and public data sources.
- Automate data processing: Rapidly build scalable raw data processing pipelines supporting all major HPC clusters, and integrate proprietary and public domain tools.
- Gain scientific insights: Make discoveries using peer-reviewed methods and algorithms, interactive data analysis tools, a rich statistical toolbox and intuitive visualizations.
- Collaborate securely: Securely share data, methods and results across global research organizations and external collaborators
- Harmonize data analysis: Standardize data processing and analysis pipelines using a comprehensive method lifecycle management and quality reporting system
- Ensure regulatory compliance: Facilitate compliance with data privacy and regulatory requirements through comprehensive chain of custody, data provenance, and audit trail capabilities
Genedata, Booth 238 & 240
Product Name: Genedata Selector 5
genedata.com/
Genedata Selector 5 is the most significant update in the history of our highly successful genomic knowledge management platform, which has seen continuous development since its inception in 2010. By investing in modern browser-based technologies, Selector 5 delivers a completely new biologist-friendly interface with powerful collaboration tools. This allows bioinformaticians, strain engineers, and other scientists to work together and share information more easily than ever before. A new project-centric design allows diverse team members to generate and share detailed views on genomic and analytical data. Community Annotation features allow users to easily associate their observations with specific content or genomic regions so that others do not duplicate work or misinterpret information. Users can send and receive notifications, allowing even occasional users of the platform to have 1-click access to critical content that is shared by a colleague.
In addition to providing biologist-friendly tools for interpreting genomic information, Genedata Selector is packaged with robust APIs and a rich bioinformatics toolbox to allow data scientists to perform complex analyses and calculations across genomes. Raw DNA-seq and RNA-seq data can be processed, analyzed, and reported, and sequence comparisons can be efficiently calculated across proteomes, assembled genomes, and microbiomes. All of this information is centrally managed and easily available for interdisciplinary teams to access and interpret as it relates to their project of interest.
Finalist: Genestack, Booth 237
Product Name: The Genestack Platform
genestack.com/platform
The Genestack bioinformatics platform brings together a powerful data and metadata management infrastructure, a full suite of pipelines and a range of interactive visual analytics tools. The platform provides users with an extensible and flexible infrastructure capturing raw multi-omics data, metadata, pipeline results, intermediate files and analysis reports. There are numerous features of the platform that enable organizing datasets at scale: metadata validation through templates with predefined attributes, ontologies and controlled vocabularies, a context-sensitive metadata editor and a fine-grained data access control model.
Genestack enables seamless integration of private, shared and public datasets. Public data from major repositories worldwide is indexed on the platform and harmonized by ontology mapping.
Our search and browse data interface makes faceted search across enterprise and public data possible, with meta-study construction and analyses a click away. Our data curation tools include an interactive ontology-enabled spreadsheet interface with dictionary mapping and extension capability.
On Genestack, you will find a growing toolbox of genomics applications. These include gold-standard open source tools for processing WGS/WES, RNA-seq, microarray data and more, as well as interactive tools we develop in house. Interactive apps built and available on Genestack include a genome browser with computable tracks, interactive variation explorer for on-the-fly filtering and analysis of variants across populations, and visual tools for quality control assessment and outlier detection. The pipelines are customizable and easily used by researchers who want to perform data analysis without having to write code or use the Unix command line.
Genotech Matrix, Booth 125
Product Name: BioMed Miner
genotechmatrix.com
BioMed Miner is an intuitive, multi-dimensional text mining platform that allows users to accurately and efficiently locate highly relevant research among the expansive library of available biomedical literature. Through a proprietary algorithm, BioMed Miner enables users to navigate the tens of millions of publicly available articles by combining search terms through various lenses, such as function and publication year, to develop a realistic and useful list of relevant research. It generates a heat map of articles, organized and categorized by concept, disease, tissue and organ, and year, to help researchers drill down into their topics of interest in a much more relevant way than what is currently available. This web-based platform is available through an annual per-site or per-user license fee, at a significant cost savings compared to other proprietary text mining platforms.
Genotech Matrix, Booth 125
Product Name: Cytofkit
genotechmatrix.com
Cytofkit provides automated, objective and unsupervised analysis of FCM, CyTOF and FACS data. Cytofkit offers a wide range of dimension reduction methods (PCA, t-SNE, ISOMAP, diffusion map) and clustering algorithms (DensVM, PhenoGraph, ClusterX, FlowSOM). Dimension reduction methods compress the high-dimensional information to 2 to 3 dimensions, enabling visualization of cell subsets. Unsupervised clustering algorithms objectively group cells into subsets. Cytofkit provides biologists with a friendly graphical user interface and interactive visualization of analysis results, and can incorporate automated analysis results as additional parameters alongside manually gated cell populations. Cytofkit is currently used by more than 4,000 users worldwide. It will greatly benefit bench scientists, help unblock the bottleneck created by the shortage of bioinformaticians, and standardize cytometry analysis for industrial and clinical applications.
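As a generic illustration of the analysis pattern described above (not Cytofkit itself, which is an R/Bioconductor package), the sketch below reduces high-dimensional, cytometry-style data to two dimensions for visualization and then clusters cells without manual gating, using scikit-learn on synthetic data.

```python
# Generic illustration of dimension reduction plus unsupervised clustering,
# the workflow described above. Synthetic data stands in for per-cell marker
# intensities; this is not the Cytofkit implementation.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

# Synthetic stand-in: 2,000 "cells" x 20 "markers" drawn from 5 populations
X, _ = make_blobs(n_samples=2000, n_features=20, centers=5, random_state=0)

# Dimension reduction: PCA to denoise, then t-SNE down to 2-D for plotting
X_pca = PCA(n_components=10, random_state=0).fit_transform(X)
X_2d = TSNE(n_components=2, random_state=0).fit_transform(X_pca)

# Unsupervised clustering objectively groups cells into candidate subsets
labels = KMeans(n_clusters=5, random_state=0).fit_predict(X_pca)

# Each cell now has a 2-D embedding for visualization and a cluster assignment
print(X_2d.shape, np.bincount(labels))
```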
Finalist: Genotech Matrix, Booth 125
Product Name: Precision Medicine Platform & iCMDB Knowledgebase
genotechmatrix.com
In addition to NGS analytical toolkits and infrastructure hardware enhancers, Genotech Matrix offers a customized precision medicine workflow and knowledgebase that can be installed into a local server and continuously updated with new information.
iCMDB (intelligence in Clinical Medicine for Decision-Making and Best practices) is a manually curated knowledgebase featuring comprehensive coverage of genomic variants as well as gene and protein expression alterations. Scientific mechanisms of these variants/alterations and clinical interpretations of disease association, diagnosis, prognosis, treatment and drug response are obtained from global databases and literature and stored in a relational database to allow computerized organization and knowledge retrieval. All expertly curated evidence is accompanied by an evidence-based medicine (EBM) evidence level. Treatment options (approved/investigational) and best practice guidelines have been integrated to facilitate patient stratification.
The precision medicine platform, including Origo, the platform for NGS-based molecular oncology diagnosis, provides automated QC, analysis, informative data visualization, variant assessment and integrative lab reporting functions. With seamless integration with the LIMS system, the laboratory can minimize the steps needed to create patient profiles and test requests. Automatic data syncing and batch processing functions streamline the laboratory’s workflow. Graphic data presentation allows users to visually examine variant quality and explore variant information from public databases. With the supplement of the iCMDB knowledgebase, pathologists or geneticists can quickly populate the clinical report by extracting annotation from the knowledgebase and adding their own interpretation and summary of the case. Clinical trial matching is powered by intelligent condition and molecular marker matching and refined by manual filtration.
Genotech Matrix, Booth 125
Product Name: T/B Cell Receptor Repertoire Analysis
genotechmatrix.com
To overcome these problems, an adaptor-ligation mediated PCR technology was developed. An adaptor sequence is added at the 5’ end of the template DNA. One pair of primers – one for the added adaptor sequence and another for the C region sequence – is used in the PCR reaction, thereby eliminating mismatching, no-priming and other potential amplification problems. Combined with dedicated software (Repertoire Genesis Software) to process the data output from next-generation sequencing, the immune repertoire data can be easily analyzed. This adaptor ligation-PCR (AL-PCR) technology and Repertoire Genesis Software are combined with Illumina MiSeq to create a platform for Immune Repertoire Analysis. This integrated technology enables an unbiased, quantitative and accurate analysis of the TCR or BCR repertoire and is superior to other conventional technologies such as multiplex PCR. This service can be used in a wide variety of applications, e.g., identification of antigen-specific TCR/BCR, evaluation of efficacy of immune checkpoint blockers, and cancer immune therapy.
Finalist: The Hyve, Booth 417
Product Name: MatchMiner v1.0
matchminer.org
Developed by the Dana-Farber Cancer Institute, in close collaboration with The Hyve, MatchMiner is an open source computational platform for algorithmically matching patient-specific genomic profiles to precision medicine clinical trials. The input is twofold: patient-specific genomic and clinical records, and structured eligibility criteria for clinical trials. Patient-specific information includes somatic genomic events, such as mutations, copy number alterations, and structural variants. Basic clinical data such as cancer type, age, and sex extracted from the Electronic Medical Record (EMR) are also transmitted. Structured clinical trial eligibility criteria include the genomic and basic clinical criteria outlined in the trial protocol documents. The MatchMiner platform matches patient-specific genomic events to clinical trials, and makes the results available to trial investigators and clinicians via a web-based platform.
The software architecture of MatchMiner is divided into two main components to increase developmental flexibility. The MatchEngine is written in Python using the Eve framework to expose a RESTful API. All data is stored and indexed in several MongoDB collections. The frontend is written in AngularJS 1.5 using the Material Design components and philosophy. ElasticSearch is also used to facilitate searching of the data, enabling users to create customized aggregate search queries. MatchMiner supports all major browsers.
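As a rough sketch of the Eve pattern described above, the example below defines a RESTful API over a MongoDB collection from a settings dictionary. The resource name, database name, and schema fields are hypothetical and are not MatchMiner's actual domain configuration.

```python
# Minimal sketch of a Python Eve service: a REST API over MongoDB collections
# defined by a settings dictionary. Resource and field names are hypothetical.
from eve import Eve

settings = {
    "MONGO_HOST": "localhost",
    "MONGO_PORT": 27017,
    "MONGO_DBNAME": "matchminer_demo",     # hypothetical database name
    "RESOURCE_METHODS": ["GET", "POST"],   # methods allowed on the collection
    "DOMAIN": {
        "trial_matches": {                 # hypothetical resource/collection
            "schema": {
                "patient_id": {"type": "string", "required": True},
                "gene": {"type": "string"},
                "variant": {"type": "string"},
                "protocol_no": {"type": "string"},
            }
        }
    },
}

# Eve exposes /trial_matches as a paginated, filterable JSON REST endpoint
app = Eve(settings=settings)

if __name__ == "__main__":
    app.run(port=5000)
```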
MatchMiner recently won the Harvard Business School Kraft Precision Trials Challenge: http://www.hbs.edu/news/releases/Pages/matchmaker-wins-hbs-kraft-challenge.aspx
Finalist: The Hyve, Booth 417
Product Name: RADAR
thehyve.nl
The key components of our software stack include: Data Ingestion and Schematisation, Data Storage and Data Interface, Data Analytics, Front-end Ecosystem, Privacy and Security.
The RADAR-CNS data pipeline architecture is built around Apache Kafka, the central part of the Confluent streaming tools. Data is remotely collected from passive sources, like accelerometer and heart rate sensors, and active data sources, like patient questionnaires. These data streams are ingested via a REST proxy that produces native Kafka calls. The data is schematized with Apache Avro before it enters the system, enforcing a consistent data stream; at the end of the project this also provides metadata on all fields in the data. The backend does real-time analysis using Kafka Streams to monitor the data ingestion and compute aggregates, while simultaneously persisting terabytes of raw data. By using the Hadoop file system, these data are redundantly stored, reducing the chance of data loss. Two data storage layers are deployed: cold and hot storage for historical raw data persistence and low-latency aggregated data access, respectively.
Data is exposed via secured REST APIs for post analysis and monitoring. A web-based dashboard integrates these REST APIs to provide a monitoring and visualization platform for researchers and clinicians. The dashboard provides live insights of gathered data and statuses of device data collection and storage. Raw data from the cold storage can be structured and exported in Avro, JSON and CSV formats to develop novel algorithms.
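To make the ingestion pattern concrete, the hedged sketch below produces an Avro-schematized record to a Kafka topic using Confluent's Python client. The schema fields, topic name, and addresses are hypothetical, and the real RADAR-CNS pipeline ingests through a REST proxy with its own schemas rather than a native client.

```python
# Sketch of Avro-schematized ingestion into Kafka, as described above.
# Field names, topic, and addresses are hypothetical placeholders.
# Requires the confluent-kafka[avro] Python package and a schema registry.
from confluent_kafka import avro
from confluent_kafka.avro import AvroProducer

value_schema = avro.loads("""
{
  "type": "record",
  "name": "HeartRateSample",
  "fields": [
    {"name": "user_id",    "type": "string"},
    {"name": "time",       "type": "double"},
    {"name": "heart_rate", "type": "float"}
  ]
}
""")

producer = AvroProducer(
    {
        "bootstrap.servers": "localhost:9092",          # hypothetical broker
        "schema.registry.url": "http://localhost:8081", # hypothetical registry
    },
    default_value_schema=value_schema,
)

# Each message is validated against the Avro schema before it enters Kafka,
# which is what enforces the consistent data stream described above.
producer.produce(
    topic="android_heart_rate",  # hypothetical topic name
    value={"user_id": "patient-001", "time": 1493812800.0, "heart_rate": 72.0},
)
producer.flush()
```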
Innoplexus AG, Booth 355
Product Name: iPlexus, version 1
iplexus.co
iPlexus is an end-to-end platform for Life Sciences which leverages artificial intelligence to generate continuous intelligence and insights across the discovery, clinical development, regulatory and commercial stages of drug development, spanning all major therapeutic areas and indications. It has modules for Competitive Intelligence, Clinical Intelligence, Regulatory Intelligence, and Gene and Intervention landscapes.
The platform leverages our proprietary CAAV framework to crawl, aggregate, analyze, index, and visualize hundreds of terabytes of scientific data across hundreds of clinical trial databases, biological databases, major patent offices, congresses, theses, forums and regulatory bodies. It offers confidence that the researcher will not miss any significant information that might impact his or her research later. There is value in getting information readily and as soon as it is available, especially in research, where it can change the path of the work that happens next. With iPlexus, a researcher can get real-time access to and understanding of a wide variety of research data, and benefit from understanding their cross-pollination.
It offers an easy-to-use, intuitive, secure, cloud-based system which generates intelligence across datasets using machine learning models, natural language processing and advanced text analytics. Our patent-pending data extraction algorithms leverage our Ontologise framework to transform raw unstructured data into a structured form. Making data and insights instantly and continuously available takes the pain out of data collection, curation and analysis, helping teams make informed decisions better and faster.
inviCRO, Booth 528
Product Name: iPACS Clinical 2.0
invicro.com/
The iPACS Clinical is currently the only solution in the marketplace serving the technical and user needs for a clinical imaging trial. The iPACS Clinical product enables secure, intelligent and efficient submissions of clinical and nonclinical image data from an imaging center and/or CRO. Smart-Transfer protocols provide wizard-like data transfer workflows with minimal software footprint required from the sending site. Electronic data transfers are well supported via data chunking, and resumable transfer features to augment current HTTP(s) limitations. iPACS users can manage in real-time the transfer and organization of study data from multiple imaging trials across multiple imaging centers through intuitive administrative dashboard pages. The iPACS offers flexible access-controls for data submitters, data reviewers and general iPACS users. Automated processing modules facilitate batch DICOM format validation, DICOM tag modification, DICOM anonymization and DICOM tag smart filtering routines. Users can associate an unlimited number of metadata fields to a data set throughout the data life cycle via flexible data point forms. The iPACS utilizes an internal reporting engine to produce flexible reports (Excel, PPT, PDF, XML, TXT) on any iPACS object (imaging objects, file system, users, data points) providing boundless data aggregation and data mining applications. To further enhance security and performance, iPACS can be deployed in any on-prem, cloud or hybrid-cloud architecture. In addition, the iPACS can be implemented in a 21 CFR Part 11 and GxP compliant manner with full audit trail support.
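As a generic illustration of the kind of DICOM tag modification and anonymization step the iPACS automates, the sketch below uses the open-source pydicom library; it is not iPACS code, and the file path and replacement values are placeholders.

```python
# Generic illustration of DICOM tag modification/anonymization using pydicom.
# Not iPACS code; paths and replacement values are placeholders.
import pydicom

ds = pydicom.dcmread("scan_0001.dcm")      # placeholder input file

# Blank or replace direct identifiers before the file leaves the imaging site
ds.PatientName = "ANONYMIZED"
ds.PatientID = "SUBJ-0001"                 # study-specific subject code
ds.PatientBirthDate = ""
if "InstitutionName" in ds:
    ds.InstitutionName = ""

ds.save_as("scan_0001_anon.dcm")           # write the de-identified copy
```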
Jisto, Booth 325
Product Name: Jisto Elastic Resource Scheduler
jisto.com/
The average utilization of server resources today, whether in the cloud or on-premises, is 10–20%, meaning most of the money spent on them is going to waste.
By leveraging elastic resource scheduling methods to automate better usage of server resources, it is possible to run many more applications on fewer server resources, doubling or tripling the utilization, while still maintaining the ability for applications to have access to peak-demand resources when they need them.
The bio-IT community runs massive data sets and many compute-intensive applications that leverage (cloud or on-premises) data center and high-performance computing capacity. Jisto is able to make much better usage of those resources, doubling or tripling their utilization. This allows you to run many more applications on those resources, and to save money by reducing the resources these applications need to run on.
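A back-of-the-envelope sketch of the utilization argument is shown below; the workload and server sizes are illustrative numbers, not Jisto benchmarks.

```python
# Back-of-the-envelope illustration of how higher average utilization reduces
# the servers needed for the same workload. Numbers are illustrative only.
import math

def servers_needed(total_core_hours, cores_per_server, hours, utilization):
    """Servers required to cover a workload at a given average utilization."""
    usable_core_hours = cores_per_server * hours * utilization
    return math.ceil(total_core_hours / usable_core_hours)

workload = 100_000      # core-hours of batch analysis per month (illustrative)
cores, hours = 32, 720  # 32-core servers, ~720 hours in a month

for util in (0.15, 0.45):  # ~15% today vs. roughly 3x with elastic scheduling
    print(f"{util:.0%} utilization -> "
          f"{servers_needed(workload, cores, hours, util)} servers")
```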
Finalist: Komprise, Booth 149
Product Name: Komprise Data Management v2.0
komprise.com
Komprise is intelligent data management software that enables biotech and genomics companies to manage explosive data growth across on-premises and/or cloud storage while cutting costs by 70% or more.
Komprise delivers:
- Data insights: Komprise works across your storage (on premise and cloud) and runs as a hybrid cloud service - simply download the Komprise Observer virtual machine, point it at your storage (any NFS, SMB/CIFS storage including NetApp, EMC, IBM filers, Windows File Servers) and in under 15 minutes, even on petabytes of data, you will start receiving analytics into your data to understand how it’s growing, how it’s being used, and how much and what data is hot or cold.
- Interactive ROI Analysis: Simply set policies on how you want data to be moved and replicated, and specify to what targets. The solution can accommodate a variety of targets from multiple vendors as long as they adhere to standards such as NFS, SMB/CIFS, Cloud, S3/REST. Komprise interactively shows ROI and provides insights into your storage cost savings.
- Transparent Data Movement: When ready, you can activate your plan and Komprise will move your data while preserving transparent access from the source. Users and applications continue to see and access moved data exactly as before (as files) at the source, even though the data may be stored as objects at the target.
- Scale On-Demand: Komprise runs as a virtual machine, and more can be added on-demand to address massive scale without upfront costs or dedicated infrastructure.
Lab7 Systems, Booth 563
Product Name: Lab7/IBM Genomic Cloud
lab7.io
To overcome the shortcomings of commodity cloud solutions, Lab7 Systems and IBM have purpose-built the Genomic Cloud using proven high-performance computing best practices for genomics. Highlights of the Cloud’s infrastructure include:
- Tightly coupled IBM Spectrum Scale (GPFS) storage, high-memory compute nodes, and a fast internal network designed for high-throughput genomics
- Supports up to 500 compute nodes and more than 10 Petabytes of storage
- Can be configured to support remote, on premise, or hybrid cloud models in isolated single-tenant or multi-tenant instances
Purpose-Built Hardware:
- Virtual machines and networks are great if you’re spinning up websites, but not so much for scientific applications
- Our hardware was designed from the ground up for scientific computing
Secure-Virtual network infrastructure:
- Dedicated VPN, VLAN, and single sign-on support ensures secure access, no additional setup or costs
Comprehensive/complete software stack:
- The included software streamlines deployment and operations
- A complete, supported software stack, including BioBuilds, Lab7 ESP, and IBM Spectrum LSF scheduler, is ready to run, with no additional setup costs
- Everything is in place to be operational within 2 weeks. Equivalent commodity cloud functionality could take up to six months to fully deploy
Automated: Lab7 ESP automates analysis and reporting
- Automating common tasks using pipelines, Docker containers, and other execution engines frees up valuable bioinformatics and IT resources, leaving more time for science
LabKey, Booth 543
Product Name: LabKey Biologics 1.0
labkey.com/products-services/labkey-biologics/
LabKey Biologics is a software application designed to accelerate large molecule development by providing research teams with a suite of intuitive tools for entity registration, workflow management, and data exploration. LabKey Biologics leverages the robust data management capabilities of LabKey’s core scientific collaboration platform, LabKey Server, and builds on that foundation with new capabilities supporting biological entity registration, lineage tracking and visualization, and analytical data querying.
LabKey Biologics was developed in partnership with flagship customer, Just Biotherapeutics, and an advisory board of biotech organizations that helped shape the application to effectively support biologics R&D workflows. The application provides tools to standardize common research tasks like facilitating assay and sample requests as well as comprehensive dashboards that give teams visibility into their complete workflow.
LabKey Biologics is built on top of Apache Tomcat and stores data in either PostgreSQL or Microsoft SQL Server. It uses modern web development techniques including ReactJS, Typescript, SCSS, and other technologies to deliver a clean and user-friendly interface. Data can be easily imported and exported from the browser interface by uploading files or by using the programmatic interface available in R, Java, JavaScript, or Python.
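As a hedged sketch of the programmatic interface mentioned above, the example below pulls rows through the LabKey Python client. The server address, project, schema, and query names are placeholders, and the exact client calls may vary by version of the 'labkey' package.

```python
# Sketch of querying a LabKey Server instance with the LabKey Python client.
# Server, project, schema, and query names below are placeholders.
from labkey.utils import create_server_context
from labkey.query import select_rows

# Point the client at a LabKey Server instance and project (placeholders)
server_context = create_server_context(
    "biologics.example.com", "BiologicsProject", "labkey", use_ssl=True
)

# Query a registered entity table; schema/query names here are hypothetical
result = select_rows(server_context, schema_name="samples", query_name="CellLines")

for row in result.get("rows", []):
    print(row)
```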
Linguamatics, Booth 345
Product Name: Linguamatics I2E 5.0
linguamatics.com/products-services/about-i2e
Linguamatics I2E transforms unstructured text into structured data. Its advanced Natural Language Processing (NLP) enables you to answer business-critical questions by rapidly extracting relevant facts and relationships from large document collections.
I2E can be used to transform and annotate documents, or provide real-time querying over massive data sets. For example, a query over electronic health records might ask for cancer patients under 65 with a BMI over 30.
Query results are returned with relevant context, and you can easily modify and compare queries to gain the balance of precision and recall you need. You are not only presented with structured search results and assertions, but can also easily drill down to the underlying evidence.
I2E 5.0, the latest release of I2E, delivers major new enhancements, including normalization of concepts, advanced range search, and a new query language (EASL). These capabilities tackle the variety in big data to provide insights from the estimated 80% of data trapped in unstructured text, as well as from semi-structured and structured data sources.
Nexsan, Booth 539
Product Name: Unity
nexsan.com/products/unified-storage-unity/
Nexsan’s Unity is the first and only hyper-unified storage platform. Unity offers support for advanced block and file workloads while also adding the rich capabilities IT teams demand and users require at no additional charge. The product levels offered are Unity2000 (entry level), Unity4000 (mid-range) and Unity6000 (high-end).
With Unity’s private cloud, all data is stored, managed and synchronized in a private, on-premises cloud environment for security, privacy and compliance.
Positioning
- Unified Storage PLUS file sync & share, multi-site sync, and secure archive.
- Ability to scale as you grow.
Specifications
- Hybrid (SSD/HDD)
- SSD Caching
- Unified storage: NAS/SAN
- NAS Protocols: NFS, SMB, FTP
- SAN Protocols: FC, iSCSI
- Enterprise File Sync Share
- Mobile Devices (iOS & Android)
- Desktops & Servers (Windows & MacOS)
- Browser Access
- Multi-site (n-Way) Synchronization
- Async Replication
- Snapshots
- Thin Provisioning
- Compression
- Encryption (SED and FIPS available)
Scalability
- up to 5PB
Portfolio
- Unity2000: 2U24 | 3U16 | 168TB Max
- Unity4000: 2U18 | 4U48 | 4U60 | 2.1PB Max
- Unity6000: 2U18 | 2U24 | 4U48 | 4U60 | 5PB Max
NODEUM, Booth 557
Product Name: NODEUM v1.3
nodeum.io/
NODEUM is a unique software product designed to manage exponential data storage and usage. It is based on commodity hardware with no vendor lock-in, offering a hybrid storage and archival system starting at 50TB but scalable up to 72PB if needed.
Scalability without Complexity.
NODEUM allows access to the infrastructure via a Virtualized File System based on a client-specific mix of storage components combining Flash, Disk, LTFS tapes and Cloud, thereby providing an unprecedentedly easy-to-use solution. Moreover, all data is reachable and usable by the end-users within the confines of their role and the access policies.
Hybrid storage reduces your storage TCO.
80% to 90% of your data is rarely consulted, whereas 10% to 20% is regularly used. This is why NODEUM allows data to be stored on the part of the storage infrastructure that is most cost-effective, for instance by moving rarely used files automatically to Tape or to the Cloud. This, combined with an easy-to-manage solution, lowers OPEX costs significantly.
Furthermore, NODEUM is a Software Defined Storage platform that simplifies access to content in a very transparent manner. The integrated Smart Metadata Catalogue structures the storage of content efficiently and makes metadata searching easy and fast.
Lastly, NODEUM’s state-of-the-art native REST API enables easy and comprehensive access to and control of the solution. End-users appreciate the intuitive search tool for finding the data they need, as well as trend analysis on storage use and statistics.
Prysm, Booth 129
Product Name: Prysm Visual Workplace
prysm.com/
Prysm Visual Workplace is a combination of the Prysm Application Suite, Prysm’s device agnostic software interface that allows for virtually unlimited content sharing from any device, and Prysm’s deeply immersive interactive displays that can meet the requirements of any room from small huddle spaces up to large auditoriums.
Depending on a customer’s needs, the Prysm Application Suite can be delivered via Microsoft’s secure ISO 27001-certified Azure cloud platform or hosted on an end customer’s infrastructure. Regardless of the delivery mechanism, users can access digital workspaces through interactive touch displays or any mobile device. These workspaces are persistent, highly flexible and configurable to fit into any existing technology ecosystem. This means content is always saved so it can be referenced in the future, even after the meeting is over, so ideas, work, and momentum are never lost in search of a solution.
Further, the solution’s ability to mirror content to different locations and devices means that actions and views of data and content (such as annotations, new data views, and screen shares) within one of Prysm’s digital workspaces are duplicated in real time on others’ screens around the world, with up to 25 users on mobile devices able to actively edit and interact with the workspace.
Pure Storage, Booth 537
Product Name: FlashBlade
purestorage.com/flashblade
FlashBlade is an elastic scale-out system that delivers all-flash performance to petabyte-scale data sets with overall economics comparable to legacy hybrid arrays. Optimized for high concurrency, high bandwidth, high IOPS and consistently low latency in a small, 4U form factor, FlashBlade is equipped to handle the most demanding workloads generated by today’s high-performance applications. FlashBlade is simple to install, deploy and operate, which accelerates customer time to meaningful data analysis. Together, Pure Storage FlashBlade and the Pure Storage FlashArray offer a complete platform for organizations to build their all-flash cloud and gain advantage from their data.
The product is shipping today in both 8TB and 52TB blade capacities with Elasticity 1.2 software, all in an unbelievably small yet high performance all-flash footprint.
Qumulo, Booth 333
Product Name: QC360
qumulo.com
The QC360 is a scale-out storage system designed for web-scale IT environments. It lets customers achieve maximum capacity and cooling efficiency of their data centers, while also maintaining tier-one storage performance. The QC360 delivers three petabytes of usable storage and 10GB/s per rack of throughput at less than $0.01 per gigabyte per month — making it the industry’s leading choice for optimal density, performance and cost. The company’s recently-updated software solution, Qumulo Core 2.6, allows for hardware independence that gives customers the freedom to deploy scale-out storage across a wide range of Qumulo’s hardware platforms (including the QC360) as well as other third party hardware for both on-premises and public cloud deployments.
Finalist: SciBite, Booth 532
Product Name: SciBite LaunchPad
scibite.com/
SciBite offers Semantics as a Service, available through its Java-based RESTful API, with the following capabilities:
- Highly curated scientific ontologies, built upon open standards
- Formal based Named Entity Recognition
- Relationship mapping and extraction, identifying patterns
- Elastic Search of semantically rich data
- Live enrichment of browser based content
- Seamless connectivity to third-party applications, providing search and integration
SciBite provides pluggable technology, so that you can integrate semantic enrichment exactly where you need it.
Fast, lightweight, and simple to use, SciBite’s technologies transform data by understanding the scientific content they process.
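To give a sense of how a RESTful entity-recognition service of this kind is typically consumed, the sketch below posts a sentence of text and prints the entities returned; the endpoint, payload fields, ontology names, and response shape are hypothetical placeholders, not SciBite’s documented API.

```python
import requests

# Hypothetical sketch of calling a RESTful named-entity-recognition service.
# The URL, payload fields, and response shape are illustrative assumptions,
# not SciBite's documented API.
API_URL = "https://scibite.example.local/api/annotate"

payload = {
    "text": "EGFR mutations confer sensitivity to gefitinib in NSCLC patients.",
    "ontologies": ["GENE", "DRUG", "INDICATION"],  # hypothetical ontology names
}
resp = requests.post(API_URL, json=payload, timeout=30)
resp.raise_for_status()

for entity in resp.json().get("entities", []):
    # e.g. {"term": "EGFR", "ontology": "GENE", "id": "HGNC:3236", "start": 0, "end": 4}
    print(entity["ontology"], entity["term"], entity.get("id"))
```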
Seagate, Booth 143
Product Name: ClusterStor G300N
seagate.com/enterprise-storage/
Powered by the software-based Nytro Intelligent I/O Manager, the ClusterStor G300N with Spectrum Scale seamlessly runs multiple mixed workloads simultaneously on the same storage platform, eliminating performance bottlenecks that can result when data demands outpace what the existing storage architecture can accommodate. As a result, organizations can use it to automatically support multiple applications that generate a diverse range of I/O workloads on the same storage platform without negatively impacting performance. It is particularly suitable for the kinds of mixed and unpredictable workloads found in many of today’s most demanding, data-intensive HPC applications, such as seismic processing, financial transaction modeling, machine learning, geospatial intelligence, and fluid dynamics.
Ideal for organizations seeking both peak performance and cost efficiency when managing large data sets at scale with unpredictable workloads, the ClusterStor 300N brings together Seagate’s market-leading enterprise-class hard drives, innovative solid-state designs, and the industry’s most sophisticated system software in a platform purpose-built to help organizations manage and move massive amounts of critical data while maintaining workload efficiency and minimizing cost per terabyte. The Nytro Intelligent I/O Manager software delivers up to 1,000 percent input/output workload acceleration over traditional HPC storage systems and can quickly scale to accommodate any workload at any time.
Finalist: Seven Bridges, Booth 432 & 434
Product Name: CAVATICA
sevenbridges.com
CAVATICA lets researchers collaboratively manage and analyze data related to a number of rare diseases and cancers. Access control features allow for sharing of datasets of all scales. Researchers can search and view information about available datasets. When datasets of interest are found, users can request access directly from the data owners. Making data available to share and analyze in a single environment means that researchers can broaden their focus and make new connections beyond one disease.
CAVATICA promotes open standards through the Common Workflow Language (CWL). Because analyses in CAVATICA are described in CWL, they are completely reproducible and can be replicated in any environment with a CWL executor.
CAVATICA is designed to appeal to users from a variety of clinical and research backgrounds. Users who prefer a graphical interface can use the platform via an intuitive web interface. A RESTful API allows for programmatic operation. Users can also bring their own tools to the platform using the open Rabix Software Development Kit (SDK).
The SDK allows users to easily wrap existing tools for use in CAVATICA so that they become fully portable, by first installing them inside Docker containers and then describing their behavior in accordance with CWL. Every facet of an application, including its command-line arguments, runtime environment, parameters, and computational requirements, is captured. Developing software with the Rabix SDK means there is no need to reconfigure existing command-line tools to meet a proprietary format, and the tools remain runnable across a diverse range of infrastructures.
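For a sense of what such a CWL wrapper looks like, the sketch below writes a minimal CommandLineTool description (CWL documents can be expressed as JSON) for the standard Unix wc -l command; the tool choice, Docker image, and field values are illustrative assumptions, not output of the Rabix SDK.

```python
import json

# A minimal, hypothetical CWL CommandLineTool description (sketch only):
# it wraps the standard Unix `wc -l` command so a CWL executor can run it
# inside a Docker container, in the spirit of the CWL approach described above.
tool = {
    "cwlVersion": "v1.0",
    "class": "CommandLineTool",
    "baseCommand": ["wc", "-l"],
    "requirements": [
        {"class": "DockerRequirement", "dockerPull": "ubuntu:16.04"}
    ],
    "inputs": [
        {"id": "input_file", "type": "File", "inputBinding": {"position": 1}}
    ],
    "stdout": "line_count.txt",
    "outputs": [
        {"id": "line_count", "type": "stdout"}
    ],
}

with open("wc_tool.cwl", "w") as fh:
    json.dump(tool, fh, indent=2)

# The resulting file can then be run with any CWL executor,
# e.g.: cwltool wc_tool.cwl --input_file data.txt
```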
Finalist: SolveBio, Booth 337
Product Name: SolveBio Operating System for Molecular Information
solvebio.com
SolveBio is a cloud-based operating system for molecular information that enables cross-disciplinary R&D groups to use complex multi-omics data from disparate sources to find biomarkers, stratify populations, and design clinical trials. SolveBio’s mapping technology transforms disparate internal and external data sources locked away in filesystems and databases into biomedical concepts such as variants, genes, patients, samples, and phenotypes through SolveBio’s entity extraction process. Data is formatted, indexed, and maintained by SolveBio’s cloud solution, then delivered through applications and APIs to the right people and the right workflows at the biopharma customer organization.
As a result, scientists are empowered to directly engage in data exploration and incorporate molecular data into decision-making without bioinformatics and IT support for data querying and visualization. Scientists can answer day-to-day questions on their data with easy-to-use web interfaces and familiar tools like Microsoft Excel and Google Sheets (through SolveBio plugins). SolveBio also seamlessly draws data from workflow engines such as DNAnexus and Seven Bridges, and moves clean, filtered data to visualization tools such as Spotfire and Tableau.
Using the SolveBio API clients for R, Python, Ruby, and JavaScript, developers can easily query internal and external data and deploy applications that run on complex molecular data. Bioinformatics scientists can now engage in mission-critical research and scale up their support of research colleagues by automating tasks on SolveBio.
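As a purely illustrative sketch of the programmatic pattern, the snippet below queries a dataset over HTTPS and iterates the results; the host, endpoint path, dataset, and filter syntax are hypothetical placeholders, not SolveBio’s documented API (see solvebio.com for the official R, Python, Ruby, and JavaScript clients).

```python
import requests

# Hypothetical sketch of querying a molecular-data API over HTTPS; the host,
# endpoint path, dataset name, and filter syntax are illustrative assumptions,
# not SolveBio's documented API.
API_URL = "https://api.solvebio.example/v2/datasets/ClinVar/query"  # hypothetical
HEADERS = {"Authorization": "Token <api-key>"}

query = {
    "filters": [{"field": "gene_symbol", "value": "BRCA2"}],  # hypothetical filter syntax
    "limit": 10,
}
resp = requests.post(API_URL, json=query, headers=HEADERS, timeout=30)
resp.raise_for_status()

for record in resp.json().get("results", []):
    print(record.get("variant"), record.get("clinical_significance"))
```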
Finalist: Starfish Storage, Booth 332
Product Name: Starfish V4
starfishstorage.com
Starfish is a suite of software modules that interact to create a holistic, managed storage environment.
The Starfish Core Catalog tracks the contents of conventional file storage devices and cloud-style object stores. Users and applications are able to associate metadata with files and directories, building awareness of the business and scientific value of the files. Starfish employs state-of-the-art techniques to synchronize its database with file systems scaling into billions of objects and petabytes of capacity.
Modules:
Rules Manager / Jobs Engine -- Starfish enforces rules by running scheduled batch jobs based on metadata values in the catalog. Jobs are run in parallel across a multitude of agents. Agents have built-in functionality such as data migration and hash calculation, but they can also run custom code.
Report Engine -- Starfish has an interactive GUI that lets you tree-walk your filesystems and see recursive totals and aggregates at various levels of the tree. You can also query, find, and kick off jobs from within the GUI, or generate utilization and trending reports that are far more detailed than those from traditional tools, thanks to the metadata values that can be used to define the result set.
Namespace Solutions -- Starfish metadata is the foundation for a global namespace that provides unique identifiers for all files and directories.
All modules interact through a RESTful API. Thus, Starfish serves as middleware for any data management solution that references files stored on conventional file systems and object stores.
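A minimal sketch of that pattern, with hypothetical endpoints and field names that do not document Starfish’s actual API, might tag a file with business metadata and then query on it:

```python
import requests

# Hypothetical sketch of tagging a file with custom metadata and then querying
# on it through a RESTful catalog API; the endpoints, field names, and volume
# identifier are illustrative assumptions, not Starfish's documented API.
BASE_URL = "https://starfish.example.local/api"
HEADERS = {"Authorization": "Bearer <token>"}

# Attach business/scientific metadata to a file tracked by the catalog.
requests.post(f"{BASE_URL}/tag",
              json={"volume": "lab-nas", "path": "runs/2017/sample_042.bam",
                    "tags": {"project": "oncology", "retain_until": "2022-01-01"}},
              headers=HEADERS, timeout=30).raise_for_status()

# Query the catalog for everything in that project, e.g. to drive a migration job.
resp = requests.get(f"{BASE_URL}/query",
                    params={"tag": "project=oncology"},
                    headers=HEADERS, timeout=30)
resp.raise_for_status()
print(resp.json())
```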
Finalist: Twigkit, Booth 157
Product Name: Twigkit App Studio
twigkit.com
Previously the sole remit of data scientists and business analysts, this kind of data-driven discovery is now open to a much wider audience: Twigkit allows scientists, researchers, and clinicians to find answers and collaborate through highly targeted, purpose-built applications that can be created in hours and days instead of weeks and months.
Proven within research, product development, drug discovery and clinical trials, Twigkit self-service research and discovery applications seamlessly and securely bring together all data: from any source (such as big data infrastructures), anywhere, and on any device.
As more organizations standardize their applications on Twigkit technology it has become imperative to provide them with a tool and infrastructure that allows data and design teams to quickly create new apps.
Our new Data Self-Service infrastructure, coupled with the App Studio we are announcing at the show, makes it easier than ever to rapidly create search, discovery, and analytics applications that accurately meet the requirements of users. Using this intuitive visual editor, anyone can orchestrate data acquisition and then build a user-experience-optimized application that not only looks great but also accurately matches the business case in question.
Twigkit's Data Insights will automatically offer precise, picture perfect analysis of the data available and suggest pre-built application templates that match the information available. This brings an increased level of speed and efficiency to the process while allowing the organization to maintain standards of design and corporate branding across all their applications.
Veeam, Booth 2
Product Name: Veeam Availability Suite 9.5
veeam.com
Veeam turned each challenge GHS faced into an opportunity for 24.7.365 Availability. Veeam Availability Suite delivers something fundamentally different: Availability for the Always-On Enterprise. It leverages IT investments in server virtualization, modern storage, and the cloud to help organizations meet today’s service-level objectives.
“The systems that help us provide patients with the finest care and protect their privacy are up and running at all times, thanks to Veeam,” Johnson said. “Including Epic, the driving force of our healthcare system and the centerpiece of our digital transformation strategy, as well as document management, patient identification tracking and laptop encryption.”
During proof-of-concept, Veeam proved its worth time and time again, beginning with quick recovery of virtual machines running Microsoft SQL Server. Veeam backed up and recovered a VM supporting a patient identification tracking system, despite the legacy backup vendor telling GHS that backup wasn’t possible. The system is critical to patient safety because it enables clinicians to use wireless, hand-held devices that monitor patients’ locations within the hospital and verify their identification before dispensing and administering medication.
“We save 1,300 hours each year in troubleshooting time, which saved $70,000 because we didn’t have to hire someone to focus on backup,” Shuford said. “We saved $250,000 in backup storage because utilization dropped from 90% to 60%; Veeam deletes outdated restore points.” During a three-year period, GHS saved nearly 100% of the cost of legacy backup.
Shuford uses the hours he saves in troubleshooting to fine-tune the backup environment with Veeam monitoring, reporting and capacity planning.
Western Digital Corporation, Booth 560
Product Name: ActiveScale x100 and P100
hgst.com/products/systems/activescale-x100-system
The ActiveScale System is a fully self-contained, scale-out object storage system that can exceed the scale and TCO benefits of traditional cloud or tape infrastructure. ActiveScale can easily keep up with growing data by both scaling up and scaling out, delivering 840 terabytes (TB) of raw data storage in a single rack and scaling out to over 52 petabytes (PB) of raw storage, all of which can be managed from a single pane of glass for high IT productivity. The ActiveScale System is an easy-to-implement solution that helps data centers evolve from siloed data storage to cloud-scale active archiving with extreme data durability for high data integrity. Existing cloud applications use the same access protocol as ActiveScale, which has native support for Amazon S3, and gateways are available to accommodate file-based laboratory devices.
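Because ActiveScale speaks the Amazon S3 protocol natively, existing S3 tooling can typically be pointed at it simply by overriding the endpoint; the sketch below does so with the standard boto3 client, using a hypothetical endpoint, bucket, and credentials.

```python
import boto3

# Sketch: reuse standard S3 tooling against an S3-compatible object store by
# overriding the endpoint. The endpoint URL, bucket name, and credentials are
# hypothetical placeholders, not values specific to any ActiveScale deployment.
s3 = boto3.client(
    "s3",
    endpoint_url="https://activescale.example.local",  # hypothetical on-prem endpoint
    aws_access_key_id="<access-key>",
    aws_secret_access_key="<secret-key>",
)

# Archive a sequencing run and list what is already stored.
s3.upload_file("run_2017_05_23.fastq.gz", "archive", "runs/run_2017_05_23.fastq.gz")
for obj in s3.list_objects_v2(Bucket="archive", Prefix="runs/").get("Contents", []):
    print(obj["Key"], obj["Size"])
```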
Additionally, the ActiveScale System uses advanced erasure coding, strong consistency and BitDynamics to deliver extreme data durability of up to 17 nines (99.999999999999999% durability) to ensure valuable research results are well protected and always available. In a multi-geo implementation, data remains consistent and accessible even during a full data center outage. Through background data integrity checking, the system automatically and transparently detects and corrects data degradation, eliminating the risks and media management activities associated with tape-based archives.
Wiley, Booth 429
Product Name: Wiley Metlin/XCMS Plus
wiley.com
Metlin is the original and most comprehensive MS/MS reference library featuring high-quality experimental data from known metabolites. Developed by the Scripps Center for Metabolomics, Metlin features more than 961,829 molecules, covering lipids, steroids, plant and bacterial metabolites, small peptides, carbohydrates, exogenous drugs/metabolites, central carbon metabolites, and toxicants. Metlin contains over 14,000 metabolites, along with over 200,000 in-silico MS/MS datasets, and is the largest collection of MS/MS data, with multiple collision energies in positive and negative ionization modes. Special features such as “fragment search” and “neutral loss search” handle uploaded user data that does not match a compound in the database by searching for characteristic fragments that can be used for molecular classification. This collection allows investigators to compare MS2 data from their research samples to MS2 data from compounds in the database using automated matching, improving the speed, efficiency, and cost-effectiveness of untargeted studies.
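At its core, the automated matching described above compares a query MS/MS spectrum against library spectra; a common, generic way to score such a comparison is cosine (dot-product) similarity over binned fragment peaks, sketched below with made-up example spectra. This is a textbook illustration of spectral matching, not METLIN’s actual scoring algorithm.

```python
import math
from collections import defaultdict

# Generic sketch of MS/MS spectral matching by cosine similarity over binned
# fragment m/z values; the bin width and example spectra are made up for
# illustration and this is not METLIN's actual scoring algorithm.
def cosine_similarity(spectrum_a, spectrum_b, bin_width=0.01):
    """Each spectrum is a list of (mz, intensity) pairs."""
    def binned(spectrum):
        bins = defaultdict(float)
        for mz, intensity in spectrum:
            bins[round(mz / bin_width)] += intensity
        return bins

    a, b = binned(spectrum_a), binned(spectrum_b)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Toy example: a query spectrum vs. a hypothetical library entry.
query = [(91.05, 100.0), (119.08, 35.0), (147.11, 12.0)]
library_entry = [(91.05, 98.0), (119.09, 40.0), (175.12, 5.0)]
print(f"match score: {cosine_similarity(query, library_entry):.3f}")
```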
METLIN provides links and information for every one of its 960,000 compounds. These include name, systematic name, structure, elemental formula, mass, CAS number, KEGG ID and link, HMDB ID and link, PubChem ID and link, commercial availability and direct search options on the molecule itself. Data were generated using multiple instruments, including Agilent, Bruker and Waters QTOF mass spectrometers.